Merge upstream/release/2.6 into upstream/google/2.6 #15265

Closed
wants to merge 13 commits into from
2 changes: 1 addition & 1 deletion TAG
@@ -1 +1 @@
2.6.1-rc2
2.6.1-rc3
6 changes: 6 additions & 0 deletions debian/changelog
@@ -1,3 +1,9 @@
daos (2.6.1-3) unstable; urgency=medium
[ Phillip Henderson ]
* Third release candidate for 2.6.1

-- Phillip Henderson <[email protected]> Tue, 01 Oct 2024 14:23:00 -0500

daos (2.6.1-2) unstable; urgency=medium
[ Phillip Henderson ]
* Second release candidate for 2.6.1
51 changes: 51 additions & 0 deletions docs/release/release_notes.md
@@ -2,6 +2,57 @@

We are pleased to announce the release of DAOS version 2.6.

## DAOS Version 2.6.1 (2024-10-05)

The DAOS 2.6.1 release contains the following updates on top of DAOS 2.6.0:

* Mercury update for the Slingshot 11.0 host stack and other UCX provider fixes.

### Bug fixes and improvements

The DAOS 2.6.1 release includes fixes for several defects and a few changes
to the administrator interface that improve the usability of a DAOS system.

* Fix a race between an MS replica stepping up as leader and engines joining the
system. This race could cause an engine join to fail.

* Fix a race in concurrent container destroy that could cause an engine crash.

* Pool destroy now returns an explicit error instead of success if there is an
in-progress destroy against the same pool.

* Fix an issue where EC aggregation could cause inconsistency between data
shards and parity shards.

* Enable pool list for clients.

* Running "daos|dmg pool query-targets" with a rank argument can query all
targets on that rank.

* Add a daos health check command that allows basic system health checks from a client.

* DAOS Version 2.6.0 always excludes unreachable engines reported by SWIM and schedules a rebuild
for the excluded engines. This is an overreaction when a large number of engines are affected by
a power failure or switch reboot, because data recovery is impossible in these cases. DAOS 2.6.1
introduces a new environment variable, DAOS_POOL_RF, set in the server yaml file for each engine.
It indicates the number of engine failures seen before DAOS stops changing pool membership and
completing in-progress rebuilds. Past that point, all I/O and ongoing rebuilds simply block. The
DAOS system can finish in-progress rebuilds and become available again once the affected engines
are brought back. The recommendation is to set this environment variable to 2.

* In DAOS Version 2.6.0, accessing a faulty NVMe device returns the wrong error
code to the DAOS client, which can cause the application to fail. DAOS 2.6.1
returns the correct error code so the client can retry and eventually access
data in degraded mode instead of failing the I/O.

* Pil4dfs fix to avoid a deadlock with the Level Zero library on Aurora, and
support for more libc functions that were not intercepted before.

For details, please refer to the GitHub
[release/2.6 commit history](https://github.com/daos-stack/daos/commits/release/2.6)
and the associated [Jira tickets](https://jira.daos.io/) as stated in the commit messages.
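
The DAOS_POOL_RF behavior described in the list above can be pictured with a small model. This is only an illustrative sketch, not DAOS code: the class name, the method, the returned strings, and in particular the exact comparison used at the threshold (`<=` here) are assumptions made for the example. Only the variable name DAOS_POOL_RF and the recommended value of 2 come from the release notes.

```python
from dataclasses import dataclass


@dataclass
class PoolMembershipPolicy:
    """Toy model of the DAOS_POOL_RF threshold (hypothetical, not DAOS code)."""

    # Recommended DAOS_POOL_RF value from the release notes.
    pool_rf: int = 2

    def action(self, failed_engines: int) -> str:
        """Return what this model does for a given number of failed engines."""
        if failed_engines < 0:
            raise ValueError("failed_engines must be non-negative")
        if failed_engines == 0:
            return "none"
        # Assumption: failures up to and including pool_rf are still handled
        # by excluding the engines and rebuilding.
        if failed_engines <= self.pool_rf:
            return "exclude_and_rebuild"
        # Beyond the threshold, membership is left unchanged; I/O and any
        # ongoing rebuild block until the engines are brought back.
        return "block_until_engines_return"


policy = PoolMembershipPolicy()
for failures in (0, 1, 2, 3):
    print(failures, policy.action(failures))
```

In this model a single failed engine is still excluded and rebuilt, while a rack-wide outage leaves pool membership untouched so the system can resume once the engines return.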


## DAOS Version 2.6.0 (2024-07-26)

### General Support
4 changes: 2 additions & 2 deletions src/tests/ftest/datamover/copy_procs.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2022 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.
SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -46,7 +46,7 @@ def test_copy_procs(self):
:avocado: tags=DmvrCopyProcs,test_copy_procs
"""
# Create pool and containers
pool1 = self.create_pool()
pool1 = self.get_pool()
cont1 = self.get_container(pool1)
cont2 = self.get_container(pool1)

6 changes: 2 additions & 4 deletions src/tests/ftest/datamover/dst_create.py
@@ -58,8 +58,7 @@ def run_dm_dst_create(self, tool, cont_type, api, check_props):
self.set_api(api)

# Create 1 pool
pool1 = self.create_pool()
pool1.connect(2)
pool1 = self.get_pool()

# Create a source cont
cont1 = self.get_container(pool1, type=cont_type)
@@ -98,8 +97,7 @@ def run_dm_dst_create(self, tool, cont_type, api, check_props):
self.verify_cont(cont3, api, check_props, src_props)

# Create another pool
pool2 = self.create_pool()
pool2.connect(2)
pool2 = self.get_pool()

result = self.run_datamover(
self.test_id + " cont1 to cont4 (different pool) (empty cont)",
4 changes: 2 additions & 2 deletions src/tests/ftest/datamover/large_dir.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2022 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -46,7 +46,7 @@ def run_dm_large_dir(self, tool):
file_size = self.params.get("bytes", self.mdtest_cmd.namespace)

# create pool and cont1
pool = self.create_pool()
pool = self.get_pool()
cont1 = self.get_container(pool)

# run mdtest to create data in cont1
4 changes: 2 additions & 2 deletions src/tests/ftest/datamover/large_file.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2023 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -43,7 +43,7 @@ def run_dm_large_file(self, tool):
self.fail("Failed to get ior processes for {}".format(self.tool))

# create pool and cont
pool = self.create_pool()
pool = self.get_pool()
cont1 = self.get_container(pool)

# create initial data in cont1
4 changes: 2 additions & 2 deletions src/tests/ftest/datamover/negative.py
@@ -65,7 +65,7 @@ def test_dm_bad_params_dcp(self):
start_dfuse(self, dfuse)

# Create a test pool
pool1 = self.create_pool()
pool1 = self.get_pool()

# Create a special container to hold UNS entries
uns_cont = self.get_container(pool1)
@@ -215,7 +215,7 @@ def test_dm_bad_params_fs_copy(self):
start_dfuse(self, dfuse)

# Create a test pool
pool1 = self.create_pool()
pool1 = self.get_pool()

# Create a special container to hold UNS entries
uns_cont = self.get_container(pool1)
4 changes: 2 additions & 2 deletions src/tests/ftest/datamover/obj_large_posix.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2022 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -37,7 +37,7 @@ def run_dm_obj_large_posix(self, tool):
file_size = self.params.get("bytes", "/run/mdtest/*")

# Create pool1 and cont1
pool1 = self.create_pool()
pool1 = self.get_pool()
cont1 = self.get_container(pool1)

# Create a large directory in cont1
8 changes: 3 additions & 5 deletions src/tests/ftest/datamover/obj_small.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2023 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -58,8 +58,7 @@ def run_dm_obj_small(self, tool):
self.set_tool(tool)

# Create pool1
pool1 = self.create_pool()
pool1.connect(2)
pool1 = self.get_pool()

# Create cont1
cont1 = self.get_container(pool1)
@@ -85,8 +84,7 @@ def run_dm_obj_small(self, tool):
self.num_akeys_array, self.akey_sizes, self.akey_extents)

# Create pool2
pool2 = self.create_pool()
pool2.connect(2)
pool2 = self.get_pool()

# Clone cont1 to a new cont3 in pool2
result = self.run_datamover(
2 changes: 1 addition & 1 deletion src/tests/ftest/datamover/posix_meta_entry.py
@@ -67,7 +67,7 @@ def run_dm_posix_meta_entry(self, tool):
start_dfuse(self, dfuse)

# Create 1 pool
pool1 = self.create_pool()
pool1 = self.get_pool()

# Create 1 source container with test data
cont1 = self.get_container(pool1)
5 changes: 2 additions & 3 deletions src/tests/ftest/datamover/posix_preserve_props.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2023 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -56,8 +56,7 @@ def run_dm_preserve_props(self, tool, cont_type, api):
self.set_api(api)

# Create 1 pool
pool1 = self.create_pool()
pool1.connect(2)
pool1 = self.get_pool()

# set the path to read and write container properties
self.preserve_props_path = join(self.tmp, "cont_props.h5")
4 changes: 2 additions & 2 deletions src/tests/ftest/datamover/posix_subsets.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2023 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -54,7 +54,7 @@ def run_dm_posix_subsets(self, tool):
start_dfuse(self, dfuse)

# Create 1 pool
pool1 = self.create_pool()
pool1 = self.get_pool()

# create dfuse containers to test copying to dfuse subdirectories
dfuse_cont1 = self.get_container(pool1)
2 changes: 1 addition & 1 deletion src/tests/ftest/datamover/posix_symlinks.py
@@ -60,7 +60,7 @@ def run_dm_posix_symlinks(self, tool):
start_dfuse(self, dfuse)

# Create 1 pool
pool1 = self.create_pool()
pool1 = self.get_pool()

# Create a special container to hold UNS entries
uns_cont = self.get_container(pool1)
6 changes: 3 additions & 3 deletions src/tests/ftest/datamover/posix_types.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2023 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.
SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -68,8 +68,8 @@ def run_dm_posix_types(self, tool):
start_dfuse(self, dfuse)

# Create 2 pools
pool1 = self.create_pool(label='pool1')
pool2 = self.create_pool(label='pool2')
pool1 = self.get_pool(label='pool1')
pool2 = self.get_pool(label='pool2')

# Create a special container to hold UNS entries
uns_cont = self.get_container(pool1)
6 changes: 3 additions & 3 deletions src/tests/ftest/datamover/serial_large_posix.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2022 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -43,15 +43,15 @@ def run_dm_serial_large_posix(self, tool):
file_size = self.params.get("bytes", "/run/mdtest/*")

# Create pool1 and cont1
pool1 = self.create_pool()
pool1 = self.get_pool()
cont1 = self.get_container(pool1)

# Create a large directory in cont1
self.mdtest_cmd.write_bytes.update(file_size)
self.run_mdtest_with_params("DAOS", "/", pool1, cont1, flags=mdtest_flags[0])

# Create pool2
pool2 = self.create_pool()
pool2 = self.get_pool()

# Use dfuse as a shared intermediate for serialize + deserialize
dfuse_cont = self.get_container(pool1)
8 changes: 3 additions & 5 deletions src/tests/ftest/datamover/serial_small.py
@@ -1,5 +1,5 @@
'''
(C) Copyright 2020-2022 Intel Corporation.
(C) Copyright 2020-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
'''
@@ -56,8 +56,7 @@ def run_dm_serial_small(self, tool):
self.set_tool(tool)

# Create pool1
pool1 = self.create_pool()
pool1.connect(2)
pool1 = self.get_pool()

# Create cont1
cont1 = self.get_container(pool1)
@@ -69,8 +68,7 @@ def run_dm_serial_small(self, tool):
self.num_akeys_array, self.akey_sizes, self.akey_extents)

# Create pool2
pool2 = self.create_pool()
pool2.connect(2)
pool2 = self.get_pool()

# Serialize/Deserialize cont1 to a new cont2 in pool2
result = self.run_datamover(
4 changes: 2 additions & 2 deletions src/tests/ftest/deployment/basic_checkout.py
@@ -1,5 +1,5 @@
"""
(C) Copyright 2018-2023 Intel Corporation.
(C) Copyright 2018-2024 Intel Corporation.

SPDX-License-Identifier: BSD-2-Clause-Patent
"""
@@ -120,7 +120,7 @@ def test_basic_checkout_dm(self):
self.ior_ppn = self.ppn

# create pool and container
pool = self.create_pool()
pool = self.get_pool()
cont = self.get_container(pool, oclass=self.ior_cmd.dfs_oclass.value)

# run datamover
2 changes: 1 addition & 1 deletion src/tests/ftest/deployment/basic_checkout.yaml
@@ -70,7 +70,7 @@ mdtest_easy: &mdtest_easy_base
write_bytes: 0
num_of_files_dirs: 100000000
stonewall_timer: 30
stonewall_statusfile: "/var/tmp/daos_testing/stoneWallingStatusFile"
stonewall_statusfile: stoneWallingStatusFile
dfs_destroy: false
mdtest_dfs_s1:
<<: *mdtest_easy_base
2 changes: 0 additions & 2 deletions src/tests/ftest/deployment/disk_failure.py
@@ -119,7 +119,6 @@ def test_disk_failure_w_rf(self):
Test disk failures during the IO operation.

:avocado: tags=all,manual
:avocado: tags=hw,medium
:avocado: tags=deployment,disk_failure
:avocado: tags=DiskFailureTest,test_disk_failure_w_rf
"""
@@ -131,7 +130,6 @@ def test_disk_fault_to_normal(self):
Test a disk inducing faults and resetting is back to normal state.

:avocado: tags=all,manual
:avocado: tags=hw,medium
:avocado: tags=deployment,disk_failure
:avocado: tags=DiskFailureTest,test_disk_fault_to_normal
"""