DAOS-15960 tests: Improvements for io_sys_admin test #14503

saurabhtandan · 2024-06-04T01:29:44Z

Before requesting gatekeeper:

Two review approvals and any prior change requests have been resolved.
Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Commit messages follow the guidelines outlined here.
Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

Use different set of oclasses for ior and mdtest.
Add intercept option in mdtest_test_base.
Use interception for POSIX runs in the test.

Test-tag: pr test_io_sys_admin
Signed-off-by: Saurabh Tandan <[email protected]>

github-actions · 2024-06-04T01:30:00Z

Ticket title is 'io_sys_admin improvements'
Status is 'In Progress'
Labels: 'scrubbed_2.8,tds'
https://daosio.atlassian.net/browse/DAOS-15960

daosbuild1 · 2024-06-04T02:21:58Z

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14503/1/testReport/

daltonbohning

there are flake and pylint errors

daltonbohning · 2024-06-04T14:26:27Z

src/tests/ftest/util/mdtest_test_base.py

+    def run_mdtest(self, manager, processes, display_space=True, pool=None, out_queue=None,
+                   intercept=None):


We shouldn't add another option for this. The test itself could just do

mdtest_cmd.env.update(LD_PRELOAD=intercept, D_IL_REPORT='1')
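As a rough illustration of the suggestion above, the sketch below uses a plain dict as a stand-in for `mdtest_cmd.env` and only sets the interception variables when a library is supplied. The helper name `with_interception` and the library path are hypothetical and are not part of the ftest framework.

```python
def with_interception(env, intercept=None):
    """Return a copy of env with interception variables set when a library is given.

    Hypothetical helper: a plain dict stands in for the command's env object.
    """
    updated = dict(env)
    if intercept:
        # Same keys as the one-liner suggested in the review comment
        updated.update(LD_PRELOAD=intercept, D_IL_REPORT="1")
    return updated


base_env = {"PATH": "/usr/bin"}
# The library path below is a placeholder, not a value taken from the PR
posix_env = with_interception(base_env, "/usr/lib64/libioil.so")
dfs_env = with_interception(base_env)

print(posix_env)
print(dfs_env)
```

Keeping the update in the test rather than in `run_mdtest` means callers that do not pass a library are left untouched.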

daltonbohning · 2024-06-04T14:28:04Z

src/tests/ftest/deployment/io_sys_admin.yaml

+    - [SX, EC_2P1GX]
+    - [S1, EC_2P1G1]


Can we please split this into ior_object_class and mdtest_object_class so the config and logic is more clear?
And could we leave a comment saying something like file oclass, dir oclass?
Also, for mdtest we can use *X for the directory

Also, do we really intend to use SX/S1 for files and EC for directories?

Or is this not file, dir? I think splitting into ior_object_class and mdtest_object_class will make this easier to understand

daltonbohning · 2024-06-04T16:17:51Z

src/tests/ftest/util/file_count_test_base.py

+        for idx, _ in enumerate(object_class):
+            self.ior_cmd.dfs_oclass.update(ior_oclass[idx])
+            self.mdtest_cmd.dfs_oclass.update(mdtest_oclass[idx])
+            self.ior_cmd.dfs_dir_oclass.update(ior_oclass[idx])


This is enumerating object_class but then using the index like ior_oclass[idx] which I think is confusing. What if object_class is longer than ior_oclass?
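One way to avoid the index mismatch, sketched here under the assumption that the ior and mdtest lists are meant to be walked in lockstep, is to pair them explicitly and fail early when the lengths differ. The helper name `paired_oclasses` and the sample values are illustrative only.

```python
def paired_oclasses(ior_oclass, mdtest_oclass):
    """Pair ior and mdtest object classes, refusing lists of different lengths.

    Hypothetical helper: it only illustrates iterating both lists together.
    """
    if len(ior_oclass) != len(mdtest_oclass):
        raise ValueError(
            "ior_oclass has {} entries but mdtest_oclass has {}".format(
                len(ior_oclass), len(mdtest_oclass)))
    return list(zip(ior_oclass, mdtest_oclass))


# Illustrative values; the real lists come from the test yaml
pairs = paired_oclasses(["SX", "EC_2P1GX"], ["S1", "EC_2P1G1"])
for ior_class, mdtest_class in pairs:
    print(ior_class, mdtest_class)
```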

daosbuild1 · 2024-06-05T14:33:47Z

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14503/1/execution/node/1421/log

daltonbohning · 2024-06-11T14:53:26Z

src/tests/ftest/util/file_count_test_base.py

+        if dir_oclass:
+            container.dir_oclass.update(dir_oclass)


This might be bad if we do, e.g., container create --oclass S1 because then the DFS root object will be S1. Which means ALL client ranks will read root from a single server target.

We should instead do

container create --file-oclass S1

If you just want to change the file oclass
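To make the distinction concrete, here is a small sketch that builds only the option fragment for the two flags mentioned above. The helper `container_create_options` is hypothetical and is not how the ftest container class builds its command.

```python
def container_create_options(oclass=None, file_oclass=None):
    """Build the option fragment for the two flags discussed in the review.

    Hypothetical helper: it does not run any command, it only assembles arguments.
    """
    options = []
    if oclass:
        # Per the review comment, this also decides the class of the DFS root object
        options += ["--oclass", oclass]
    if file_oclass:
        # Changes the class used for files only
        options += ["--file-oclass", file_oclass]
    return options


print(container_create_options(oclass="S1"))
print(container_create_options(file_oclass="S1"))
```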

daltonbohning

There are some lint errors that need to be resolved before merging

daltonbohning · 2024-08-05T21:13:44Z

src/tests/ftest/util/file_count_test_base.py

+            file_oclass: file object class of container
+            dir_oclass: dir object class of container


Suggested change

-            file_oclass: file object class of container
-            dir_oclass: dir object class of container
+            file_oclass (str, optional): file object class of container. Defaults to None
+            dir_oclass (str, optional): dir object class of container. Defaults to None

daltonbohning · 2024-08-05T21:15:18Z

src/tests/ftest/util/file_count_test_base.py

+        """
+        rd_fac: redundancy factor
+
+        Returns: value for dir_oclass
+        """


Suggested change

-        """
-        rd_fac: redundancy factor
-
-        Returns: value for dir_oclass
-        """
+        """
+        Args:
+            rd_fac (int): redundancy factor
+
+        Returns:
+            str: value for dir_oclass
+        """

daltonbohning · 2024-08-05T21:19:54Z

src/tests/ftest/util/file_count_test_base.py

                    try:
                        self.processes = mdtest_np
                        self.ppn = mdtest_ppn
-                        self.execute_mdtest()
+                        if self.mdtest_cmd.api.value == 'POSIX':
+                            self.mdtest_cmd.env.update(LD_PRELOAD=intercept, D_IL_REPORT='1')


This is going to force all tests using this test base to use interception. Is that really what we want?

saurabhtandan · Aug 5, 2024

Yeah, that's what we want as of now.

daltonbohning · 2024-08-05T21:20:32Z

src/tests/ftest/util/file_count_test_base.py

@@ -95,6 +117,8 @@ def run_file_count(self):
                        self.ior_cmd.api.update('HDF5')
                        self.run_ior_with_pool(
                            create_pool=False, plugin_path=hdf5_plugin_path, mount_dir=mount_dir)
+                    elif self.ior_cmd.api.value == 'POSIX':
+                        self.run_ior_with_pool(create_pool=False, intercept=intercept)


Similar - this is going to force all tests using this test base to use interception

saurabhtandan · Aug 5, 2024

Yeah, that's what we want as of now.

daosbuild1 · 2024-08-06T11:17:54Z

Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14503/4/testReport/

daltonbohning · 2024-08-06T13:44:51Z

src/tests/ftest/util/file_count_test_base.py

+            if api == "DFS":
+                self.mdtest_cmd.test_dir.update("/")
+            if self.mdtest_cmd.api.value in ['DFS', 'POSIX']:
+                for oclass in mdtest_oclass:


Maybe it would be better if the config did this?

mdtest_oclass:  # file, dir
  - [S1, SX]
  - [EC_2P1G1, RP_2GX]

And then the code could do

for file_oclass, dir_oclass in mdtest_oclass:

Which would

Allow the dir oclass to be configurable without modifying the code

Make it easier to understand what is being run when looking at the config. Right now you have to dig into the code because it's hardcoded
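A minimal sketch of the loop suggested above, assuming the config has already been loaded into a list of `[file, dir]` pairs. The function name `describe_runs` is made up for illustration, and the real test would apply each pair to the mdtest command rather than collect dicts.

```python
# Mirrors the suggested config: each entry is [file oclass, dir oclass]
mdtest_oclass = [["S1", "SX"], ["EC_2P1G1", "RP_2GX"]]


def describe_runs(oclass_pairs):
    """Unpack each (file, dir) pair the way the suggested for-loop would."""
    runs = []
    for file_oclass, dir_oclass in oclass_pairs:
        runs.append({"file_oclass": file_oclass, "dir_oclass": dir_oclass})
    return runs


runs = describe_runs(mdtest_oclass)
for run in runs:
    print(run)
```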

As we discussed, let's ignore this for now since you will be OOO soon

daltonbohning

This introduces a test failure
https://build.hpdd.intel.com/blue/organizations/jenkins/daos-stack%2Fdaos/detail/PR-14503/4/tests

Because this function is hardcoded to use -F

daos/src/tests/ftest/util/data_mover_test_base.py

Lines 1030 to 1032 in 8896868

    
        self.run_ior_with_params(
            "DAOS", daos_path, read_back_pool, read_back_cont,
            flags="-r -R -F -k")

But this PR removed -F

daos/src/tests/ftest/io/small_file_count.yaml

Line 43 in 9afa882

flags: "-v -D 300 -W -w -r -R"

One solution is to update that function to be dynamic instead of hardcoded

# Original flags used for write
flags = self.ior_cmd.flags.value

# Remove read and write from flags if present
flags = re.sub(" *-r", "", flags)
flags = re.sub(" *-R", "", flags)
flags = re.sub(" *-w", "", flags)
flags = re.sub(" *-W", "", flags)

# Remove stonewall
flags = re.sub(" *-D [0-9]+", "", flags)

# Add read flags
flags += " -r -R"


self.run_ior_with_params(
    "DAOS", daos_path, read_back_pool, read_back_cont,
    flags=flags)
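The flag rewriting above can be tried in isolation. The sketch below is a standalone variant, not the code that landed: the function name `to_read_flags` is hypothetical, and the patterns are anchored on whitespace so that only whole options are removed.

```python
import re


def to_read_flags(write_flags):
    """Turn the ior flags used for the write phase into read-back flags.

    Hypothetical helper: drops read/write options and the stonewall option,
    then appends the read options.
    """
    flags = write_flags
    # Remove read and write options if present, matching whole options only
    for opt in ("-r", "-R", "-w", "-W"):
        flags = re.sub(r"(?:^|\s+)" + re.escape(opt) + r"(?=\s|$)", "", flags)
    # Remove the stonewall option and its numeric argument
    flags = re.sub(r"(?:^|\s+)-D\s+[0-9]+(?=\s|$)", "", flags)
    # Add the read options back
    return (flags.strip() + " -r -R").strip()


print(to_read_flags("-v -D 300 -W -w -r -R"))
print(to_read_flags("-w -F -k"))
```

With the flags from small_file_count.yaml this keeps `-v` and drops the rest, so whether `-F` is used for the read follows from whatever the write phase used.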

daltonbohning

LGTM assuming tests pass

Use different set of oclasses for ior and mdtest.
Add intercept option in mdtest_test_base.
Use interception for POSIX runs in the test.

Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount test_basic_checkout_dm
Signed-off-by: Saurabh Tandan <[email protected]>

Use different set of oclasses for ior and mdtest.
Add intercept option in mdtest_test_base.
Use interception for POSIX runs in the test.

Signed-off-by: Saurabh Tandan <[email protected]>

saurabhtandan added 2 commits June 4, 2024 01:22

DAOS-15960 tests: Improvements for io_sys_admin test

c0815fa

Use different set of oclasses for ior and mdtest.
Add intercept option in mdtest_test_base.
Use interception for POSIX runs in the test.

Test-tag: pr test_io_sys_admin
Signed-off-by: Saurabh Tandan <[email protected]>

DAOS-15960 tests: Improvements for io_sys_admin test

5e3b4e9

Use different set of oclasses for ior and mdtest.
Add intercept option in mdtest_test_base.
Use interception for POSIX runs in the test.

Test-tag: pr test_io_sys_admin
Signed-off-by: Saurabh Tandan <[email protected]>

saurabhtandan changed the title ~~Standan/daos 15960~~ DAOS-15960 tests: Improvements for io_sys_admin test Jun 4, 2024

saurabhtandan requested a review from daltonbohning June 4, 2024 01:30

daltonbohning reviewed Jun 4, 2024

daltonbohning requested changes Jun 11, 2024

saurabhtandan added 2 commits July 22, 2024 22:00

Merge branch 'master' into standan/DAOS-15960

02a4f02

COmmit changes so far

7f03376

Signed-off-by: Saurabh Tandan <[email protected]>

shimizukko mentioned this pull request Aug 1, 2024

DAOS-15960 test: Makito's version of io_sys_admin test improvement #14858

Draft

18 tasks

saurabhtandan added 3 commits August 5, 2024 20:58

Modified file_count_test_base.py structure

1b41263

Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount
Signed-off-by: Saurabh Tandan <[email protected]>

Merge branch 'master' into standan/DAOS-15960

8515644

Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount

Cleaned up some files

a66bdbd

Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount
Signed-off-by: Saurabh Tandan <[email protected]>

daltonbohning reviewed Aug 5, 2024

Incorporated review suggestions

9afa882

Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount
Signed-off-by: Saurabh Tandan <[email protected]>

daltonbohning approved these changes Aug 6, 2024

daltonbohning requested changes Aug 7, 2024

Fix for a CI failure due to changes

c7e88ba

Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount test_basic_checkout_dm
Signed-off-by: Saurabh Tandan <[email protected]>

saurabhtandan force-pushed the standan/DAOS-15960 branch from e035dea to c7e88ba Compare August 7, 2024 20:26

daltonbohning reviewed Aug 8, 2024

saurabhtandan requested review from daltonbohning, shimizukko and mjean308 August 8, 2024 16:38

daltonbohning approved these changes Aug 8, 2024

mjean308 approved these changes Aug 8, 2024

daltonbohning marked this pull request as ready for review August 8, 2024 18:53

daltonbohning requested review from a team as code owners August 8, 2024 18:53

daltonbohning added the forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed. label Aug 8, 2024

daltonbohning requested a review from a team August 8, 2024 18:58

daltonbohning merged commit a755552 into master Aug 8, 2024
55 checks passed

daltonbohning deleted the standan/DAOS-15960 branch August 8, 2024 18:58

cdavis28 mentioned this pull request Aug 12, 2024

Merge upstream/release/2.6 into upstream/google/2.6 #14916

Merged

mjmac mentioned this pull request Nov 13, 2024

mjmac/DAOS 16787 google 2.6 #15498

Closed
