DAOS-15960 tests: Improvements for io_sys_admin test #14503
Use different set of oclasses for ior and mdtest. Add intercept option in mdtest_test_base. Use interception for POSIX runs in the test.
Test-tag: pr test_io_sys_admin
Signed-off-by: Saurabh Tandan <[email protected]>
Ticket title is 'io_sys_admin improvements'
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14503/1/testReport/
There are flake8 and pylint errors.
def run_mdtest(self, manager, processes, display_space=True, pool=None, out_queue=None,
               intercept=None):
We shouldn't add another option for this. The test itself could just do
mdtest_cmd.env.update(LD_PRELOAD=intercept, D_IL_REPORT='1')
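For illustration, here is a minimal sketch of what that one-line update does, using a plain dict in place of the real command environment object. The helper name `interception_env` and the library path are hypothetical; only `LD_PRELOAD` and `D_IL_REPORT` come from the comment above.

```python
def interception_env(intercept_lib, base_env=None):
    """Return a copy of base_env with the interception variables set.

    Hypothetical helper: a plain dict stands in for the command's env object.
    """
    env = dict(base_env or {})
    # Same keyword-style update as in the review comment.
    env.update(LD_PRELOAD=intercept_lib, D_IL_REPORT="1")
    return env


# The library path below is a placeholder, not a real install location.
example_env = interception_env("/hypothetical/path/libioil.so", {"PATH": "/usr/bin"})
```

Because the update happens in the test itself, the shared run_mdtest signature would not need a new parameter.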
- [SX, EC_2P1GX]
- [S1, EC_2P1G1]
Can we please split this into ior_object_class and mdtest_object_class so the config and logic are clearer?
And could we leave a comment saying something like "file oclass, dir oclass"?
Also, for mdtest we can use *X for the directory.
Also, do we really intend to use SX/S1 for files and EC for directories?
Or is this not file, dir? I think splitting into ior_object_class and mdtest_object_class will make this easier to understand
for idx, _ in enumerate(object_class):
    self.ior_cmd.dfs_oclass.update(ior_oclass[idx])
    self.mdtest_cmd.dfs_oclass.update(mdtest_oclass[idx])
    self.ior_cmd.dfs_dir_oclass.update(ior_oclass[idx])
This is enumerating object_class but then using the index like ior_oclass[idx], which I think is confusing. What if object_class is longer than ior_oclass?
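One way to avoid indexing one list with another list's length is to pair the lists explicitly and fail early on a mismatch. This is only a sketch: `pair_oclasses` is a hypothetical helper and the values are taken from the config lines above purely as sample data.

```python
def pair_oclasses(ior_oclass, mdtest_oclass):
    """Pair ior and mdtest object classes, rejecting lists of unequal length."""
    if len(ior_oclass) != len(mdtest_oclass):
        raise ValueError(
            f"oclass lists differ in length: {len(ior_oclass)} vs {len(mdtest_oclass)}")
    return list(zip(ior_oclass, mdtest_oclass))


# Sample data only; the real lists would come from the test yaml.
for ior_cls, mdtest_cls in pair_oclasses(["SX", "EC_2P1GX"], ["S1", "EC_2P1G1"]):
    print(f"ior oclass={ior_cls}, mdtest oclass={mdtest_cls}")
```

With this shape the loop never touches an index, so an IndexError from a longer list cannot occur silently mid-run.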
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14503/1/execution/node/1421/log
if dir_oclass:
    container.dir_oclass.update(dir_oclass)
This might be bad if we do, e.g., container create --oclass S1, because then the DFS root object will be S1, which means ALL client ranks will read the root from a single server target.
We should instead do
container create --file-oclass S1
if you just want to change the file oclass.
Signed-off-by: Saurabh Tandan <[email protected]>
Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount Signed-off-by: Saurabh Tandan <[email protected]>
Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount
There are some lint errors that need to be resolved before merging
file_oclass: file object class of container
dir_oclass: dir object class of container
- file_oclass: file object class of container
- dir_oclass: dir object class of container
+ file_oclass (str, optional): file object class of container. Defaults to None
+ dir_oclass (str, optional): dir object class of container. Defaults to None
"""
rd_fac: redundancy factor

Returns: value for dir_oclass
"""
- """
- rd_fac: redundancy factor
- Returns: value for dir_oclass
- """
+ """
+ Args:
+     rd_fac (int): redundancy factor
+ Returns:
+     str: value for dir_oclass
+ """
try:
    self.processes = mdtest_np
    self.ppn = mdtest_ppn
    self.execute_mdtest()
    if self.mdtest_cmd.api.value == 'POSIX':
        self.mdtest_cmd.env.update(LD_PRELOAD=intercept, D_IL_REPORT='1')
This is going to force all tests using this test base to use interception. Is that really what we want?
Yeah that's what we want as of now.
@@ -95,6 +117,8 @@ def run_file_count(self):
        self.ior_cmd.api.update('HDF5')
        self.run_ior_with_pool(
            create_pool=False, plugin_path=hdf5_plugin_path, mount_dir=mount_dir)
    elif self.ior_cmd.api.value == 'POSIX':
        self.run_ior_with_pool(create_pool=False, intercept=intercept)
Similar - this is going to force all tests using this test base to use interception
Yeah that's what we want as of now.
Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount Signed-off-by: Saurabh Tandan <[email protected]>
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14503/4/testReport/
if api == "DFS":
    self.mdtest_cmd.test_dir.update("/")
if self.mdtest_cmd.api.value in ['DFS', 'POSIX']:
    for oclass in mdtest_oclass:
Maybe it would be better if the config did this?
mdtest_oclass:
  # file, dir
  - [S1, SX]
  - [EC_2P1G1, RP_2GX]
And then the code could do
for file_oclass, dir_oclass in mdtest_oclass:
Which would
- Allow the dir oclass to be configurable without modifying the code
- Make it easier to understand what is being run when looking at the config. Right now you have to dig into the code because it's hardcoded
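As a rough sketch of that loop, assuming the yaml has already been loaded into a list of [file, dir] pairs (the list literal below mirrors the proposed config, and `describe_runs` is a hypothetical helper used only to show the unpacking):

```python
# Mirrors the proposed yaml: each entry is [file oclass, dir oclass].
mdtest_oclass = [
    ["S1", "SX"],
    ["EC_2P1G1", "RP_2GX"],
]


def describe_runs(pairs):
    """Return one description per (file_oclass, dir_oclass) pair."""
    runs = []
    for file_oclass, dir_oclass in pairs:
        # In the real test this is where the mdtest command would be updated.
        runs.append(f"file={file_oclass} dir={dir_oclass}")
    return runs


for line in describe_runs(mdtest_oclass):
    print(line)
```

Unpacking in the for statement also raises a ValueError if a config entry does not have exactly two items, which surfaces config mistakes early.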
As we discussed, let's ignore this for now since you will be OOO soon
This introduces a test failure
https://build.hpdd.intel.com/blue/organizations/jenkins/daos-stack%2Fdaos/detail/PR-14503/4/tests
Because this function is hardcoded to use -F:
daos/src/tests/ftest/util/data_mover_test_base.py
Lines 1030 to 1032 in 8896868
self.run_ior_with_params(
    "DAOS", daos_path, read_back_pool, read_back_cont,
    flags="-r -R -F -k")
But this PR removed -F:
flags: "-v -D 300 -W -w -r -R"
One solution is to update that function to be dynamic instead of hardcoded:
# Original flags used for write
flags = self.ior_cmd.flags.value
# Remove read and write from flags if present
flags = re.sub(" *-r", "", flags)
flags = re.sub(" *-R", "", flags)
flags = re.sub(" *-w", "", flags)
flags = re.sub(" *-W", "", flags)
# Remove stonewall
flags = re.sub(" *-D [0-9]+", "", flags)
# Add read flags
flags += " -r -R"
self.run_ior_with_params(
    "DAOS", daos_path, read_back_pool, read_back_cont,
    flags=flags)
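The same idea can be sketched as a standalone function that works on whitespace-separated tokens instead of regular expressions. This is an alternative illustration, not the code from the PR: `read_back_flags` is a hypothetical name, and it assumes -D always takes exactly one value, as in the flags string quoted above.

```python
import shlex


def read_back_flags(write_flags):
    """Derive read-back ior flags from the flags used for the write."""
    kept = []
    skip_next = False
    for token in shlex.split(write_flags):
        if skip_next:
            # This token is the value that followed -D.
            skip_next = False
            continue
        if token in ("-r", "-R", "-w", "-W"):
            # Drop read and write flags; read flags are re-added below.
            continue
        if token == "-D":
            # Drop the stonewall option together with its value.
            skip_next = True
            continue
        kept.append(token)
    return " ".join(kept + ["-r", "-R"])


print(read_back_flags("-v -D 300 -W -w -r -R"))
```

Matching whole tokens avoids leaving a leading space when the first flag is removed, and it cannot accidentally strip part of a longer option that merely starts with -r or -w.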
Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount test_basic_checkout_dm Signed-off-by: Saurabh Tandan <[email protected]>
Force-pushed from e035dea to c7e88ba
LGTM assuming tests pass
Use different set of oclasses for ior and mdtest. Add intercept option in mdtest_test_base. Use interception for POSIX runs in the test. Test-tag: pr test_io_sys_admin test_largefilecount test_smallfilecount test_basic_checkout_dm Signed-off-by: Saurabh Tandan <[email protected]>
Use different set of oclasses for ior and mdtest. Add intercept option in mdtest_test_base. Use interception for POSIX runs in the test. Signed-off-by: Saurabh Tandan <[email protected]>
Before requesting gatekeeper:
- Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: