Avoid using subprocess.run() in FSURDATMODIFYCTSM #2125
For tests that invoke cmds_to_setup_conda(), manually running the script that calls that function (e.g., case.build for FSURDATMODIFYCTSM) can fail if a conda environment is already activated. The problem is that conda run -n ctsm_pylib does not seem to actually use ctsm_pylib if, for instance, the conda base environment is active. Running CONDA_PREFIX= conda run -n ctsm_pylib instead seems to work.
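A minimal sketch of that workaround from Python, assuming conda is on PATH and that an empty CONDA_PREFIX is what makes conda run pick the requested environment (as observed above, not verified against conda's documentation). The helper names are hypothetical and are not part of CTSM or CIME:

```python
import os
import subprocess


def env_with_cleared_conda_prefix(environ):
    """Return a copy of environ with CONDA_PREFIX set to an empty string.

    This mirrors the shell form `CONDA_PREFIX= conda run ...`, which sets the
    variable to empty rather than unsetting it.
    """
    env = dict(environ)
    env["CONDA_PREFIX"] = ""
    return env


def conda_run_command(env_name, command):
    """Build the argument list for `conda run -n <env_name> <command...>`."""
    return ["conda", "run", "-n", env_name] + list(command)


def run_in_conda_env(env_name, command):
    """Run command via conda run, ignoring any already-active conda environment."""
    return subprocess.run(
        conda_run_command(env_name, command),
        env=env_with_cleared_conda_prefix(os.environ),
        check=True,
    )
```

For example, run_in_conda_env("ctsm_pylib", ["python", "--version"]) would be the equivalent of the shell command quoted above.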
This avoids using subprocess.run(), which should hopefully reduce issues related to user environment. However, it does require that all the Python dependencies are loaded. This can be accomplished by activating the ctsm_pylib environment before calling run_sys_tests or cime/scripts/create_test.
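To illustrate the difference, here is a rough sketch of calling a command-line tool's entry point in the current interpreter instead of spawning a subprocess. It assumes the tool exposes a main() function that reads sys.argv; the actual entry point of fsurdat_modifier is not shown in this thread, so main_func below is a stand-in:

```python
import sys
from contextlib import contextmanager


@contextmanager
def patched_argv(argv):
    """Temporarily replace sys.argv, restoring the original afterwards."""
    saved = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        sys.argv = saved


def call_in_process(main_func, argv):
    """Call main_func() in this interpreter as if it had been run with argv.

    No new shell or Python process is started, so the call uses whatever
    packages the current interpreter can already import.
    """
    with patched_argv(argv):
        return main_func()
```

The trade-off is the one noted above: nothing activates a conda environment on the tool's behalf, so the calling process must already have the dependencies available.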
Before I continue down this path, I want to make sure FSURDATMODIFYCTSM is working for everyone. It does work for me, but I didn't experience either of the previous issues. @adrifoster @glemieux Would one of y'all be willing to test this to see if you hit issue #2109? Specifically, what we need to see is that Note that you must have all the requisite Python dependencies loaded. To ensure this, please test with the
…sts. Specifically, FSURDATMODIFYCTSM and RXCROPMATURITY.
I tried it again and the build phase works! However, it failed at the run phase, because numpy wasn't loaded. This is because the run phase is sent to the share queue as a separate thing, so it probably needs to load the conda env again. I also tried running case.submit with --no-batch, but that didn't work either. I think it might still spawn off a separate process even in that case, although it doesn't go into the queue. I'm actually also suspicious that there is a bug in cime for no-batch. I also tried Derecho, but that had multiple problems at this point. My case for Cheyenne is here: /glade/scratch/erik/tests_0901-095751ch/FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel.0901-095751ch
Unfortunately I'm seeing this as well. I guess the run phase re-imports I've unchecked "works for me" in the PR description.
By the way @samsrabin thanks for working on this. It's tricky for all of us, but important to get figured out. If it's helpful to get some of us together to brainstorm let us know.
This avoids "numpy not found" error for FSURDATMODIFYCTSM, but this isn't a solution for RXCROPMATURITY, because that test actually does need the right conda environment during the run phase (which is when generate_gdds.py is called).
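Since the "numpy not found" failures only surface partway through a test, one option is to check for the required packages up front and fail with a clear message. This is only a sketch of that idea, not what the PR does; the function names are made up, and it handles top-level module names only:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_modules(names):
    """Raise a descriptive error if any of the named modules is unavailable."""
    missing = missing_modules(names)
    if missing:
        raise RuntimeError(
            "Missing Python modules: "
            + ", ".join(missing)
            + ". Is the ctsm_pylib conda environment activated?"
        )
```

A test's build or run phase could call require_modules(["numpy"]) before doing any real work, so that a missing environment is reported directly instead of as an import error deep inside the tool.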
Awesome. Latest update now works for me! PASS FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel CREATE_NEWCASE
# Import the CTSM Python utilities
_CTSM_PYTHON = os.path.join(
    os.path.dirname(os.path.realpath(__file__)), os.pardir, os.pardir, os.pardir, "python"
)
sys.path.insert(1, _CTSM_PYTHON)
import ctsm.crop_calendars.cropcal_utils as utils
Can you speak to why this is needed? In other places we can import stuff throughout the ctsm python library without doing path manipulation - e.g., see job_launcher_qsub.py (for an example in a subdirectory of python/ctsm, like this one is). Is it possible that you need an __init__.py in the crop_calendars directory to enable this?
Good point; not necessary! I had that bit in my head from adding it to fsurdatmodifyctsm.py, but that was only necessary because it's not in python/. My latest commit removes these unneeded lines.
Cool, thanks.
Wait I still see the path logic, am I missing something?
This is the thing we should discuss as a group.
It's looking like this solution can't be applied to RXCROPMATURITY because it calls external Python scripts during the run phase; see discussion here. I'm leaning towards bringing in the FSURDATMODIFYCTSM solution now, hoping that Derecho magically fixes things for RXCROPMATURITY. I'm not sure banging my head against this any more is worth it for a test that hardly anyone will run.
That makes sense to me @samsrabin. The RxCropMaturity test will only be run in the ctsm_sci test list right? Or maybe another special test list right? We also might want to try this out on Derecho and see if it's a problem there. If it isn't there isn't a need to worry about this for Cheyenne. Some of these things can be difficult to figure out, so there's a point when we should just punt and move on...
That's right @ekluzek, it's only run in the I agree that it's worth testing on Derecho, but I guess we have some work to do before that's possible.
This commit improves organization of cmds_to_setup_conda() and tries to fall back to the original "conda activate" method if "conda run" fails.
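The retry idea in this commit can be sketched roughly as follows. The real cmds_to_setup_conda() is not shown in this thread and may work quite differently (for instance by assembling shell command strings), so treat run_with_fallback and its return convention as hypothetical:

```python
import subprocess
import sys


def _run(cmd):
    """Run cmd and return its CompletedProcess, or None if the executable is missing."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return None


def run_with_fallback(primary, fallback):
    """Try primary first; if it is missing or exits non-zero, try fallback.

    Returns a (label, CompletedProcess) tuple where label is "primary" or
    "fallback". Raises RuntimeError if neither command succeeds.
    """
    result = _run(primary)
    if result is not None and result.returncode == 0:
        return "primary", result
    result = _run(fallback)
    if result is None or result.returncode != 0:
        raise RuntimeError("both the primary and the fallback command failed")
    return "fallback", result


if __name__ == "__main__":
    # Stand-in commands: the first always fails, the second always succeeds.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    working = [sys.executable, "-c", "print('ok')"]
    label, outcome = run_with_fallback(failing, working)
    print(label, outcome.stdout.strip())
```

In the PR's terms, primary would correspond to the "conda run" approach and fallback to the original "conda activate" approach.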
Those are necessary for when the crop calendar scripts are being called on their own, from outside the CTSM repo. This reverts commit 9893c80.
OK, I have at least one suggestion about a comment. And a question about the test_conda_retry logical. I also reopened the conversation Bill had about the CTSM path stuff, which looks like it's still there. And I think it can be removed in that place as well as others. In any case I'm looking forward to our discussion to go over this.
@samsrabin and I went over this earlier. And he made changes according to my suggestions.
The one thing that I thought we should discuss as a group is how to handle setting the Python path for these types of system tests, which now need to manipulate the path. We didn't have this before because the top-level tool skeleton handled it, so there it's in one place. Here we need a better way to put it in one place. One way to do that would be to set the path for Python using an env variable. I can think of other ways to do it as well. We should decide as a group and then make an issue to change it to that method.
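As one possible shape for the env-variable option, here is a small sketch. The variable name CTSM_PYTHON_DIR and both helper names are invented for illustration; nothing in this thread settles on a name or a mechanism:

```python
import os
import sys


def resolve_ctsm_python_dir(environ, default_dir, var_name="CTSM_PYTHON_DIR"):
    """Return the directory named by the env variable, else default_dir.

    var_name is a hypothetical variable name used only for this sketch.
    """
    value = environ.get(var_name, "").strip()
    return os.path.abspath(value if value else default_dir)


def ensure_on_sys_path(directory, path_list=None, position=1):
    """Insert directory into path_list (default: sys.path) if not already present."""
    if path_list is None:
        path_list = sys.path
    if directory not in path_list:
        path_list.insert(position, directory)
    return path_list
```

A system test could then call ensure_on_sys_path(resolve_ctsm_python_dir(os.environ, default_dir)) once, with default_dir computed relative to the test file as in the snippet under review, so the path logic lives in a single helper rather than being repeated per test.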
* Add system and unit tests for making fsurdat with all crops everywhere (#2081)
* Rework master_list* files etc. (#2087)
* Fixes to methane Tech Note (#2091)
* Add is_doy_in_interval() function (#2158)
* Avoid using subprocess.run() in FSURDATMODIFYCTSM (#2125)
Closes issues:
* Add unit test for making fsurdat with all crops everywhere (#2079)
* Rework master_list_(no)?fates.rst? (#2083)
* conda run -n can fail if a conda environment is already active (#2109)
* conda fails to load for SystemTests (#2111)
b4b changes to Python scripts, history lists, tech note, and clm_time_manager.
* Add system and unit tests for making fsurdat with all crops everywhere (ESCOMP#2081)
* Rework master_list* files etc. (ESCOMP#2087)
* Fixes to methane Tech Note (ESCOMP#2091)
* Add is_doy_in_interval() function (ESCOMP#2158)
* Avoid using subprocess.run() in FSURDATMODIFYCTSM (ESCOMP#2125)
Closes issues:
* Add unit test for making fsurdat with all crops everywhere (ESCOMP#2079)
* Rework master_list_(no)?fates.rst? (ESCOMP#2083)
* conda run -n can fail if a conda environment is already active (ESCOMP#2109)
* conda fails to load for SystemTests (ESCOMP#2111)
b4b changes to Python scripts, history lists, tech note, and clm_time_manager.
* Add system and unit tests for making fsurdat with all crops everywhere (ESCOMP#2081)
* Rework master_list* files etc. (ESCOMP#2087)
* Fixes to methane Tech Note (ESCOMP#2091)
* Add is_doy_in_interval() function (ESCOMP#2158)
* Avoid using subprocess.run() in FSURDATMODIFYCTSM (ESCOMP#2125)
Closes issues:
* Add unit test for making fsurdat with all crops everywhere (ESCOMP#2079)
* Rework master_list_(no)?fates.rst? (ESCOMP#2083)
* conda run -n can fail if a conda environment is already active (ESCOMP#2109)
* conda fails to load for SystemTests (ESCOMP#2111)
# Conflicts:
# src/biogeochem/CNBalanceCheckMod.F90
# src/biogeochem/CNCIsoFluxMod.F90
# src/biogeochem/CNDriverMod.F90
# src/biogeochem/CNPhenologyMod.F90
# src/biogeochem/CNProductsMod.F90
# src/biogeochem/CNVegCarbonFluxType.F90
# src/biogeochem/CNVegNitrogenFluxType.F90
# src/biogeochem/EDBGCDynMod.F90
# src/main/clm_initializeMod.F90
# src/main/controlMod.F90
# src/soilbiogeochem/SoilBiogeochemDecompCascadeBGCMod.F90
Issues #2109 and #2111 seem to stem from idiosyncrasies of different users' Cheyenne environments that become important when subprocess.run() is called. This PR removes the use of subprocess.run() from the FSURDATMODIFYCTSM test, improving robustness and resolving those issues.
Description of changes
Instead of starting a new subprocess shell in which fsurdat_modifier is called, this PR makes it so the command is called directly by Python itself. This does require that all the Python dependencies are loaded, which can be accomplished by activating the ctsm_pylib environment before calling run_sys_tests or cime/scripts/create_test.
Unfortunately, this method can't be used to fix RXCROPMATURITY failing for some users, even though that's also due to environment. Hopefully that will resolve itself once we move to Derecho.
Specific notes
Remaining tasks (not including testing):
fsurdat_modifier directly.
Contributors other than yourself, if any:
CTSM Issues Fixed:
Are answers expected to change (and if so in what way)? No.
Any User Interface Changes (namelist or namelist defaults changes)? No.
Testing
FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel
:clm_pymods
test suite passes.