Issues with new EPIC modulefiles #458
Comments
@danielabdi-noaa I'm able to replicate the same behavior you are encountering on Hera. The only way to submit the Looking at the old TCL modulefile for 4.5.12, I see:
before anything else is done. I don't see this in the new Lua modulefile for 4.12.0. Could this be the reason that this worked before, but not after moving to the EPIC maintained stack? While I'm unable to replicate what you are encountering on Orion, I think I have identified the issue. Looking at
No permissions are set for users who aren't using the EPIC role account or have access to the EPIC account. This is also an issue at the epic-ps level:
Tagging @natalie-perlin to see if she can change the permission of role-epic-ps and miniconda3 on Orion from:
to
and to see if she knows why the use of |
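For anyone who wants to check the permission problem described above without the role account, here is a minimal sketch in Python. It only inspects the "other" permission bits of a path; the function name `open_to_other_users` is made up for illustration, and the actual directories to check (for example the role-epic-ps and miniconda3 locations on Orion) are not spelled out here because the listings were not preserved in this thread.

```python
import os
import stat


def open_to_other_users(path):
    """Return True if users outside the owner and group can use `path`.

    Files need the 'other' read bit; directories also need the 'other'
    execute bit, since that is what allows entering them.
    """
    mode = os.stat(path).st_mode
    needed = stat.S_IROTH
    if stat.S_ISDIR(mode):
        needed |= stat.S_IXOTH
    return mode & needed == needed
```

A directory with mode 700 or 750 would return False here, which matches the symptom of a module that only loads for users with access to the EPIC account.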
@MichaelLueken I just tried Jet now and the same thing happens there too, so I have updated the list of affected machines. Note that this behaviour did not happen after the transition to Lua modulefiles, so I don't think it is related to TCL vs Lua modulefiles. If I go back 1 commit before #444 everything seems to work. The problem is exactly the same as before: it is using the wrong python3.
in
And the PATH has the /mnt/lfs4/HFIP/hfv3gfs/role.epic/miniconda3/4.12.0/condabin:/mnt/lfs4/HFIP/hfv3gfs/role.epic/miniconda3/4.12.0/bin:/apps/rocoto/1.3.3/bin:/apps/local/bin:/mnt/lfs4/HFIP/hfv3gfs/role.epic/miniconda3/4.12.0/envs/regional_workflow/bin:/lfs4/HFIP/hfv3gfs/nwprod/hpc-stack/libs/intel-2022.1.2/nccmp/1.8.9.0/bin:/lfs4/HFIP/hfv3gfs/nwprod/hpc-stack/libs/intel-2022.1.2/prod_util/1.2.2/bin: .... I think the conda activate-deactivate trick may solve it but I wish there was a better solution. |
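Since the symptom is the wrong python3 being picked up, one way to see it is to list every directory on PATH that provides python3, in search order. The sketch below is only an illustration; `providers_on_path` is a hypothetical helper, not part of the workflow. With the PATH quoted above, a python3 in miniconda3/4.12.0/bin would be found before one in envs/regional_workflow/bin simply because its directory comes first.

```python
import os
import shutil


def providers_on_path(name, path):
    """Return every directory on `path` holding an executable `name`, in search order."""
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    current_path = os.environ.get("PATH", "")
    # shutil.which reports the one that would actually be run.
    print("python3 resolves to:", shutil.which("python3", path=current_path))
    for rank, directory in enumerate(providers_on_path("python3", current_path)):
        print(rank, directory)
```

If more than one directory is listed, only the first is used, so the order in which the modulefile and conda prepend to PATH decides which interpreter runs.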
@danielabdi-noaa Sorry about that. When I was speaking about Lua vs TCL, I meant the miniconda3 modulefile, not the wflow_* modulefiles. In
before miniconda3 is even loaded, there is an unload. I'm wondering if this is the reason why there weren't issues on Hera and Jet previously. Before miniconda3 was loaded, it ensured that a previously loaded miniconda3 is being unloaded first. If this is done in
between:
would the expected behavior return? |
@MichaelLueken I see. I haven't looked at the miniconda3 modulefile itself but I do agree a modification there could potentially solve the problem. I've quickly tried to do unload("miniconda3") before loading it in |
@MichaelLueken This seems to work for me modifying
|
@danielabdi-noaa @MichaelLueken - The main issue with this approach is that the |
@MichaelLueken - permissions adjusted on Orion using
|
@natalie-perlin The
|
@danielabdi-noaa - |
@natalie-perlin Thanks! It works for me now. There was an odd message printed when I tried to activate conda, but it is most likely related to Orion being under maintenance, so it will probably go away afterwards |
@danielabdi-noaa - what was the message? Was there any relation to the module or virtual environment? |
@natalie-perlin It went away after a while. Orion has been very slow these days so I suspected that was the case.
|
Expected behavior
Current behavior
Running setup_WE2E_tests.sh is not possible if you already activated conda
Loading wflow_orion is no longer possible
Machines affected
Hera, Jet and Orion
Steps To Reproduce
a) On Orion, I can no longer load the wflow_orion module.
b) On Hera, if the user has already activated conda on the command line, running the test cases fails.
Previously it used to work whether you had activated conda or not.
This looks similar to the issue solved by --export=NONE, but this time we will have to prevent exporting of the environment to the testing script.
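One possible way to keep an already activated conda from leaking into the testing script is to launch it with a scrubbed environment. The following is a rough sketch under stated assumptions, not the fix adopted in this issue: `without_conda` is a hypothetical helper, and it assumes the activation left the usual CONDA_PREFIX and CONDA_EXE variables behind so the conda directories on PATH can be recognized.

```python
import os


def without_conda(env):
    """Return a copy of `env` without CONDA_* variables or conda PATH entries."""
    roots = []
    if env.get("CONDA_PREFIX"):
        roots.append(env["CONDA_PREFIX"])
    if env.get("CONDA_EXE"):
        # CONDA_EXE looks like <install root>/bin/conda; go up two levels.
        roots.append(os.path.dirname(os.path.dirname(env["CONDA_EXE"])))
    roots = [os.path.normpath(root) for root in roots]

    def is_conda_dir(directory):
        normalized = os.path.normpath(directory)
        return any(
            normalized == root or normalized.startswith(root + os.sep)
            for root in roots
        )

    clean = {key: value for key, value in env.items() if not key.startswith("CONDA_")}
    if "PATH" in clean:
        kept = [d for d in clean["PATH"].split(os.pathsep) if not is_conda_dir(d)]
        clean["PATH"] = os.pathsep.join(kept)
    return clean
```

The result could be passed as the `env=` argument of `subprocess.run` when starting the test script, so that the script's own module load is the only thing that puts conda on PATH. Whether the script still finds everything it needs after this scrubbing would have to be checked on each machine.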