Derecho transition: Tests and test infrastructure #1995
Comments
I took the liberty of adding some items to your list; I hope you don't mind. |
Derecho (derecho.hpc.ucar.edu) will be available to all next week, but I am able to log in today. We talked about Derecho in CSEG, so I'm updating the above list. Some points:
|
Note that intel-classic and intel will be deprecated, and intel-oneapi will eventually be the only option. intel-oneapi uses the LLVM-based compilers for both C and Fortran. |
I ran my first test on derecho just to see what would happen: SMS_D_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel-oneapi.clm-default. The default PE layout becomes a problem immediately, with this error in cesm.log:
The default PE layout is for 256 processors, which is 2 nodes. So I'll try with one node. |
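As a rough illustration of the arithmetic above, here is a minimal sketch that converts a task count into a node count. It assumes 128 cores per Derecho node, which is only inferred from "256 processors which is 2 nodes" in this thread; `nodes_needed` and `CORES_PER_NODE` are hypothetical names, not part of CIME or ccs_config.

```python
import math

# Assumption: 128 cores per node, inferred from this thread
# ("256 processors which is 2 nodes"), not read from any machine config.
CORES_PER_NODE = 128


def nodes_needed(ntasks, nthreads=1, cores_per_node=CORES_PER_NODE):
    """Return how many whole nodes ntasks x nthreads cores would occupy."""
    if ntasks < 1 or nthreads < 1 or cores_per_node < 1:
        raise ValueError("ntasks, nthreads and cores_per_node must be positive")
    # Round up: a partially used node still counts as a full node.
    return math.ceil(ntasks * nthreads / cores_per_node)


if __name__ == "__main__":
    for ntasks in (256, 128):
        print(ntasks, "tasks ->", nodes_needed(ntasks), "node(s)")
```

With these assumptions, the default 256-task layout maps to two nodes and a 128-task layout fits on one.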
Using one node (128 processors), it now dies with the following...
|
Thanks for starting to work on this, @ekluzek ! My understanding is that intel-oneapi is somewhat bleeding edge and that the default will probably be standard "intel" for now. So I think most of our testing should be on "intel"; I wasn't clear from Tuesday's CSEG meeting whether it's worth having testing for intel-oneapi in addition, but I think we should follow the lead of Jim & Chris on that. |
From today's ctsm-software discussion: tentative thought is to try to have the test list switched over by end of September. When we're ready to switch this over, we'll stop testing on cheyenne (not try to have aux_clm tests run on both cheyenne and derecho at once, because that's a pain). |
This article confirms an end-of-year timeline for Cheyenne to be shut down for good... |
Running the straight up intel compiler option on one node PASSes as we'd hope: SMS_D_Ld3_P128x1.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default |
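For readers less familiar with the test names used in this thread, the sketch below splits one into its dot-separated fields and pulls out the `_P<tasks>x<threads>` modifier. The helper `split_test_name` is hypothetical and written only for illustration; it is not how CIME itself parses test names.

```python
import re


def split_test_name(name):
    """Split a test name like the ones in this thread into its fields.

    Hypothetical helper for illustration only; not CIME's own parser.
    """
    parts = name.split(".")
    if len(parts) < 4:
        raise ValueError("expected at least 4 dot-separated fields: " + name)
    case, grid, compset, mach_comp = parts[:4]
    testmods = parts[4] if len(parts) > 4 else None

    # First field: test type followed by underscore-separated modifiers.
    test_type, *modifiers = case.split("_")

    # Fourth field: machine and compiler, split on the first underscore only,
    # so a compiler name such as "intel-oneapi" stays intact.
    machine, _, compiler = mach_comp.partition("_")

    # Optional PE-count modifier of the form P<tasks>x<threads>.
    pes = None
    for mod in modifiers:
        match = re.fullmatch(r"P(\d+)x(\d+)", mod)
        if match:
            pes = (int(match.group(1)), int(match.group(2)))

    return {
        "test_type": test_type,
        "modifiers": modifiers,
        "grid": grid,
        "compset": compset,
        "machine": machine,
        "compiler": compiler,
        "testmods": testmods,
        "pes": pes,
    }


if __name__ == "__main__":
    info = split_test_name(
        "SMS_D_Ld3_P128x1.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default"
    )
    print(info["machine"], info["compiler"], info["pes"])
```

For the passing test above, this yields machine `derecho`, compiler `intel`, and a PE request of 128 tasks with 1 thread each, i.e. one node if a node has 128 cores.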
From today's ctsm software meeting:
|
@ekluzek can you create a |
I've added the directory, and I've started asking individuals to move their stuff over. I've sent the request to most people; there are a few others to ask. Also, Jackie has a directory under there. What should happen to it? I don't know if Jackie still has access. |
@fischer-ncar points out that f10 tests are failing on Derecho because we don't have this set up in our config_pes file:
|
We are going to let these tests fail on Derecho for CESM until we can get to working on this. |
@fischer-ncar points out another test that failed. This probably goes to show that we'll need to change our threading tests to PE counts that will work well on Derecho.
|
@slevis-lmwg and @ekluzek: I've moved two of the items from this issue to standalone issues and edited the original post here to reflect that. Then, because there are a lot of posts here about testing, I left this as an omnibus issue for test-related work. Hope that's okay! |
I think @samsrabin mentioned this and I questioned it, but he is right: you can't use derecho out of the box on derecho. You need to update ccs_config to at least ccs_config_cesm0.0.72. We are about 5 tags before that one. |
I made the changes I saw for run_sys_tests on the CESM3_dev branch, but it wasn't working. With @billsacks's help I was able to diagnose it and get it working there. So that can readily be moved to main-dev. |
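A check like "at least ccs_config_cesm0.0.72" has to compare the numeric parts of the tag, not the raw strings. The sketch below shows one way to do that; `parse_tag` and `is_at_least` are hypothetical helpers, and the example tags other than ccs_config_cesm0.0.72 are made up for illustration (the tag actually in use is not stated in this thread).

```python
import re

# The minimum tag named in this thread.
MINIMUM_TAG = "ccs_config_cesm0.0.72"


def parse_tag(tag):
    """Turn 'ccs_config_cesm0.0.72' into the tuple (0, 0, 72)."""
    match = re.fullmatch(r"ccs_config_cesm(\d+(?:\.\d+)*)", tag)
    if match is None:
        raise ValueError("unrecognized ccs_config tag: " + tag)
    return tuple(int(piece) for piece in match.group(1).split("."))


def is_at_least(tag, minimum=MINIMUM_TAG):
    """Return True if tag is the same as or newer than minimum."""
    # Tuples compare element by element, so 72 > 9 as integers,
    # whereas the strings "0.0.72" and "0.0.9" would sort the other way.
    return parse_tag(tag) >= parse_tag(minimum)


if __name__ == "__main__":
    # Made-up tags, used only to exercise the comparison.
    for tag in ("ccs_config_cesm0.0.9", "ccs_config_cesm0.0.72", "ccs_config_cesm0.1.0"):
        print(tag, "ok" if is_at_least(tag) else "too old")
```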