Compiling with hpc-stack a WW3 wave model program produces empty grib2 files #137
@RobertoPadilla-NOAA can we give them a smaller test case where they don't have to run the whole workflow? Also updates from my fork that you have pointed them to have long since gone back into feature/coupled-crow. I'd prefer that people are not using that anymore. |
Please provide a single standalone script to reproduce the behavior. |
Which hpc-stack are you using? I just ran There was a bug in our wgrib2 build, but it has since been fixed in a newer version of hpc-stack. Here is my output:
|
@kgerheiser those are atm grib files, not wave grib files. I've tried with both 1.0.0 and 1.1.0, without success. |
I thought it might be an issue with the |
Jessica,
I don't have a smaller test case. Do you?
If not, I have to work on building one.
Roberto
|
The command used to create the grib file would be a good start |
No @RobertoPadilla-NOAA, but you ran multiple tests without the workflow, so I assumed you had something. I think we want the test to be super small and simple: just the binary output from a model run, the ww3_grib.inp file, and a simple script for building the ww3_grib exe they need to run, one way with the non-hpc-stack modules that work and one way with the hpc-stack modules. |
Ok @kgerheiser @aerorahul , I'll be back to you once I have the small test ready. |
@kgerheiser @aerorahul I was working on the canned test for you on Hera, but now the hpc-stack modules cannot be found. This is probably related to this morning's data loss (do you know if this is true?). |
If they weren't working before they seem to be working now. I just tried loading the modules on Hera. |
I had success loading them on one of the login nodes (hfe11), but compiling on the compute nodes failed. Maybe some compute nodes lost their |
@RobertoPadilla-NOAA you changed Jasper/2.0.15 to Jasper/1.900.1, or 1.900.1 to 2.0.15? That's something that should be investigated. |
@kgerheiser on Orion I changed jasper/2.0.15 to jasper/1.900.1 in order to build ww3_grib properly. |
On Hera I'm working on scratch1 and compiling on the login nodes hfe04 and hfe10, and loading hpc-stack fails. |
In what way does it fail? I'm on I run:
And it works. |
I don't know what is happening Ok, now I changed the path and loading (on the command line) Please check the spelling or version number. Also try "module spider ..." Also make sure that all modulefiles written in TCL start with the string #%Module Thanks, |
Please note that there was a filesystem problem last night, resulting in about 45TB of corrupted (that is, lost) data. |
[Roberto.Padilla@hfe10 Run_test]$ module use /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/modulefiles/stack Please check the spelling or version number. Also try "module spider ..." |
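Given the filesystem problem mentioned above, one quick way to tell a missing modulefiles directory apart from a wrong module name or version is to check the path before calling `module use`. This is only a minimal sketch, assuming a POSIX shell where the `module` command is already defined; the function name `use_stack_if_present` is hypothetical, and the path in the usage line is the one from the command above.

```shell
# Call "module use" only when the modulefiles directory is actually present.
# Returns 1 (and prints a message on stderr) when the directory is missing.
use_stack_if_present() {
    dir=$1
    if [ -d "$dir" ]; then
        module use "$dir"
    else
        echo "modulefiles directory not found: $dir" >&2
        return 1
    fi
}

# Usage:
# use_stack_if_present /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/modulefiles/stack
```

If the directory check passes and `module load` still fails, the problem is more likely a module name or version mismatch than lost files.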
@RobertoPadilla-NOAA
|
@climbfuji, yes, that was my question in my first comments today: whether the filesystem failure was affecting the loading of hpc-stack. |
@RobertoPadilla-NOAA The correct use of hpc-stack and the software stack underneath it is outlined here |
The new version of hpc-stack has also updated some libraries (bacio is now version 2.4.1, for example), which is why it's not finding the versions you specified. The updated libraries should have no effect on your code or results (they are mainly build system changes). |
@RobertoPadilla-NOAA do I need to help make the test case or a file using the new module versions of hpc-stack? |
I'm not sure about what you were originally using at That seems to be an old version of hpc-stack (which would also contribute to your wgrib2 problem). I would update to the most recent version of hpc-stack if you can, if only to get the updated wgrib2. |
@JessicaMeixner-NOAA if you can help make the file with the new module versions of hpc-stack, that would be great. Thanks. |
Try this:
|
@kgerheiser only g2 failed to load |
@RobertoPadilla-NOAA |
Looks like there's a new release of Jasper, 2.0.25, with the fix. I think we should update to that immediately, and we'll continue to look at phasing out Jasper. |
@JessicaMeixner-NOAA or @WalterKolczynski-NOAA would you try out my nightly build of hpc-stack (develop)? I just want to make sure that the fix works before we install it everywhere.
I have it built on Hera and Orion and it contains Jasper 2.0.25. |
@kgerheiser What about WCOSS Dell? |
I don't have a test build on there at the moment. I can do one if you like. I have a cron job set to build and test hpc-stack, but cron doesn't work on WCOSS Dell. |
On Hera:
|
The ESMF thing doesn't matter. That's a good catch. We recently fixed that in the code so it wasn't hardcoded, but wgrib2 was missed. I have fixed it in the existing build. |
I've never had a problem with cron on WCOSS Dell. Are you using the mycrontab file? |
No, how do I do that? |
In your home directory, there should be a cron directory with a file named mycrontab inside. Works just like editing a normal crontab, except it will automatically be turned on/off when production switches (and you don't have to play 'which login node did I put the cron job on?'). |
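As a rough illustration of the mycrontab approach, the sketch below appends an entry to a crontab-style file only if that exact line is not already there. It assumes the file takes standard five-field crontab lines, as described above; the function name `add_cron_entry` and the script name in the usage line are hypothetical.

```shell
# Append a cron entry to a crontab-style file unless the same line already exists.
add_cron_entry() {
    file=$1
    entry=$2
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -x matches whole lines, -F treats the entry as a fixed string (so "*" is literal).
    grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Hypothetical usage: a nightly build at 02:00 (build_hpc_stack.sh is a made-up name).
# add_cron_entry "$HOME/cron/mycrontab" '0 2 * * * $HOME/bin/build_hpc_stack.sh'
```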
@WalterKolczynski-NOAA do you have the testing done? I could run my quick test set-up for this case on orion if that would help. I'm just switching out the jasper or did you want me to use the whole hpc-stack from the nightly build? |
Just use the whole hpc-stack. Everything should work. |
On Hera, a bunch more wrong envvar libs:
|
I'm trying to get everything built and set up now |
Yep, just realized that would happen. Sorry about that. I fixed them. |
It looks like Orion has the same lib variable issues. |
@kgerheiser on my test on orion, I'm getting that the following two variables which I use when building the model: G2_LIB4=/work/noaa/stmp/gkyle/stmp/gkyle/hpc-stack/nightly-develop/install/intel-2018.4/g2/3.4.1/lib64/libg2_4.a don't actually exist. The modules I used: |
I believe I have fixed all the modules in both of the builds. I also put in a PR #163 to fix it. |
@JessicaMeixner-NOAA looks like the pio version has to be updated to 2.5.2 as well |
PIO 2.5.1 will also be there, but 2.5.2 is now the version we're moving to. Feel free to remain on 2.5.1 for now. |
It isn't available in the nightly build, which makes it difficult to test without changing. |
I don't need pio to test, but I do need the libraries to exist/link to, to be able to test ww3_grib. |
I needed it to build the model. |
I've successfully built on both Hera and Orion using the nightly build. |
@WalterKolczynski-NOAA what do you use for G2_LIB4 and W3NCO_LIB4 ? |
I didn't make any changes to the model except the jasper and pio versions. The build log says: G2_LIB4=/apps/contrib/NCEP/libs/hpc-stack/intel-2018.4/g2/3.4.1/lib/libg2_4.a |
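Since several of the library variables in this thread pointed at files that did not exist, a small check run after loading the modules can catch that before the build does. This is a sketch under the assumption that each variable holds the full path to a library file, as G2_LIB4 does in the log line above; the function name `check_lib_vars` is hypothetical.

```shell
# Report whether each named environment variable points at an existing file.
# Returns non-zero if any variable is unset or its target file is missing.
check_lib_vars() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "target=\${$name:-}"
        if [ -z "$target" ]; then
            echo "$name: unset"
            status=1
        elif [ -f "$target" ]; then
            echo "$name: ok ($target)"
        else
            echo "$name: missing ($target)"
            status=1
        fi
    done
    return $status
}

# Usage, after loading the hpc-stack modules:
# check_lib_vars G2_LIB4 W3NCO_LIB4
```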
I ran a test on orion and it worked for my test case @kgerheiser sorry it took a while |
Updates the jasper version to fix a bug that was causing constant-valued grib files for wave output. Refs: NOAA-EMC#161 NOAA-EMC#164 NOAA-EMC/hpc-stack#137
Describe the bug
Using the hpc-stack to compile the program that produces grib2 files from the wave component of the coupled system builds the executable successfully, but that executable produces empty grib2 files.
If the hpc-stack is not used and modules are loaded separately then the executable produces valid grib2 files.
To Reproduce
==Install and run the test as described at
https://github.com/NOAA-EMC/global-workflow/blob/feature/coupled-crow/README.md
== Except for the first steps "Checkout the source code and scripts"
== Use the following instructions
git clone https://github.com/Jessica-Meixner-NOAA/global-workflow coupled-workflow
cd coupled-workflow
git checkout feature/p5ww3post
git submodule update --init --recursive #Update submodules
==Follow the instructions.
cd sorc
sh checkout.sh coupled # Check out the coupled code, EMC_post, gsi, ...
sh build_ncep_post.sh #This command will build ncep_post
sh build_ww3prepost.sh #This command will build ww3 prep and post exes
sh build_fv3_coupled.sh #This command will build ufs-s2s-model
sh build_reg2grb2.sh #This command will build exes for ocean-ice post
=To link fixed files and executable programs for the coupled application:
=On Hera:
sh link_fv3gfs.sh emc hera coupled
=On Orion:
sh link_fv3gfs.sh emc orion coupled
cd ../workflow
cp user.yaml.default user.yaml
=Then, open and edit user.yaml:
=EXPROOT: Place for experiment directory, make sure you have write access.
=FIX_SCRUB: True if you would like to fix the path to ROTDIR (under COMROOT) and RUNDIR (under DATAROOT); False if you would like CROW to detect available disk space automatically. *** Please use FIX_SCRUB: True on Hera/Orion until further notice (2020/03)
=COMROOT: Place to generate ROTDIR for this experiment.
=DATAROOT: Place for temporary storage for each job of this experiment.
=cpu_project: cpu project that you are working with.
=hpss_project: hpss project that you are working with.
==IMPORTANT, next step is different from the git page
=In HERA
./setup_case.sh -p HERA ../cases/coupled_free_forecast_wave.yaml test2d
=In ORION
./setup_case.sh -p ORION ../cases/coupled_free_forecast_wave.yaml test2d
=This will create an experiment directory ($EXPERIMENT_DIRECTORY). In the current example, $EXPERIMENT_DIRECTORY=$EXPROOT/test2d.
=For ORION: First make sure you have python loaded:
module load contrib
module load rocoto #Make sure to use 1.3.2
module load intelpython3
./make_rocoto_xml_for.sh $EXPERIMENT_DIRECTORY
=Run the model using the workflow
cd $EXPERIMENT_DIRECTORY
module load rocoto
=Run rocotorun several times until all processes are done
rocotorun -w workflow.xml -d workflow.db
=Check the status of your test
rocotostat -w workflow.xml -d workflow.db
=You'll find the grib2 files in the directory you created:
cd $COMROOT/test2d/gfs.20130401/00/wave/gridded
wgrib2 -V gfswave.t00z.global.0p50.f000.grib2
=You'll see that all wave variables (min, average, max) have the same value.
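The constant-value symptom can also be checked automatically rather than by eye. The sketch below flags records whose minimum and maximum are identical. It assumes input lines resembling the output of `wgrib2 -stats`, that is, colon-separated fields that include `min=` and `max=`; the function name `flag_constant_records` is hypothetical.

```shell
# Flag GRIB2 records whose min and max are identical (a constant-valued field).
# Reads lines on stdin shaped like:
#   1:0:ndata=100:undef=0:mean=2.5:min=2.5:max=2.5
flag_constant_records() {
    awk -F: '
    {
        min = ""; max = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^min=/) min = substr($i, 5)
            if ($i ~ /^max=/) max = substr($i, 5)
        }
        # Compare the printed values as strings; equal text means a flat field.
        if (min != "" && min == max) print "constant record " $1 ": value " min
    }'
}

# Usage (assumes wgrib2 is on PATH and the file name from the steps above):
# wgrib2 -stats gfswave.t00z.global.0p50.f000.grib2 | flag_constant_records
```

A field that is legitimately constant would also be flagged, so the output is a hint for where to look, not proof of the bug.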
Expected behavior
Produce valid grib2 files from the wave model using the modules from the hpc-stack.
System:
Hera and Orion