Add scripts to run containerized model outside of FRE #136

Open
wants to merge 1 commit into
base: main

uwagura
Collaborator

@uwagura uwagura commented Jan 24, 2025

This PR creates a directory with the scripts needed to compile and run the CEFI_NWA12_COBALT_V1 experiment outside of FRE and in a container. Right now, the workflow is very gaea-specific, but it should be easy to adapt to other systems.

@uwagura uwagura requested a review from yichengt900 January 24, 2025 14:42
&& echo ' install_tree: /opt/software' \
&& echo ' view: /opt/views/view') > /opt/spack-environment/spack.yaml
# Install the software, remove unnecessary deps
RUN . /opt/spack/share/spack/setup-env.sh && cd /opt/spack-environment && spack compiler add
Contributor

Duplicated line

git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/ice_param.git sis2 && \
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/land_null.git land_null && \
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/atmos_null.git atmos_null && \
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/FMScoupler.git coupler
Contributor

Since the CEFI-regional-MOM6 main repo contains all the sub-components needed to build the mom6-sis2-cobalt model, we do not need to clone the sub-components individually. Consider removing them to ensure users build the model using the recommended tag for each sub-component.
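A minimal sketch of that suggestion, not a tested replacement for the Dockerfile step. It assumes the main repository is hosted at the URL shown and that the build expects it in a directory named `mom6` (inferred from the `$src_dir/mom6/src/...` paths used later in the build). The function name `clone_cefi_sources` is hypothetical.

```shell
# Hypothetical helper: clone only the main repository. --recursive checks out
# every sub-component at the commit pinned by the main repo, so the separate
# per-component clones are no longer needed.
clone_cefi_sources() {
    local dest="${1:-mom6}"   # assumed destination directory name
    git clone --recursive --jobs=4 \
        https://github.com/NOAA-CEFI-Regional-Ocean-Modeling/CEFI-regional-MOM6.git "$dest"
}
```

Calling `clone_cefi_sources mom6` from the source directory would then stand in for the individual `git clone` lines.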

&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \
&& mkdir -p $bld_dir/FMS \
&& list_paths -l -o $bld_dir/FMS/pathnames_FMS $src_dir/FMS \
Contributor

Switch to $src_dir/mom6/src/FMS to ensure the build process uses the recommended FMS tag from CEFI-regional-MOM6.

&& mkdir -p $bld_dir/FMS \
&& list_paths -l -o $bld_dir/FMS/pathnames_FMS $src_dir/FMS \
&& cd $bld_dir/FMS \
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libFMS.a -t $mkmf_template -c " -DINTERNAL_FILE_NML -g -Duse_libMPI -Duse_netCDF -Duse_yaml -DMAXFIELDMETHODS_=600" -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include $bld_dir/FMS/pathnames_FMS
Contributor

Again, please consider changing the FMS-related include folder to -Imom6/src/FMS/fms2_io/include -Imom6/src/FMS/include -Imom6/src/FMS/mpp/include. You will probably also need to make similar changes for the other components listed below.
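One way to keep the FMS location consistent across components is to define it once and derive the include flags from it. This is only a sketch: the variable names `fms_rel` and `fms_includes` are hypothetical, and the paths are the ones suggested in the comments above.

```shell
# FMS location inside the CEFI-regional-MOM6 checkout, relative to $src_dir
# (as suggested in the review comments above).
fms_rel=mom6/src/FMS

# Include flags shared by every mkmf call that needs the FMS headers.
fms_includes="-I$fms_rel/fms2_io/include -I$fms_rel/include -I$fms_rel/mpp/include"
echo "$fms_includes"

# The FMS path list would then be generated from the same location, e.g.:
#   list_paths -l -o $bld_dir/FMS/pathnames_FMS $src_dir/$fms_rel
```

Each `mkmf` line could then pass `$fms_includes` instead of repeating the three `-I` flags.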

&& mkdir -p $bld_dir/mom6 \
&& list_paths -l -o $bld_dir/mom6/pathnames_mom6 $src_dir/mom6/src/MOM6/config_src/memory/dynamic_symmetric $src_dir/mom6/src/MOM6/config_src/drivers/FMS_cap $src_dir/mom6/src/MOM6/src/*/ $src_dir/mom6/src/MOM6/src/*/*/ $src_dir/mom6/src/MOM6/config_src/external/ODA_hooks $src_dir/mom6/src/MOM6/config_src/external/stochastic_physics $src_dir/mom6/src/MOM6/config_src/external/drifters $src_dir/mom6/src/MOM6/config_src/external/database_comms $src_dir/mom6/src/ocean_BGC/generic_tracers $src_dir/mom6/src/ocean_BGC/mocsy/src $src_dir/mom6/src/MOM6/pkg/GSW-Fortran/modules $src_dir/mom6/src/MOM6/pkg/GSW-Fortran/toolbox $src_dir/mom6/src/MOM6/config_src/infra/FMS2 \
&& cd $bld_dir/mom6 \
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libmom6.a -t $mkmf_template -c "-DINTERNAL_FILE_NML -g -DINTERNAL_FILE_NML -DMAX_FIELDS_=100 -DUSE_FMS2_IO -DNOT_SET_AFFINITY -D_USE_MOM6_DIAG -D_USE_GENERIC_TRACER -DUSE_PRECISION=2 " -o "-I$bld_dir/FMS " -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include -Imom6/src/MOM6/pkg/CVMix-src/include $bld_dir/mom6/pathnames_mom6
Contributor

Consider increasing -DMAX_FIELDS_=100 to -DMAX_FIELDS_=600, as the BGC model may need to register a large number of fields.

@@ -0,0 +1,5 @@
#!/bin/bash
podman build -f Dockerfile -t mom6_sis2_generic_4p_compile_symm_yaml:prod
rm -f mom6_sis2_generic_4p_compile_symm_yaml.tar mom6_sis2_generic_4p_compile_symm_yaml.sif
Contributor

Do we really need this line?

Collaborator Author

This was copied over from the FRE script to make sure the sif file doesn't exist before it is written in the next step. I kept it as a precaution, but it can be removed.

#!/bin/bash
podman build -f Dockerfile -t mom6_sis2_generic_4p_compile_symm_yaml:prod
rm -f mom6_sis2_generic_4p_compile_symm_yaml.tar mom6_sis2_generic_4p_compile_symm_yaml.sif
podman save -o mom6_sis2_generic_4p_compile_symm_yaml-prod.tar localhost/mom6_sis2_generic_4p_compile_symm_yaml:prod
Contributor

I really like Podman; it's a great alternative to Docker, especially for Linux and HPC users. My concern is that regular users may simply want a Dockerfile or Singularity definition file to easily build an image for running the model. I don't have any issues with the run script, but perhaps we could also add a Singularity definition file, so users can just run apptainer build or singularity build to get the image they need. I'd be happy to provide a Singularity definition file that does almost the same thing as the Dockerfile you've provided. What are your thoughts on this?

Collaborator Author

Sounds good. I can work on creating a Singularity definition file as well to provide this capability.
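For reference, a user-facing build step along the lines discussed above might look like the sketch below. The definition file does not exist yet, so its name is a placeholder, and the helper name `build_sif` is hypothetical. The only assumption about the tools is the standard `build <image> <definition>` argument order.

```shell
# Hypothetical helper: build a SIF image from a definition file with whichever
# of apptainer or singularity is available on PATH.
build_sif() {
    local def_file="$1" sif_file="$2" builder
    if [ ! -f "$def_file" ]; then
        echo "definition file not found: $def_file" >&2
        return 1
    fi
    if command -v apptainer >/dev/null 2>&1; then
        builder=apptainer
    elif command -v singularity >/dev/null 2>&1; then
        builder=singularity
    else
        echo "neither apptainer nor singularity found on PATH" >&2
        return 1
    fi
    "$builder" build "$sif_file" "$def_file"
}
```

A call such as `build_sif model.def model.sif` (both file names are placeholders) would then give users the image without needing Podman.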

@@ -0,0 +1,75 @@
"ATM", "p_surf", "msl", "INPUT/ERA5_msl_1995_padded.nc", "bilinear", 1.0
Contributor

I would recommend using a template for the data_table (e.g., moving it to a template folder) and then overwriting the specific year, rather than always relying on the year-1 approach used in the current run script. This would make the code more flexible and reduce potential errors when dealing with different years.
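The template approach could be sketched as follows. This assumes the template marks the year with a placeholder such as `@YEAR@`. The placeholder, the template file name, and the helper `render_data_table` are all hypothetical and not part of the PR.

```shell
# Hypothetical helper: write a data_table for one year from a template in
# which every occurrence of @YEAR@ stands for the forcing year.
render_data_table() {
    local template="$1" year="$2" output="$3"
    sed "s/@YEAR@/${year}/g" "$template" > "$output"
}
```

With a template line such as `"ATM", "p_surf", "msl", "INPUT/ERA5_msl_@YEAR@_padded.nc", "bilinear", 1.0`, running `render_data_table data_table.template 1996 data_table` would write the 1996 file name, so the run script never has to edit the previous year's table in place.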

#
#export APPTAINER_CONTAINLIBS="/usr/lib64/libjansson.so.4,/usr/lib64/libjson-c.so.3,/usr/lib64/libdrm.so.2,/lib64/libtinfo.so.6,/usr/lib64/libnl-3.so.200,/usr/lib64/librdmacm.so.1,/usr/lib64/libibverbs.so.1,/usr/lib64/libibverbs/libmlx5-rdmav34.so,/usr/lib64/libnuma.so.1,/usr/lib64/libnl-cli-3.so.200,/usr/lib64/libnl-genl-3.so.200,/usr/lib64/libnl-nf-3.so.200,/usr/lib64/libnl-route-3.so.200,/usr/lib64/libnl-idiag-3.so.200,/usr/lib64/libnl-xfrm-3.so.200"
#
#export APPTAINER_BIND="/usr/share/libdrm,/var/spool/slurmd,/opt/cray,/opt/intel,/etc/libibverbs.d,/usr/lib64/libibverbs,/usr/lib64/libnl3-200"
Contributor

Remove the commented-out lines below since they are now in a separate file.


1.) You will have to stage all the necessary input files to the `INPUT/` directory yourself, using the same naming scheme as `CEFI_NWA12_cobalt.xml`. All input files are available on gaea, and the `run_model.sh` script will stage annual `ERA5` and `GloFAS` runoff forcings for you if you provide a path to a directory where these files are located. You will have to move the other files manually. If you are on gaea, or on a system with access to gaea, you can stage the necessary inputs with the following commands:
```
cp /gpfs/f5/cefi/scratch/Utheri.Wagura/DockerfileTest/INPUT ./INPUT
```
Contributor

I also made a copy of those inputs under the world-shared folders: /gpfs/f5/icefi/world-shared/datasets/container_input/NWA and /gpfs/f6/ira-cefi/world-shared/datasets/container_input/NWA.

Contributor

@yichengt900 yichengt900 left a comment

@uwagura, sorry for the late response. I did a quick first-round review. I like the idea of having a containerized model and a simple bash script for external users to conduct cycle retrospective runs over multiple years. I’ve left some comments on the container build-related scripts, and I think we can start with those before moving on to the others.

2 participants