Add scripts to run containerized model outside of FRE #136
base: main
&& echo ' install_tree: /opt/software' \
&& echo ' view: /opt/views/view') > /opt/spack-environment/spack.yaml
# Install the software, remove unnecessary deps
RUN . /opt/spack/share/spack/setup-env.sh && cd /opt/spack-environment && spack compiler add
Duplicated line
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/ice_param.git sis2 && \
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/land_null.git land_null && \
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/atmos_null.git atmos_null && \
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/FMScoupler.git coupler
Since the `CEFI-regional-MOM6` main repo contains all the sub-components needed to build the `mom6-sis2-cobalt` model, we do not need to clone the sub-components individually. Consider removing these clones to ensure users build the model using the recommended tag for each sub-component.
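As a rough sketch of that suggestion, a single recursive clone could stand in for the per-component clones. The repository URL and the `mom6` target directory are assumptions (the directory name is inferred from the `$src_dir/mom6/src/...` paths in the build commands), and the script only prints the command so it can be checked before use.

```shell
#!/bin/bash
# Sketch: one recursive clone instead of one clone per sub-component.
# ASSUMPTION: the repository URL and target directory are illustrative.
repo_url="https://github.com/NOAA-CEFI-Regional-Ocean-Modeling/CEFI-regional-MOM6.git"
target_dir="mom6"

# --recursive checks out each submodule at the commit pinned by the main repo
clone_cmd=(git clone --recursive --jobs=4 "$repo_url" "$target_dir")

# Print the command for inspection; run "${clone_cmd[@]}" to execute it.
echo "${clone_cmd[*]}"
```

Because the submodules are pinned by the parent repository, this route would pick up the recommended revision of each component without extra bookkeeping in the Dockerfile.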
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \
&& mkdir -p $bld_dir/FMS \
&& list_paths -l -o $bld_dir/FMS/pathnames_FMS $src_dir/FMS \
Switch to `$src_dir/mom6/src/FMS` to ensure the build process uses the recommended FMS tag from CEFI-regional-MOM6.
&& mkdir -p $bld_dir/FMS \
&& list_paths -l -o $bld_dir/FMS/pathnames_FMS $src_dir/FMS \
&& cd $bld_dir/FMS \
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libFMS.a -t $mkmf_template -c " -DINTERNAL_FILE_NML -g -Duse_libMPI -Duse_netCDF -Duse_yaml -DMAXFIELDMETHODS_=600" -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include $bld_dir/FMS/pathnames_FMS
Again, please consider changing the FMS-related include folders to `-Imom6/src/FMS/fms2_io/include -Imom6/src/FMS/include -Imom6/src/FMS/mpp/include`. You will probably also need to make similar changes for the other components listed below.
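One way to keep these paths consistent across components is to derive every FMS include flag from a single prefix. This is only a sketch: `fms_rel` and `fms_include_flags` are hypothetical names, not part of the PR.

```shell
#!/bin/bash
# Sketch: build the FMS include flags from one prefix, so switching between
# FMS checkouts means editing a single variable.
# ASSUMPTION: fms_rel and fms_include_flags are hypothetical names.
fms_rel="mom6/src/FMS"   # path relative to $src_dir, as suggested in the review

fms_include_flags() {
    local prefix="$1" sub flags=""
    for sub in fms2_io/include include mpp/include; do
        flags+="-I${prefix}/${sub} "
    done
    # Trim the trailing space
    echo "${flags% }"
}

fms_include_flags "$fms_rel"
```

The output could then be spliced into each `mkmf` call in place of the hard-coded `-IFMS/...` flags.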
&& mkdir -p $bld_dir/mom6 \
&& list_paths -l -o $bld_dir/mom6/pathnames_mom6 $src_dir/mom6/src/MOM6/config_src/memory/dynamic_symmetric $src_dir/mom6/src/MOM6/config_src/drivers/FMS_cap $src_dir/mom6/src/MOM6/src/*/ $src_dir/mom6/src/MOM6/src/*/*/ $src_dir/mom6/src/MOM6/config_src/external/ODA_hooks $src_dir/mom6/src/MOM6/config_src/external/stochastic_physics $src_dir/mom6/src/MOM6/config_src/external/drifters $src_dir/mom6/src/MOM6/config_src/external/database_comms $src_dir/mom6/src/ocean_BGC/generic_tracers $src_dir/mom6/src/ocean_BGC/mocsy/src $src_dir/mom6/src/MOM6/pkg/GSW-Fortran/modules $src_dir/mom6/src/MOM6/pkg/GSW-Fortran/toolbox $src_dir/mom6/src/MOM6/config_src/infra/FMS2 \
&& cd $bld_dir/mom6 \
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libmom6.a -t $mkmf_template -c "-DINTERNAL_FILE_NML -g -DINTERNAL_FILE_NML -DMAX_FIELDS_=100 -DUSE_FMS2_IO -DNOT_SET_AFFINITY -D_USE_MOM6_DIAG -D_USE_GENERIC_TRACER -DUSE_PRECISION=2 " -o "-I$bld_dir/FMS " -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include -Imom6/src/MOM6/pkg/CVMix-src/include $bld_dir/mom6/pathnames_mom6
Consider increasing `-DMAX_FIELDS_=100` to `-DMAX_FIELDS_=600`, as the BGC model may need to register a large number of fields.
@@ -0,0 +1,5 @@
#!/bin/bash
podman build -f Dockerfile -t mom6_sis2_generic_4p_compile_symm_yaml:prod
rm -f mom6_sis2_generic_4p_compile_symm_yaml.tar mom6_sis2_generic_4p_compile_symm_yaml.sif
Do we really need this line?
This was copied over from the `fre` script to make sure the sif file doesn't exist before writing it in the next step. I kept it as a precaution, but it can be removed.
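If the cleanup step is kept, a minimal sketch like the one below derives the archive and image names from one variable, so the `rm -f` and the later write always refer to the same files. In the quoted lines the `rm -f` targets `mom6_sis2_generic_4p_compile_symm_yaml.tar`, whereas `podman save` writes `mom6_sis2_generic_4p_compile_symm_yaml-prod.tar`, which may or may not be intended. The variable names and the scratch directory here are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: derive the archive and image names from one variable so that the
# cleanup step and the later save/build steps refer to the same files.
# ASSUMPTION: variable names and the scratch directory are illustrative.
image_name="mom6_sis2_generic_4p_compile_symm_yaml"
image_tag="prod"
work_dir="$(mktemp -d)"   # stand-in for the real build directory

tar_file="${work_dir}/${image_name}-${image_tag}.tar"
sif_file="${work_dir}/${image_name}-${image_tag}.sif"

touch "$tar_file"          # simulate a stale archive from an earlier build

# rm -f succeeds whether or not the files exist
rm -f "$tar_file" "$sif_file"

[ ! -e "$tar_file" ] && echo "stale artifacts removed"
```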
#!/bin/bash
podman build -f Dockerfile -t mom6_sis2_generic_4p_compile_symm_yaml:prod
rm -f mom6_sis2_generic_4p_compile_symm_yaml.tar mom6_sis2_generic_4p_compile_symm_yaml.sif
podman save -o mom6_sis2_generic_4p_compile_symm_yaml-prod.tar localhost/mom6_sis2_generic_4p_compile_symm_yaml:prod
I really like Podman; it's a great alternative to Docker, especially for Linux and HPC users. My concern is that regular users may simply want a Dockerfile or Singularity definition file to easily build an image for running the model. I don't have any issues with the run script, but perhaps we could consider adding a Singularity definition file as well, so users can just run `apptainer build` or `singularity build` to get the image they need. I'd be happy to provide a Singularity definition script that can do almost the same thing as the Dockerfile you've provided. What are your thoughts on this?
Sounds good. I can work on creating a Singularity definition file as well to provide this capability.
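For comparison, the two build routes under discussion might look roughly like this. The commands are printed rather than executed, the file names are illustrative, and the `.def` file is hypothetical since no definition file exists in the PR yet.

```shell
#!/bin/bash
# Sketch: the two build routes discussed above, printed rather than executed.
# ASSUMPTION: file names are illustrative; the .def file is hypothetical.
image="mom6_sis2_generic_4p_compile_symm_yaml-prod"

# Route 1: convert the archive written by `podman save` into a SIF image
from_archive=(apptainer build "${image}.sif" "docker-archive://${image}.tar")

# Route 2: build directly from a definition file
from_def=(apptainer build "${image}.sif" "${image}.def")

printf '%s\n' "${from_archive[*]}" "${from_def[*]}"
```

Route 2 would let users skip Podman entirely, which seems to be the motivation for the definition file.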
@@ -0,0 +1,75 @@
"ATM", "p_surf", "msl", "INPUT/ERA5_msl_1995_padded.nc", "bilinear", 1.0
I would recommend using a template for the `data_table` (e.g., moving it to a template folder) and then overwriting the specific year, rather than always relying on the `year-1` approach used in the current run script. This would make the code more flexible and reduce potential errors when dealing with different years.
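A minimal sketch of the template approach, assuming a hypothetical `@YEAR@` placeholder and a hypothetical `render_data_table` helper. Only the first `data_table` entry from the diff is reproduced, and the template is written to a scratch directory for the demo.

```shell
#!/bin/bash
# Sketch: render data_table from a template instead of editing years in place.
# ASSUMPTION: the @YEAR@ placeholder, the template path, and render_data_table
# are hypothetical; only the first entry of the real table is reproduced.
tmpl_dir="$(mktemp -d)"
cat > "${tmpl_dir}/data_table.tmpl" <<'EOF'
"ATM", "p_surf", "msl", "INPUT/ERA5_msl_@YEAR@_padded.nc", "bilinear", 1.0
EOF

render_data_table() {
    local year="$1" template="$2"
    # Replace every placeholder with the requested year
    sed "s/@YEAR@/${year}/g" "$template"
}

render_data_table 1995 "${tmpl_dir}/data_table.tmpl"
```

In a run script the output would be redirected to `data_table` once per simulated year, so each year is rendered from the untouched template rather than from the previous year's file.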
#
#export APPTAINER_CONTAINLIBS="/usr/lib64/libjansson.so.4,/usr/lib64/libjson-c.so.3,/usr/lib64/libdrm.so.2,/lib64/libtinfo.so.6,/usr/lib64/libnl-3.so.200,/usr/lib64/librdmacm.so.1,/usr/lib64/libibverbs.so.1,/usr/lib64/libibverbs/libmlx5-rdmav34.so,/usr/lib64/libnuma.so.1,/usr/lib64/libnl-cli-3.so.200,/usr/lib64/libnl-genl-3.so.200,/usr/lib64/libnl-nf-3.so.200,/usr/lib64/libnl-route-3.so.200,/usr/lib64/libnl-idiag-3.so.200,/usr/lib64/libnl-xfrm-3.so.200"
#
#export APPTAINER_BIND="/usr/share/libdrm,/var/spool/slurmd,/opt/cray,/opt/intel,/etc/libibverbs.d,/usr/lib64/libibverbs,/usr/lib64/libnl3-200"
Remove the commented-out lines below since they are now in a separate file.
1.) You will have to stage all the necessary input files to the `INPUT/` directory yourself, using the same naming scheme as `CEFI_NWA12_cobalt.xml`. All input files are available on gaea, and the `run_model.sh` script will stage annual `ERA5` and `GloFAS` runoff forcings for you if you provide a path to a directory where these files are located. You will have to manually move the other files yourself. If on gaea, or a system with access to gaea, you can stage the necessary inputs with the following commands:
``` | ||
cp /gpfs/f5/cefi/scratch/Utheri.Wagura/DockerfileTest/INPUT ./INPUT
I also made a copy of those inputs under the world-shared folders `/gpfs/f5/icefi/world-shared/datasets/container_input/NWA` and `/gpfs/f6/ira-cefi/world-shared/datasets/container_input/NWA`.
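Staging from a shared folder could be wrapped in a small helper along these lines. `stage_inputs` is a hypothetical name, and the demo uses scratch directories with a placeholder file rather than the real gaea paths. Note that `cp` needs `-r` to copy a directory.

```shell
#!/bin/bash
# Sketch: stage a directory of inputs into an INPUT directory.
# ASSUMPTION: stage_inputs is a hypothetical helper; the demo uses scratch
# directories and a placeholder file rather than the real gaea paths.
stage_inputs() {
    local src="$1" dest="$2"
    [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
    mkdir -p "$dest"
    # The trailing /. copies the directory contents, including dotfiles
    cp -r "$src"/. "$dest"/
}

demo_src="$(mktemp -d)"
demo_run="$(mktemp -d)"
touch "${demo_src}/ocean_hgrid.nc"   # placeholder file name
stage_inputs "$demo_src" "${demo_run}/INPUT" && ls "${demo_run}/INPUT"
```

On gaea the first argument would be one of the shared folders mentioned above and the second would be `./INPUT`.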
@uwagura, sorry for the late response. I did a quick first-round review. I like the idea of having a containerized model and a simple bash script for external users to conduct cycle retrospective runs over multiple years. I’ve left some comments on the container build-related scripts, and I think we can start with those before moving on to the others.
This PR creates a directory with the needed scripts to compile and run the `CEFI_NWA12_COBALT_V1` experiment outside of `FRE` and in a container. Right now, the workflow is very `gaea` specific, but it should easily be adaptable for other systems.