fsurdat_modifier creates a malformatted file on izumi #1658

Closed
billsacks opened this issue Feb 23, 2022 · 15 comments · Fixed by #2416
Labels
bug something is working incorrectly

Comments

@billsacks
Member

Brief summary of bug

When trying to run fsurdat_modifier on izumi with the standard python module loaded, the resulting file cannot be read. ncdump -h gives: ncdump: fsurdat.nc: fsurdat.nc: NetCDF: Unknown file format.

General bug information

CTSM version you are using: Branch a little past master (but I haven't changed the python code in this branch)

Does this bug cause significantly incorrect results in the model's science? No

Configurations affected: Just fsurdat_modifier on izumi

Details of bug

I was hoping we could run the FSURDATMODIFIER test on izumi. So I first tried running FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.izumi_intel. This failed.

Then I loaded a module environment on izumi that is close to what is loaded by default for running CTSM / CESM on izumi:

  1) lang/python/3.7.0   2) mpi/2.3.3/intel/20.0.1   3) tool/hdf5/1.12.0/intel/20.0.1   4) tool/netcdf/4.7.4/intel/20.0.1   5) compiler/intel/20.0.1

I then tried running the tool manually: from the case directory of this test (so the .cfg file existed there), I first removed the previously-generated fsurdat.nc and then ran:

python ~/ctsm_code/ctsm/tools/modify_fsurdat/fsurdat_modifier -v modify_fsurdat.cfg

This had the same result – even though I got a message saying that the surface dataset was successfully generated.
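A quick way to see whether a file such as fsurdat.nc starts with a recognizable header, independent of ncdump, is to look at its first bytes. The following is a minimal sketch and not part of fsurdat_modifier: the function name `sniff_netcdf_format` and the returned labels are hypothetical. It relies only on the documented signatures (classic NetCDF files begin with `CDF` plus a version byte; NetCDF-4 files are HDF5 files and normally begin with the HDF5 signature at offset 0). A file can pass this check and still be truncated further in.

```python
# Leading byte signatures of the HDF5 and classic NetCDF file formats.
_HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
_CLASSIC_VERSIONS = {
    1: "netcdf-classic",
    2: "netcdf-64bit-offset",
    5: "netcdf-64bit-data",
}


def sniff_netcdf_format(path):
    """Return a short label based on the file's first bytes, or None if unrecognized."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(_HDF5_SIGNATURE):
        # Any HDF5 file matches here; NetCDF-4 is stored as HDF5.
        return "hdf5 (possibly netcdf-4)"
    if len(head) >= 4 and head[:3] == b"CDF":
        # The fourth byte is the classic-format version number.
        return _CLASSIC_VERSIONS.get(head[3])
    return None
```

A result of `None` for a freshly written surface dataset would be consistent with the "Unknown file format" message from ncdump.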

This is possibly (likely?) a system configuration issue – e.g., maybe the versions of xarray and other packages need to be updated on izumi??? It isn't critical that this is resolved; I'm okay with it being closed as a wontfix if others feel we don't care about supporting this tool on izumi.

@slevisconsulting - thoughts? Or @negin513 would you want to try this with your python environment on izumi?

@billsacks billsacks added the bug something is working incorrectly label Feb 23, 2022
@slevis-lmwg
Contributor

I would not close as a wontfix for now. Let's first see how the cssi project makes use of the fsurdat_modifier tool. This may give us insight into whether we need it to work on izumi or not.

@slevis-lmwg
Contributor

@adrifoster recommended checking the xarray versions on cheyenne (where fsurdat_modifier works) vs. izumi (where fsurdat_modifier generates a malformatted .nc file). I see differences in libhdf5, libnetcdf, xarray, h5py that seem related. Can others tell whether one or more of these will explain the problem? @adrifoster @negin513 @billsacks @ekluzek @jedwards4b

Cheyenne

python: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 4.12.14-95.51-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.20.2
pandas: 1.3.5
numpy: 1.21.6
scipy: 1.7.3
netCDF4: 1.5.8
pydap: installed
h5netcdf: None
h5py: 3.6.0

Izumi

python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-1062.12.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.ISO-8859-1
LOCALE: en_US.ISO8859-1

xarray: 0.10.9
pandas: 0.23.4
numpy: 1.15.1
scipy: 1.1.0
netCDF4: None
h5netcdf: None
h5py: 2.8.0
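The two listings above can also be compared mechanically instead of by eye. This is a minimal sketch, assuming listings in the `name: version` form shown above; the helper names `parse_versions` and `diff_versions` are hypothetical, and only three of the entries are copied in to keep the example short.

```python
def parse_versions(listing):
    """Parse 'name: version' lines into a dict, skipping blank or malformed lines."""
    versions = {}
    for line in listing.splitlines():
        # Split at the first colon only, so values containing colons survive.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            versions[name.strip()] = value.strip()
    return versions


def diff_versions(left, right):
    """Return {name: (left_version, right_version)} for entries that differ."""
    names = sorted(set(left) | set(right))
    return {
        name: (left.get(name), right.get(name))
        for name in names
        if left.get(name) != right.get(name)
    }


# Values copied from the Cheyenne and Izumi listings in this comment.
cheyenne = parse_versions("""
xarray: 0.20.2
netCDF4: 1.5.8
h5py: 3.6.0
""")
izumi = parse_versions("""
xarray: 0.10.9
netCDF4: None
h5py: 2.8.0
""")

for name, (on_cheyenne, on_izumi) in diff_versions(cheyenne, izumi).items():
    print(f"{name}: cheyenne={on_cheyenne} izumi={on_izumi}")
```

Entries missing from one listing (for example libhdf5 and libnetcdf on Izumi) would show up with `None` on that side.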

If we're still stuck, @adrifoster recommended posting this issue on the xarray github page.

One more thing that may or may not be related. I tried ./manage_python_env on izumi and got this error (while it seems to work on cheyenne):

Use conda to install the python environment needed to run the CTSM python tools in the conda environment: ctsm_py
Create ctsm_py

CustomValidationError: Parameter channel_priority = 'strict' declared in <<merged>> is invalid.
The value 'strict' cannot be boolified.

Install ctsm_py this can take a long time, be patient....

CustomValidationError: Parameter channel_priority = 'strict' declared in <<merged>> is invalid.
The value 'strict' cannot be boolified.

Trouble installing the ctsm_py python environment

@slevis-lmwg
Contributor

Confirmed (in case I was remembering wrong in yesterday's conversation) that the fsurdat file generated on izumi is also unreadable on cheyenne.

@jedwards4b
Contributor

Maybe a request to system support to try reinstalling or installing an update on izumi?

@slevis-lmwg
Contributor

@negin513 uses a more up-to-date environment on izumi and showed me that she gets an error that I'm not seeing when I run fsurdat_modifier with my environment. I will try updating my environment and pursue things from there.

@negin513
Contributor

negin513 commented Jun 30, 2022

I talked with @slevisconsulting today about this.
I have seen this malformed-file issue before; it happens when writing the netCDF file does not complete successfully.

I suggested that @slevisconsulting compare the file size against the successful version from Cheyenne. On Izumi, the netCDF file was smaller, which confirmed that the write stage did not complete successfully.
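The size comparison described above can be scripted. This is only a heuristic sketch: the function names and the `min_ratio` threshold are hypothetical, and two valid files can legitimately differ in size (for example with different compression settings or library versions), so a small ratio is a hint rather than proof of a failed write.

```python
import os


def size_ratio(candidate_path, reference_path):
    """Return the candidate file size divided by the reference file size."""
    reference_size = os.path.getsize(reference_path)
    if reference_size == 0:
        raise ValueError("reference file is empty")
    return os.path.getsize(candidate_path) / reference_size


def looks_truncated(candidate_path, reference_path, min_ratio=0.99):
    """Flag the candidate if it is noticeably smaller than the reference."""
    return size_ratio(candidate_path, reference_path) < min_ratio
```

Here the candidate would be the file written on Izumi and the reference the known-good file from Cheyenne.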

@negin513
Contributor

I tested the code with an updated version of xarray on Izumi (xarray 0.20.2) and it complained about mismatched indexes along the lsmlat dimension. Here is the error:

INFO: Opening fsurdat_in file to be modified: /fs/cgd/data0/slevis/git/mksurfdata_toolchain/tools/modify_fsurdat/fill_indianocean/surfdata_0.9x1.25_hist_16pfts_Irrig_CMIP6_simyr2000_c190214.nc
Traceback (most recent call last):
  File "/home/negins/ctsm_forsam/tools/modify_fsurdat/./fsurdat_modifier", line 68, in <module>
    main()
  File "/home/negins/ctsm_forsam/tools/modify_fsurdat/../../python/ctsm/modify_fsurdat/fsurdat_modifier.py", line 34, in main
    fsurdat_modifier(args.cfg_path)
  File "/home/negins/ctsm_forsam/tools/modify_fsurdat/../../python/ctsm/modify_fsurdat/fsurdat_modifier.py", line 115, in fsurdat_modifier
    modify_fsurdat.set_idealized()  # set 2D variables
  File "/home/negins/ctsm_forsam/tools/modify_fsurdat/../../python/ctsm/modify_fsurdat/modify_fsurdat.py", line 285, in set_idealized
    self.setvar_lev0('FMAX', max_sat_area)
  File "/home/negins/ctsm_forsam/tools/modify_fsurdat/../../python/ctsm/modify_fsurdat/modify_fsurdat.py", line 230, in setvar_lev0
    self.file[var] = self.file[var].where(
  File "/home/negins/miniconda3/envs/neon/lib/python3.9/site-packages/xarray/core/common.py", line 1291, in where
    return ops.where_method(self, cond, other)
  File "/home/negins/miniconda3/envs/neon/lib/python3.9/site-packages/xarray/core/ops.py", line 176, in where_method
    return apply_ufunc(
  File "/home/negins/miniconda3/envs/neon/lib/python3.9/site-packages/xarray/core/computation.py", line 1165, in apply_ufunc
    return apply_dataarray_vfunc(
  File "/home/negins/miniconda3/envs/neon/lib/python3.9/site-packages/xarray/core/computation.py", line 274, in apply_dataarray_vfunc
    args = deep_align(
  File "/home/negins/miniconda3/envs/neon/lib/python3.9/site-packages/xarray/core/alignment.py", line 435, in deep_align
    aligned = align(
  File "/home/negins/miniconda3/envs/neon/lib/python3.9/site-packages/xarray/core/alignment.py", line 323, in align
    raise ValueError(f"indexes along dimension {dim!r} are not equal")
ValueError: indexes along dimension 'lsmlat' are not equal

@slevis-lmwg
Contributor

@negin513 our conversation led to a breakthrough, thank you:
I had been loading python with
module load lang/python/3.7.0
After our conversation I looked for ways to update xarray, so I unloaded python and activated the ctsm conda environment (just to try):
conda activate ctsm
This worked! Now the tool generates an fsurdat that ncdump doesn't complain about.

PS. I don't get the error that we got when you ran the tool on izumi, which is a relief because I also don't get the error on cheyenne. I'm not sure what we did wrong that you got the error, but I don't think that we should worry about it, since my version works. Thanks, Negin!

@slevis-lmwg
Contributor

I'm not sure what we did wrong that you got the error

I'm almost certain that I pointed you to the wrong branch. (Again, it doesn't matter, but the thought was bugging me...)

@negin513
Contributor

Perfect! Glad it helped. 👍

@slevis-lmwg
Contributor

Sanity check:
I have also confirmed that the fsurdat generated on izumi is bfb identical to the fsurdat generated on cheyenne.

@slevis-lmwg
Contributor

slevis-lmwg commented Aug 3, 2022

@ekluzek my branch is located on izumi here:
/fs/cgd/data0/slevis/git/mksurfdata_toolchain/tools/modify_input_files

conda activate ctsm_py uses xarray 0.16
and generates the unreadable .nc file, while

module unload lang/python
conda activate ctsm

has xarray 0.20 and generates a good .nc file.

Rather than looking into the strange behavior requiring the module unload, we should focus on upgrading ctsm_py's xarray to 0.20.
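Because the outcome depends on which interpreter and which xarray are actually picked up, it can help to print both before running the tool. The following is a sketch under stated assumptions: the helper names are hypothetical, the `(0, 20)` minimum simply mirrors the version mentioned above, and `importlib.metadata` requires Python 3.8 or newer, so it would not run under the python 3.7.0 module.

```python
import sys
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version, parts=2):
    """Convert '0.20.1' to (0, 20); parsing stops at a non-numeric component."""
    numbers = []
    for piece in version.split(".")[:parts]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def meets_minimum(version, minimum=(0, 20)):
    """True if the version string is at least the given minimum tuple."""
    return version is not None and version_tuple(version, len(minimum)) >= minimum


found = installed_version("xarray")
print("interpreter:", sys.executable)
print("xarray:", found, "(ok)" if meets_minimum(found) else "(missing or too old)")
```

Printing `sys.executable` shows whether the module-provided python or the conda environment's python is the one in use.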

@slevis-lmwg
Contributor

@ekluzek here are the contents of the ctsm conda environment on izumi:

# packages in environment at /home/slevis/anaconda/envs/ctsm:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2021.10.8            ha878542_0    conda-forge
cftime                    1.5.1.1          py38h6c62de6_1    conda-forge
curl                      7.80.0               h494985f_1    conda-forge
hdf4                      4.2.15               h10796ff_3    conda-forge
hdf5                      1.12.1          nompi_h7f166f4_103    conda-forge
importlib-metadata        4.8.2            py38h578d9bd_0    conda-forge
importlib_metadata        4.8.2                hd8ed1ab_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
krb5                      1.19.2               h48eae69_3    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
libblas                   3.9.0           12_linux64_openblas    conda-forge
libcblas                  3.9.0           12_linux64_openblas    conda-forge
libcurl                   7.80.0               h494985f_1    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 11.2.0              h1d223b6_11    conda-forge
libgfortran-ng            11.2.0              h69a702a_11    conda-forge
libgfortran5              11.2.0              h5c6108e_11    conda-forge
libgomp                   11.2.0              h1d223b6_11    conda-forge
liblapack                 3.9.0           12_linux64_openblas    conda-forge
libnetcdf                 4.8.1           nompi_hb3fd0d9_101    conda-forge
libnghttp2                1.43.0               ha19adfc_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
libssh2                   1.10.0               ha35d2d1_2    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_11    conda-forge
libzip                    1.8.0                h1c5bbd1_1    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
netcdf4                   1.5.8           nompi_py38h2823cc8_101    conda-forge
numpy                     1.21.4           py38he2449b9_0    conda-forge
openssl                   3.0.0                h7f98852_2    conda-forge
pandas                    1.3.4            py38h43a58ef_1    conda-forge
pip                       21.3.1             pyhd8ed1ab_0    conda-forge
python                    3.8.12          hf930737_2_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.8                      2_cp38    conda-forge
pytz                      2021.3             pyhd8ed1ab_0    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
setuptools                59.4.0           py38h578d9bd_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.37.0               h9cd32fc_0    conda-forge
tk                        8.6.11               h27826a3_1    conda-forge
typing_extensions         4.0.1              pyha770c72_0    conda-forge
wheel                     0.37.0             pyhd8ed1ab_1    conda-forge
xarray                    0.20.1             pyhd8ed1ab_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zipp                      3.6.0              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h36c2ea0_1013    conda-forge
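To pull the few relevant entries out of a long listing like the one above, the text can be parsed into a dictionary. This is a minimal sketch that assumes whitespace-separated name, version, build and channel columns as shown; `parse_conda_list` is a hypothetical helper, and the sample rows are copied from the listing.

```python
def parse_conda_list(text):
    """Map package name to (version, build, channel) from `conda list` style text."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        build = fields[2] if len(fields) > 2 else None
        channel = fields[3] if len(fields) > 3 else None
        packages[fields[0]] = (fields[1], build, channel)
    return packages


sample = """\
# Name                    Version                   Build  Channel
hdf5                      1.12.1          nompi_h7f166f4_103    conda-forge
libnetcdf                 4.8.1           nompi_hb3fd0d9_101    conda-forge
netcdf4                   1.5.8           nompi_py38h2823cc8_101    conda-forge
python                    3.8.12          hf930737_2_cpython    conda-forge
xarray                    0.20.1             pyhd8ed1ab_0    conda-forge
"""

env = parse_conda_list(sample)
for name in ("python", "xarray", "netcdf4", "libnetcdf", "hdf5"):
    print(name, env[name][0])
```

The same dictionary makes it easy to spot that this environment carries its own python (3.8.12) rather than the 3.7.0 from the module.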

@ekluzek
Collaborator

ekluzek commented Aug 8, 2022

@slevisconsulting thanks for reporting on your exact conda environment. I played around with it to try to get a version of pu create env that would work. I couldn't get it to work for python 3.7.0, no matter which settings for the other packages I tried based on your environment. An important part here is that the python version in the environment is 3.8. You are really using 3.7.0, but the conda environment thinks it's different from that. So even the working version is somewhat malformed.

But this is really due to issues with the installed conda version on CGD systems. And I'm working with Joseph to fix those issues.

@ekluzek
Collaborator

ekluzek commented Feb 22, 2024

I think this will be resolved in the new conda environment on izumi at this point.
