Add initial sub-setting of regional source grids as part of the mkmapdata process #806

Closed
ekluzek opened this issue Sep 25, 2019 · 32 comments
Assignees
Labels
blocked: dependency (Wait to work on this until dependency is resolved)
closed: wontfix (We won't fix this issue, because it would be too difficult and/or isn't important enough to fix)
enhancement (new capability or improved behavior of existing capability)

Comments

@ekluzek
Collaborator

ekluzek commented Sep 25, 2019

This relates to #643 and is dependent on it.

In order for OCGIS to work reasonably for regional grids, a first step of the process needs to run OCGIS to sub-set the grids to just the area needed for the destination grid. This will improve performance of the process, especially in terms of memory.

@bekozi has this as an option in OCGIS, but he needs to get it working to preserve the global indices, because we'll need that for mksurfdata_map. So we are also dependent on that issue in OCGIS (NCPP/ocgis#494).
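
To illustrate the idea, here is a minimal Python sketch (not OCGIS code) of sub-setting source cell centers to a destination bounding box while keeping each cell's position in the global grid. The function name `subset_by_bbox`, the `buffer_deg` parameter, and the toy grid are hypothetical; real grids would also need cell corners, masks, and longitude-wrap handling.

```python
def subset_by_bbox(src_lon, src_lat, bbox, buffer_deg=0.0):
    """Select source cells whose centers fall inside a bounding box.

    Hypothetical helper for illustration only (not part of OCGIS).
    bbox is (lon_min, lat_min, lon_max, lat_max), in the same longitude
    convention as src_lon. Returns the global (0-based) indices of the
    kept cells plus their coordinates, so anything computed on the subset
    can be mapped back to the global grid.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    global_idx = [
        i
        for i, (lon, lat) in enumerate(zip(src_lon, src_lat))
        if lon_min - buffer_deg <= lon <= lon_max + buffer_deg
        and lat_min - buffer_deg <= lat <= lat_max + buffer_deg
    ]
    sub_lon = [src_lon[i] for i in global_idx]
    sub_lat = [src_lat[i] for i in global_idx]
    return global_idx, sub_lon, sub_lat


# Toy 4 x 8 global grid of cell centers, flattened row by row.
lons = [22.5 + 45.0 * j for j in range(8)]
lats = [-67.5, -22.5, 22.5, 67.5]
src_lon = [lon for _ in lats for lon in lons]
src_lat = [lat for lat in lats for _ in lons]

# Made-up destination box, in the (0, 360) longitude convention.
idx, sub_lon, sub_lat = subset_by_bbox(src_lon, src_lat, (280.0, -30.0, 320.0, 0.0))
print(idx, sub_lon, sub_lat)
```

The returned index list is the piece that matters for mksurfdata_map: it records where each subset cell sits in the global grid.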

@negin513 @slevisconsulting @bekozi

@ekluzek ekluzek added enhancement new capability or improved behavior of existing capability blocked: dependency Wait to work on this until dependency is resolved labels Sep 25, 2019
@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Sep 30, 2019
@billsacks billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Sep 30, 2019
@slevis-lmwg
Contributor

slevis-lmwg commented Nov 13, 2019

@bekozi sorry to hit you with two git issues at once...

I'm getting an error when I try subsetting with this script (on cheyenne):
/glade/work/slevis/ocgis_work/do-this-20191113.sh

I saw your comment that subsetting works in serial only (unless I misunderstood).

  • So I removed the mpirun command, but I get this error:
    CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications
  • So I changed the mpirun command to mpirun -np 1, and got this error:
    + mpirun -np 1 ocli chunked-rwg --source /glade/p/cesm/cseg/inputdata/lnd/clm2/mappingdata/grids/SCRIPgrid_64x128_nomask_c110308.nc --destination /glade/p/cesm/cseg/inputdata/lnd/clm2/mappingdata/grids/SCRIPgrid_1x1pt_vancouverCAN_nomask_c110308.nc --spatial_subset --no_genweights --spatial_subset_path /glade/work/slevis/ocgis_work//subsets/spatial_subset.nc
    /glade/work/slevis/git_ocgis/ocgis/src/ocgis/util/logging_ocgis.py:74: OcgWarning: env.USE_NETCDF4_MPI is False. Considerable performance gains are possible if this is True. Is netCDF4-python built with parallel support?
      warn(exc)
    Traceback (most recent call last):
      File "/glade/u/apps/ch/opt/python/3.6.8/gnu/8.3.0/pkg-library/20190627/bin/ocli", line 135, in chunked_rwg
        _write_spatial_subset_(rd_src, rd_dst, spatial_subset_path, src_resmax=src_resolution)
      File "/glade/u/apps/ch/opt/python/3.6.8/gnu/8.3.0/pkg-library/20190627/bin/ocli", line 251, in _write_spatial_subset_
        with grid_abstraction_scope(dst_field.grid, Topology.POLYGON):
      File "/glade/u/apps/ch/opt/python/3.6.8/gnu/8.3.0/lib/python3.6/contextlib.py", line 81, in __enter__
        return next(self.gen)
      File "/glade/work/slevis/git_ocgis/ocgis/src/ocgis/base.py", line 183, in grid_abstraction_scope
        orig_abstraction = grid.abstraction
    AttributeError: 'NoneType' object has no attribute 'abstraction'

I have tried pointing to a single-point DST file, a global DST file, as well as a 5x5pt_amazon DST file. Same error...

@slevis-lmwg
Contributor

@bekozi I got beyond the previous error, but I'm getting a new one:
  File "/glade/work/slevis/git_ocgis/ocgis/src/ocgis/spatial/geomc.py", line 651, in reduce_global
    raise ValueError('A coordinate index is required to reduce coordinates.')
I do not get an error when I run with your sample files:
ll1280x1280_grid.esmf.nc
ll1280x1280_grid.esmf.subset.nc

@slevis-lmwg
Contributor

My files are SCRIP while yours are ESMFMESH...

@slevis-lmwg
Contributor

@ekluzek I reread #648 and realized that @bekozi may not need to address the error that I reported above if I pursue #648 next. In that case #648 is blocking this issue (#806) and I should address it next.

@ekluzek
Collaborator Author

ekluzek commented Nov 14, 2019

@slevisconsulting Yep, makes sense. gen_domain requires SCRIP grid files, but the grid files for mkmapdata don't have to be SCRIP. And we are pretty certain that the ESMF unstructured mesh format will be faster, so there is a reason to switch to it.

@bekozi

bekozi commented Nov 14, 2019

@slevisconsulting I'll still take a look at the file. The issue looks metadata-related, and it will be good to understand the error regardless.

@slevis-lmwg
Contributor

@bekozi I have switched to using ESMFMESH SRC files instead of SCRIP SRC files, and this has helped with the sub-setting. For example this combination of SRC/DST files works:
SRC=/glade/p/cesm/cseg/inputdata/lnd/clm2/mappingdata/grids/UNSTRUCTgrid_0.9x1.25_nomask_c191114.nc
DST=/glade/p/cesm/cseg/inputdata/lnd/clm2/mappingdata/grids/SCRIPgrid_5x5pt_amazon_nomask_c110308.nc
However, the same setup does NOT generate a subset file when I change the SRC file to:
SRC=/glade/p/cesm/cseg/inputdata/lnd/clm2/mappingdata/grids/UNSTRUCTgrid_0.5x0.5_nomask_c191114.nc

Could you take a look, please? In the above test I use this script:
/glade/work/slevis/ocgis_work/do-this-20191115_subset_5x5.sh

@slevis-lmwg
Contributor

@bekozi in case this helps...
One difference I see between the UNSTRUCTgrid_0.9x1.25 and UNSTRUCTgrid_0.5x0.5 files is that longitudes span (0, 360) in the former and (-180, 180) in the latter. I assume that the code should handle either one.

@bekozi

bekozi commented Nov 18, 2019

@slevisconsulting I'll take a look. You are correct that wrapped/unwrapped coordinates should be handled.
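
For reference, converting between the two longitude conventions mentioned above is a simple modular shift. This is a generic sketch, not the OCGIS wrapping code; the function names `to_0_360` and `to_pm180` are made up for illustration.

```python
def to_0_360(lon):
    """Map a longitude in degrees onto the [0, 360) convention."""
    return lon % 360.0


def to_pm180(lon):
    """Map a longitude in degrees onto the [-180, 180) convention."""
    return ((lon + 180.0) % 360.0) - 180.0


# A point at 60 degrees west, expressed in both conventions.
print(to_0_360(-60.0))
print(to_pm180(300.0))
```

Note that with this choice 180 degrees maps to -180 in the second convention, so code comparing the two grids should treat the seam consistently.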

@bekozi

bekozi commented Nov 18, 2019

@slevisconsulting I pushed a fix to the ocgis master for this. The ESMF Unstructured format did not have a default coordinate system, so I set it to CF-Spherical. Your script runs to completion now. There was a related issue for this (NCPP/ocgis#492). Note, I am still looking into the index remapping for the subset<->global domain. I hope to leverage ESMF directly using arbitrary sequence indices. It will require some feature development for ESMF/ESMPy though.

@slevis-lmwg
Contributor

Thank you, @bekozi .

@slevis-lmwg
Contributor

@bekozi I'm encountering a new failure with this script:
/glade/work/slevis/ocgis_work/do-this-20191115_subset_1x1.sh

The second DST file that you'll see in there works, but the first one gives this error when running with the first SRC file:
/slevis/ocgis_work//subsets/spatial_subset_191115.nc" does not exist.
...and a different error when running with the second SRC file that you'll see in the script:
wrap action combination not supported

@bekozi

bekozi commented Nov 22, 2019

@slevisconsulting I'm pretty sure the first error ("/slevis/ocgis_work//subsets/spatial_subset_191115.nc" does not exist.) is caused by a missing /slevis/ocgis_work//subsets directory. If that directory is created, the script should work.

For the "wrap action combination not supported" error, is this using the UGRID file as SRC? Thanks!

@slevis-lmwg
Contributor

> @slevisconsulting I'm pretty sure the "/slevis/ocgis_work//subsets/spatial_subset_191115.nc" does not exist. error is caused by a missing /slevis/ocgis_work//subsets directory. If that directory is created, the script should work.

Hmm, I don't think so because:

  1. there is such a directory, and
  2. the script works when I uncomment the second DST file instead of the first.

> For the wrap action combination not supported, is this using the UGRID file as SRC? Thanks!

Not the UGRID file, but the UNSTRUCT file.

@slevis-lmwg
Contributor

So to summarize a bit more clearly:
The first DST file gives errors with both SRC files... One error with the UGRID SRC file and a different error with the UNSTRUCT SRC file.

The second DST file works with both SRC files.

@bekozi

bekozi commented Nov 22, 2019

Okay - thanks for the additional explanation. I wasn't quite sure how to reproduce the issues.

@bekozi

bekozi commented Nov 22, 2019

@slevisconsulting I believe the issue is an incorrect SRCTYPE in the script. If ESMFMESH is used for the UGRID file (which should be SRCTYPE="UGRID") then the metadata interpretation is thrown off. When setting the correct SRCTYPE, the script ran to completion for SCRIPgrid_1x1pt_camdenNJ_nomask_c110308.
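
A script could guard against this kind of mismatch by guessing the format before choosing SRCTYPE. The sketch below is hypothetical and not part of OCGIS: the function name is made up, and the variable names it checks (`nodeCoords`, `elementConn`, `numElementConn` for ESMF unstructured files; `grid_center_lat`, `grid_center_lon` for SCRIP; a variable carrying `cf_role = "mesh_topology"` for UGRID) are assumptions based on the usual layout of these conventions and should be verified against the format documentation.

```python
def guess_grid_format(variable_names, has_mesh_topology_attr=False):
    """Guess a grid file format from its netCDF variable names.

    Hypothetical helper. The caller is assumed to have already listed
    the file's variables (for example with ncdump -h) and checked
    whether any variable has cf_role == "mesh_topology".
    Returns "UGRID", "ESMFMESH", "SCRIP", or None if nothing matches.
    """
    names = set(variable_names)
    if has_mesh_topology_attr:
        return "UGRID"
    if {"nodeCoords", "elementConn", "numElementConn"} <= names:
        return "ESMFMESH"
    if {"grid_center_lat", "grid_center_lon"} <= names:
        return "SCRIP"
    return None


# Made-up variable listings for illustration.
print(guess_grid_format(["nodeCoords", "elementConn", "numElementConn", "centerCoords"]))
print(guess_grid_format(["grid_center_lat", "grid_center_lon", "grid_imask"]))
```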

@slevis-lmwg
Contributor

> When setting the correct SRCTYPE, the script ran to completion for SCRIPgrid_1x1pt_camdenNJ_nomask_c110308.

@bekozi this file has been working for me all along. Sorry for the confusion.

To better convey what I'm trying to say, I have split my script into four separate scripts:
subset_1x1_DOESNT_WORK_1.sh <--- Doesn't create a subset file so regridding crashes.
subset_1x1_DOESNT_WORK_2.sh <--- Creates subset file but regridding gives the "wrap action combination not supported" error
subset_1x1_WORKS_1.sh
subset_1x1_WORKS_2.sh

@bekozi

bekozi commented Nov 25, 2019

Thanks @slevisconsulting. I was able to reproduce the issue with your new scripts. I think I know what the problem is, and I will likely get a fix in after the holiday.

@bekozi

bekozi commented Nov 26, 2019

@slevisconsulting I pushed a fix for the UNSTRUCT failure. The script now generates a weight file with the spatial subset. Be sure to set NCHUNKS_DST=1 since the destination grid only has a single cell. There was an optimization that fixed the spatial wrapping for the subset geometry that I made more flexible.

From what I can tell, the destination grid has no spatial overlap with the source UGRID file. I added some better error handling to check the subcomm for empty subsets. Do you also think this is the case?

@slevis-lmwg
Contributor

> @slevisconsulting I pushed a fix for the UNSTRUCT failure. The script now generates a weight file with the spatial subset. Be sure and set NCHUNKS_DST=1 since the destination grid only has a single cell. There was an optimization that fixed the spatial wrapping for the subset geometry that I made more flexible.

@bekozi this one now generates the subset file but still doesn't generate a weight file for me. The new error says:
ValueError: ESMC_FieldRegridStoreFile() failed with rc = 18. Please check the log files (named "*ESMF_LogFile")
and in the ESMF_LogFile I get:
20191209 154729.965 ERROR PET0 ESMF_IOScrip.F90:211 ESMF_OutputWeightFile Operation not yet supported - "factorList" has size 0 and PET count is 1. There is nothing to write.
20191209 154729.966 ERROR PET0 ESMF_IOScrip.F90:136 ESMF_SparseMatrixWrite Operation not yet supported - Internal subroutine call returned Error
20191209 154729.966 ERROR PET0 ESMF_Field_C.F90:1269 f_esmf_regridstorefile Operation not yet supported - Internal subroutine call returned Error
20191209 154730.012 INFO PET0 Finalizing ESMF

> From what I can tell, the destination grid has no spatial overlap with the source UGRID file. I added some better error handling to check the subcomm for empty subsets. Do you also think this is the case?

I see what you mean. The lat/lon combination from the dst file does not seem to be present in this source UGRID file. Thank you for pointing that out :-)
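
A quick pre-check for this situation is to compare bounding boxes before attempting to subset or regrid. The sketch below is a generic illustration, not the error handling added to OCGIS; `bboxes_overlap` and the example boxes are made up, and both boxes are assumed to use the same longitude convention and not to cross the wrap seam.

```python
def bboxes_overlap(a, b):
    """Return True if two (lon_min, lat_min, lon_max, lat_max) boxes intersect.

    Hypothetical helper: assumes both boxes share one longitude convention
    and that neither box crosses the dateline or prime-meridian seam.
    """
    a_lon_min, a_lat_min, a_lon_max, a_lat_max = a
    b_lon_min, b_lat_min, b_lon_max, b_lat_max = b
    return (
        a_lon_min <= b_lon_max
        and b_lon_min <= a_lon_max
        and a_lat_min <= b_lat_max
        and b_lat_min <= a_lat_max
    )


# Made-up extents: a regional source mesh and a single-cell destination.
source_box = (230.0, 20.0, 300.0, 55.0)
dest_box = (10.0, 40.0, 11.0, 41.0)
if not bboxes_overlap(source_box, dest_box):
    print("Destination lies outside the source extent; no weights can be generated.")
```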

@bekozi

bekozi commented Dec 10, 2019

@slevisconsulting Unfortunately I cannot reproduce the error. Could you send me the esmf.mk file that ESMPy links to? You can get the makefile path with the command:

python -c "import ESMF; print(ESMF.interface.esmfmkfile.ESMFMKFILE)"

If they are useful for debugging, here are the two files my script produced:

It may be worthwhile for me to look at the script you are using. I know you've probably sent this to me already... Can you pass it along at your convenience?

@slevis-lmwg
Contributor

I did a git fetch and git pull in my git_esmf directory in case my code was old but got the same error.

I then tried
python -c "import ESMF; print(ESMF.interface.esmfmkfile.ESMFMKFILE)"
and got
/glade/work/turuncu/ESMF/8.0.0b40/lib/libO/Linux.intel.64.mpt.default/esmf.mk

I am using this script:
/glade/work/slevis/ocgis_work/subset_1x1_DOESNT_WORK_2.sh

@bekozi

bekozi commented Dec 12, 2019

Okay, we are using the same files. Does the spatial subset file I posted match the file generated with your script? Also, could you attach the makefile target to this thread? I would like to confirm which ESMF build you are using. It sounds like you are using the most recent ocgis. Thank you!

@bekozi

bekozi commented Dec 12, 2019

Did you set NCHUNKS_DST=1?

@slevis-lmwg
Contributor

Ok, so the spatial subset that you posted is identical to mine. And I'm using NCHUNKS_DST=1, though I got exactly the same error when I set it to 2.

Is this what you mean by "makefile target":
export PYTHONPATH="/glade/work/slevis/git_ocgis/ocgis/src/:/glade/work/slevis/git_esmf/esmf/src/addon/ESMPy/src/"

@bekozi

bekozi commented Dec 13, 2019

My apologies for the confusion regarding the makefile. I was hoping you could attach the actual file. However, it is easy enough for me to track it down on Cheyenne!

It's good news that the spatial subset files are identical. I'll do some more testing on this end with the weight generation.

@bekozi

bekozi commented Dec 13, 2019

I've reproduced the error on Cheyenne. It appears to be platform related. There is some simple Python code to detect whether there is more than one chunk, and it is failing (no idea why). When I remove --nchunks_dst ${NCHUNKS_DST} from the weight generation call, weights are generated. Will keep you posted.

@bekozi

bekozi commented Dec 13, 2019

Found the issue. I should have spotted it sooner. The run target variable CRWG="ocli chunked-rwg" should be CRWG="python /glade/work/slevis/git_ocgis/ocgis/src/ocgis/ocli.py chunked-rwg". Since we are using development code, we need to spawn the CLI directly from the Python file. Without the full path, the script is picking up the Cheyenne-installed ocli which is older than the dev version.

The script creates the weight file now. Most of the other changes were not in the CLI Python file which explains why past patches were picked up by your testing.
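
One way to catch this kind of mix-up early is to check which executable the shell would actually run. The following is a small standard-library sketch; `resolve_cli` is a hypothetical helper, and the checkout root shown is simply the path quoted earlier in this thread.

```python
import os
import shutil


def resolve_cli(name, expected_root):
    """Find which executable `name` resolves to on PATH.

    Hypothetical helper. Returns (resolved_path, is_under_expected_root);
    resolved_path is None when the command is not on PATH at all.
    """
    found = shutil.which(name)
    if found is None:
        return None, False
    real = os.path.realpath(found)
    root = os.path.realpath(expected_root)
    return real, real == root or real.startswith(root + os.sep)


# Check whether "ocli" comes from the development checkout.
path, is_dev = resolve_cli("ocli", "/glade/work/slevis/git_ocgis")
if path is None:
    print("ocli is not on PATH")
elif not is_dev:
    print("ocli resolves to", path, "which is outside the development checkout")
```

If the check reports a system-installed copy, invoking the CLI through the Python file in the checkout, as described above, sidesteps the problem.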

@slevis-lmwg
Contributor

Thank you, @bekozi .

@slevis-lmwg
Contributor

As far as I can tell, the weight files (aka map_ files) for regional/point destination grids look correct now, other than the indexing issue.

@ekluzek ekluzek added the closed: wontfix We won't fix this issue, because it would be too difficult and/or isn't important enough to fix label Apr 14, 2022
@ekluzek
Collaborator Author

ekluzek commented Apr 14, 2022

This becomes obsolete with PR #1663. So closing as a wontfix.

@ekluzek ekluzek closed this as completed Apr 14, 2022