Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

erroneous mapping involving HDMAlk ROF grid #339

Closed
nmizukami opened this issue Jan 26, 2023 · 47 comments
Closed

erroneous mapping involving HDMAlk ROF grid #339

nmizukami opened this issue Jan 26, 2023 · 47 comments
Labels
bug cesm-coupling For cesm coupling high priority Need immediate attention and fix

Comments

@nmizukami
Copy link
Collaborator

nmizukami commented Jan 26, 2023

This is only for CTSM-mizuRoute, not applicable to standalone mizuRoute

ROF grid- rHDMAlk (HDMA-lake) cause erroneous runoff mapping. large scale pattern seems to be captured, but lots of speckles.
See examples of remapped runoff (1 month mean runoff) below.

  • 1.0 degree CTSM grid (f09_f09) -> HDMA-lake (rHDMAlk)
    mizuRoute_f09_f09_mg17_rHDMAlk_ro_catch_lake_01

  • 1.0 degree CTSM grid (f09_f09) -> HDMA (rHDMA)
    mizuRoute_f09_f09_mg17_rHDMA_ro_catch_no_lake_01


  • 2 degree CTSM grid (f19_f19) -> HDMA-lake (rHDMAlk)
    mizuRoute_f19_f19_mg17_rHDMAlk_ro_catch_lake_01

  • 2 degree CTSM grid (f19_f19) -> HDMA (rHDMA)
    mizuRoute_f19_f19_mg17_rHDMA_ro_catch_no_lake_01

@nmizukami nmizukami added bug cesm-coupling For cesm coupling high priority Need immediate attention and fix labels Jan 26, 2023
@nmizukami
Copy link
Collaborator Author

nmizukami commented Feb 23, 2023

We also set up smaller regional test case (Pacific Northwest) by subsetting HDMAlk grid (including lake) and HDMA (no lake) and show similar incorrect runoff mapping to global for HDMAlk grid. Larger scale pattern is correct, but there are some random hrus where mapping is not correct.

HDMA remapped HRU runoff:
Screen Shot 2023-02-28 at 2 58 03 PM

HDMAlk remapped HRU runoff:
Screen Shot 2023-02-28 at 2 57 46 PM

Note: For both ROF grids, we pre-computed ESMF weight file and centroid square mesh (to provide center coordinate of HRUs) as inputs to NOPC to remap CLM runoff to mizuRoute HRUs in the same ways.

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 1, 2023

I brought this up with the CSEG group yesterday and we discussed it. The ESMF folks are interested in helping us out, if it looks like it's an ESMF problem.

These are the notes from that meeting on this issue...

The mapping files are created with an offline tool (not ESMF). It works for Naoki in an offline setup (which doesn't use CMEPS / ESMF), but gives weird patterns when running in CMEPS / ESMF.

Bill: Could you try using the check_maps tool to determine if there is a problem in the mapping file itself? Erik will try this, and then if it seems like the mapping file is fine, will reach out to Bob Oehmke for help."

I asked about what check_maps does, and it sends several analytic waveforms through and uses the mapping file to remap them. It can then assess what the errors are, since it's an analytic solution.. A couple people asked about looking at the actual mapping files so I'll add those in at some point.

@nmizukami
Copy link
Collaborator Author

nmizukami commented Mar 1, 2023

ESMF mapping file is located:

HDMA-lake over Pacific Northwest region (not working)
/glade/work/mizukami/mizuRoute_data/ctsm_test/hdma-lake_pnw/map_fv0.9x1.25_pnw_TO_HDMA-lake_pnw.nc (from land to river), map_HDMA-lake_pnw_TO_fv0.9x1.25_pnw.nc (river to land)

HDMA over Pacific Northwest region (works fine)
/glade/work/mizukami/mizuRoute_data/ctsm_test/hdma_pnw/map_fv0.9x1.25_pnw_TO_HDMA_pnw.nc (from land to river), map_HDMA_pnw_TO_fv0.9x1.25_pnw.nc(river to land)

potential problems

  1. mapping file itself
  2. ESMF/CESM reading mapping file
  3. ESMF mesh (cetroid square mesh)

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 1, 2023

In trying check_maps on the above two mapping files it fails with an error as follows. I get a similar error for some of the other mapping files that do work as well. Some of the simpler grids do work however.

./check_map.sh -v /glade/work/mizukami/mizuRoute_data/ctsm_test/hdma_pnw/map_fv0.9x1.25_pnw_TO_HDMA_pnw.nc /glade/work/mizukami/mizuRoute_data/ctsm_test/hdma-lake_pnw/map_fv0.9x1.25_pnw_TO_HDMA-lake_pnw.nc
Wed Mar  1 16:05:52 MST 2023
1: /glade/work/mizukami/mizuRoute_data/ctsm_test/hdma_pnw/map_fv0.9x1.25_pnw_TO_HDMA_pnw.nc
 ESMF_Initialized!
 ESMF_LogSet!
 ESMF_UtilGetArg!
 opening 
 /glade/work/mizukami/mizuRoute_data/ctsm_test/hdma_pnw/map_fv0.9x1.25_pnw_TO_HD
 MA_pnw.nc
 NCFileInquire!
 NCFileInquire done
   src grid size =           13          14
   dst grid size =         1827           1
 GridReadCoords!
 GridReadCoords done
 Test            1 : Initial test -- 2 + cos^2(lat) * cos(2 * lon)
 Test            2 : Constant test -- 2
 Test            3 : Higher Wave Number  test -- 2 + cos^2(lat) * cos(
           8 * lon)
 Test            4 : Higher Wave Number  test -- 2 + cos^2(lat) * cos(
          16 * lon)
 Test            5 : Higher Wave Number  test -- 2 + cos^2(lat) * cos(
          32 * lon)
 Test Fields Setup done
 using local smm
           6 : ESMF_FieldRegridReadSCRIPFileP!
 number of weights =         2950
 writing ./test_map_fv0.9x1.25_pnw_TO_HDMA_pnw.nc
analyzing field    1 of    5
 ERROR: the test did not successfully map any values 
 from the source grid to the destination grid
           0  of            0  tests failed. See above for details.
-----
2: /glade/work/mizukami/mizuRoute_data/ctsm_test/hdma-lake_pnw/map_fv0.9x1.25_pnw_TO_HDMA-lake_pnw.nc
 ESMF_Initialized!
 ESMF_LogSet!
 ESMF_UtilGetArg!
 opening 
 /glade/work/mizukami/mizuRoute_data/ctsm_test/hdma-lake_pnw/map_fv0.9x1.25_pnw_
 TO_HDMA-lake_pnw.nc
 NCFileInquire!
 NCFileInquire done
   src grid size =           13          14
   dst grid size =         1868           1
 GridReadCoords!
 GridReadCoords done
 Test            1 : Initial test -- 2 + cos^2(lat) * cos(2 * lon)
 Test            2 : Constant test -- 2
 Test            3 : Higher Wave Number  test -- 2 + cos^2(lat) * cos(
           8 * lon)
 Test            4 : Higher Wave Number  test -- 2 + cos^2(lat) * cos(
          16 * lon)
 Test            5 : Higher Wave Number  test -- 2 + cos^2(lat) * cos(
          32 * lon)
 Test Fields Setup done
 using local smm
           6 : ESMF_FieldRegridReadSCRIPFileP!
 number of weights =         3012
 writing ./test_map_fv0.9x1.25_pnw_TO_HDMA-lake_pnw.nc
analyzing field    1 of    5
 ERROR: the test did not successfully map any values 
 from the source grid to the destination grid
           0  of            0  tests failed. See above for details.
-----

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 1, 2023

Of the mapping files we have already created the mapping files that the check_maps will work on are the following (just considering the ROF to LND maps):

/glade/p/cesmdata/cseg/inputdata/rof/mizuRoute/gridmaps/map_HDMAmz_5x5_amazon_TO_5x5_amazon_aave.201028.nc /glade/p/cesmdata/cseg/inputdata/rof/mizuRoute/gridmaps/map_HDMAmz_CONUS_TO_0.125nldas2_aave.201104.nc /glade/p/cesmdata/cseg/inputdata/rof/mizuRoute/gridmaps/map_HDMAmz_TO_f19_aave.200901.nc /glade/p/cesmdata/cseg/inputdata/rof/mizuRoute/gridmaps/map_USGS_GFmz_TO_0.125nldas2_aave.201028.nc /glade/p/cesmdata/cseg/inputdata/rof/mizuRoute/gridmaps/map_USGS_GFmz_TO_f19_aave.201002.nc

The script does say that some of it's tests fail, but they at least run.

@billsacks
Copy link
Member

I haven't used check_maps in a long time, so I'm not sure off-hand what the errors you're getting indicate. As for the failures in the other files, if the failures are due to exceeding tolerances and it doesn't look like the errors are too large, I think that can sometimes be acceptable: If I remember correctly (from conversations with @mnlevy1981) the tolerances were derived from some trial and error, and can sometimes be too tight, e.g., when the source and destination grids have very different resolutions.

@oehmke
Copy link

oehmke commented Mar 2, 2023

I looked at the mapping files and didn't see anything obviously wrong, but often it's hard to tell that just by inspection. If this is working for another application using these weight files, the problem could just be an inconsistency in how the weight files are being interpreted on both sides. One approach to figuring this out would be to choose one of the "bad" cells and then track it through and figure out why that particular one is bad. E.g. look at the weight file and see what it's saying about that cell, etc.

It looks like there are cells that are supposed to be 0.0 in the destination, but aren't. Is that true? Can you tell me the id of one of those cells. I could look at at what the weight file says about that cell and that might give a clue.

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 2, 2023

@oehmke thanks for the suggestions.

One clarification is that standalone mizuRoute uses mapping files that these mapping files were created from. So they aren't the same files -- but these files were created from the ones that work for standalone mizuRoute. Which means one possibility is there is some error in the conversion process. But, the conversion process works for other cases, so we aren't sure what's different about these cases or why it works there and fails here.

And yes there are some cells that should be zero in the destination grid because it's outside of the CTSM domain. @nmizukami can you give out some ID's of a cell that's like that?

@mnlevy1981
Copy link

If I remember correctly (from conversations with @mnlevy1981) the tolerances were derived from some trial and error, and can sometimes be too tight

That's right -- the errors should get larger as either grid gets coarser, and the tolerances were basically set with the assumption that the maps CESM used in 2012 when we were developing the tool should all pass... so anything coarser than some of the coarsest maps from that period of time will probably "fail" even if the maps are doing exactly what is advertised. The exception is that the conservation check applied to aave maps should pass regardless of resolution.

Also, I don't really know what to expect from check_maps with regional maps. It may have some assumptions about both grids being global built in.

@mnlevy1981
Copy link

One clarification is that standalone mizuRoute uses mapping files that these mapping files were created from. So they aren't the same files -- but these files were created from the ones that work for standalone mizuRoute

So the pairs of plots in the initial post were comparing a mizuRoute mapping file and a CESM mapping file generated from that mapping file? Could you make a similar pair of plots using a case where you were happy with the CESM mapping file generated from the mizuRoute file? And have you provided links to the mizuRoute files? Maybe comparing the structure of the mizuRoute file that results in a bad CESM map to the structure of a file the results in a good map would be informative.

@nmizukami
Copy link
Collaborator Author

Thank you for the thoughts. Yes, I can identify some ID of HRU and associated grid cells.

@mnlevy1981, my initial plots are all from remapped runoff in CESM (CTSM) using ESMF mapping files (which are converted from stand-alone mizuRoute mapping files). I have not posted any stand-alone mizuRoute remapped runoff here.

So we have two mizuRoute grid files - one is catchment polygon only (which remap correctly in CESM), and another one is catchments + lake polygons (which does not). All the stand-alone remapped runoff (for both grids) look correct.

@nmizukami
Copy link
Collaborator Author

Hi @ekluzek, I just noticed that the mesh file you created (/glade/work/mizukami/mizuRoute_data/ctsm_test/hdma-lake_pnw/ESMF_mesh_14x13pt_PacificNWUS_c230215.nc) is shifted by one grid box in latitude direction from the domain I used to create standalone weight file (/glade/work/mizukami/py_mapping/HDMA/fv0.9x1.25_pnw/fv0.9x1.25_pnw.nc).

The Domain I subset (fv0.9x1.25_pnw.nc) and its SW and NE corner looks like this:
Screen Shot 2023-03-03 at 11 07 53 AM

But the image from mesh file is
Screen Shot 2023-03-03 at 11 08 43 AM

you can see the great salt lake is at the bottom row in my domain, but in one above from the bottom in the mesh.

I don't think this causes the wrong mapping (because runoff from CLM is mapped correctly to HDMA PNW (no lake) grid).

@oehmke
Copy link

oehmke commented Mar 3, 2023

Could it be that the wrong version of the lake mesh is being used in the code that isn't giving the right results, but the correct mesh is being used for the code that is working? Having the wrong source or destination mesh for a given set of weights could produce strange results.

@nmizukami
Copy link
Collaborator Author

nmizukami commented Mar 3, 2023

Hi Bob, So little complicated to explain this, but... the stand-alone mapping file does not use the grid box coordinates, but just use indices of lat and lon grid (==the domain I created, not domain from Mesh Erik created) to locate which grid box(es) are overlapped. When I convert it to the ESMF mapping file, I took both CLM mesh (this is one row shifted) and HRU mesh (this is correct) to insert center coordinates (so mesh coordinate and mapping coordinates are consistent). So the CLM grid boxes picked by latitude index is one south below. As a result, the remapped runoff looks right, but if you look carefully, it look shifted by one grid box south.

It looks like there are cells that are supposed to be 0.0 in the destination, but aren't. Is that true? Can you tell me the id of one of those cells. I could look at at what the weight file says about that cell and that might give a clue.

So, looks like all HRUs are overlapped by gridbox at least partially. so I expect all the HRUs have some valid positive values.

I still think it is good to track one bad HRU and see which grid boxes are used to computed remapped values. not sure how we can do?

Here is one bad HRU:
This one is covered by only one grid box (so should be identical to the runoff value at that box)
HRU ID: 3017305.
HRU center coordinate: 241.63747004970298, 49.38593562863172
Source grid box center coordinate: 241.25, 48.53403091430664

This is a good HRU (located to next to the bad one):
HRU ID: 3019378.
HRU center coordinate: 241.4715637657551, 49.414995526782526
Source grid box center coordinate: 241.25, 48.53403091430664

@oehmke
Copy link

oehmke commented Mar 4, 2023 via email

@nmizukami
Copy link
Collaborator Author

Thank you Bob, yes, I think that would be helpful. I also put one good HRU (3019378) just located adjacent to the bad one (3017305). Both are supposed to be coming from the same source grid box. have a good weekend too.

@oehmke
Copy link

oehmke commented Mar 8, 2023 via email

@nmizukami
Copy link
Collaborator Author

Thank you Bob for looking into this. Yes, my question is also how the mapping file (src and dst locations) and mesh interact. As far as I have seen, there is no errors in both. One thing I am always careful about is the order of dest and src coordinate in mapping file is the same as Mesh.

@oehmke
Copy link

oehmke commented Mar 10, 2023 via email

@nmizukami
Copy link
Collaborator Author

nmizukami commented Mar 10, 2023

Hi Bob, thank you for the suggestion.

Erik, It does not looks like there are any calls for ESMF_FieldRegrid in mizuRoute nuopc f90 codes ($CTSM/components/mizuRoute/route/build/cpl). wondering if this call is somewhere outside the mizuRoute. If it is too complicated to access, I could add ESMF_FieldRegrid somewhere in mizuRoute nuopc. Maybe a place where the mapping file is read, but I could not find it (where is mapping file read??). Mesh is read here

@oehmke
Copy link

oehmke commented Mar 15, 2023

Instead of ESMF_FieldRegrid() it could also be ESMF_FieldSMM() (both apply a routehandle created from mapping weights). However, my goal is actually to just write the source Field and destination Field after applying the weights, so if you just want to add the ESMF_FieldWriteVTK() calls someplace after you know that has happened, that works too.

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 15, 2023

@nmizukami these calls are going to be inside of CMEPS. So we need to search inside the CMEPS external under components/cmeps. I'm looking at it to find the specific calls. But, there is a med_map_mod.F90 file that likely has all the remapping encapsulated inside of it. I'm going to try to find the specific call that covers LND to ROF mapping.

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 15, 2023

I think this is where the ESMF_FieldRegridCreate gets called from:

https://github.com/ESCOMP/CMEPS/blob/main/mediator/med_phases_prep_rof_mod.F90#L169

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 15, 2023

@oehmke we created a branch of CMEPS to do your suggestion. It seems to be having trouble finding ESMF_FieldWriteVTK. And I couldn't find documentation for that subroutine in ESMF documentation. Is it perhaps a different subroutine name you were thinking of?

@nmizukami
Copy link
Collaborator Author

Just follow up.....

we just added these lines

call ESMF_FieldBundleGet(FBlndAccum2rof_l, "Flrl_rofsur", field=lfield, rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return

call ESMF_FieldWriteVTK(lfield, fileName=“srcField”)

and got compiling error like:

error #5078: Unrecognized token '=?' skipped
error #6404: This name does not have a type, and must have an explicit type.   [FILENAME]

so we thought the arguments are inconsistent?

@billsacks
Copy link
Member

This may be missing the mark, but from your pasted text, there are curly quotes (unicode character) around srcField; could this be responsible for the first error you're getting?

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 15, 2023

@billsacks that is definitely a problem. Thanks for noticing that!

@nmizukami I committed that change and pushed it to the branch.

@oehmke
Copy link

oehmke commented Mar 15, 2023 via email

@nmizukami
Copy link
Collaborator Author

thanks everyone. we did not notice curly quotes... yes, it now compiled. I will run and then post the results.

@nmizukami
Copy link
Collaborator Author

Ok, Now I have created vtk files from each PET, but not sure how the content is interpreted... just googling around, cannot find more information on this. any suggestions??

@oehmke
Copy link

oehmke commented Mar 16, 2023 via email

@nmizukami
Copy link
Collaborator Author

Hi Bob, thank you! They are under /glade/scratch/mizukami/HDMA-lake_pnw/run

@oehmke
Copy link

oehmke commented Mar 16, 2023 via email

@oehmke
Copy link

oehmke commented Mar 18, 2023 via email

@nmizukami
Copy link
Collaborator Author

Hi Bob, thank you for the update! and also thank you for taking time looking at this. Have a good weekend!

@oehmke
Copy link

oehmke commented Mar 20, 2023 via email

@nmizukami
Copy link
Collaborator Author

Thanks Bob,

I believe the bad location is one-to-one corresponding at least LND to ROF direction.

It’s strange because the dst value isn’t 0

I expect the src values should be equal to dst value given the weight is one (so should not be 0?). A hru is 100% covered by one particular grid box? so yes, src value * src weight is supposed to be equal to dst value (and that is not happening).

Also the src value is surface runoff (so it can be zero) I believe.

@oehmke
Copy link

oehmke commented Mar 21, 2023 via email

@nmizukami
Copy link
Collaborator Author

Hi Bob,

We put ESMF_FieldWriteVKT() here in CMEP (ekluzek/CMEPS@3f22706).

There are several LND variables imported to ROF and they are bundled. So we don't see direct call for ESMF_FieldRegrid() nor ESMF_FieldSMM() here, but believed that call med_map_field_packed include either call. might need to trace this subroutine??

@oehmke
Copy link

oehmke commented Mar 21, 2023 via email

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 22, 2023

@oehmke thanks for the hints on tracking this down. @nmizukami and I just went through the driver code to track down what was happening and discovered what is undoubtedly "the issue". It looks like we had set the mapping file in the case, but there is a bug in CMEPS so that file was not used! So it was doing a screwy mapping from the CTSM grid, to our mesh with the little squares around the center points of the HRU's. It shows up as wrong for the lake cases since the center points can be displaced from the center of the lake HRU. It wasn't right in either case, but is more noticeably wrong in the lake case.

So I'm going to file some issues and get a bug fix for CMEPS which we'll try out and make sure is working correctly.

@ekluzek
Copy link
Collaborator

ekluzek commented Mar 22, 2023

The CMEPS issue is:

ESCOMP/CMEPS#346

@oehmke
Copy link

oehmke commented Mar 22, 2023 via email

@nmizukami
Copy link
Collaborator Author

Yes, it was great to find the bug, but one question I just started wondering about is:

Bob identified the source and target value for one bad case.

src value = 8.93301e-07
src weight = 1.0
dst value = 1.15568e-05

when cmeps use a square box around the centroid to pick up the src value, dst value is supposed to be the same as src value given src weight=1?

@oehmke
Copy link

oehmke commented Mar 23, 2023 via email

@nmizukami
Copy link
Collaborator Author

Ok! good news! Erik fixed CMEPS so it can read the mapping files and use them for remapping runoff, and now it maps the CTSM runoff to ROF catchment correctly.
Screen Shot 2023-03-29 at 4 04 47 PM

@ekluzek
Copy link
Collaborator

ekluzek commented Aug 9, 2023

Working now.

@ekluzek ekluzek closed this as completed Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug cesm-coupling For cesm coupling high priority Need immediate attention and fix
Projects
None yet
Development

No branches or pull requests

5 participants