
Expand write grid component to output moving nested domain properly #1053

Closed
BinLiu-NOAA opened this issue Feb 16, 2022 · 42 comments
Labels: enhancement (New feature or request)

@BinLiu-NOAA (Contributor)

BinLiu-NOAA commented Feb 16, 2022

Description

With issue #555 resolved through PR #1044, the write grid component within ufs-weather-model/FV3atm can now properly output both the parent and static nested domains/grids. However, when outputting a moving nest domain/grid, the output fields are at the wrong locations, because the forecast grid component's compute domain moves during the model integration (e.g., following a tropical cyclone).

Solution

Discussions among the ESMF team, the EMC EIB and hurricane project teams, and AOML/HRD collaborators have produced a strategy to expand the current write grid component capability to also support moving-nest output. The ESMF team is implementing this capability, with assistance from the other groups mentioned above.

Alternatives

N/A

Related to

Issue #555 and PR #1044 (static nested-domain output); PR #1104 (HAFS nest coupling).

@theurich (Collaborator)

Tiny issue in the description: there is a typo in "ESMF" that should be corrected. I don't think I am able to edit the description.

@theurich (Collaborator)

@BinLiu-NOAA - how can I access the updated grid coordinates for moving nests from the module_fcst_grid_comp.F90 level? I tried Atmos%lon(:,:), Atmos%lat(:,:), Atmos%lon_bnd(:,:), and Atmos%lat_bnd(:,:), but these do not seem to change throughout the run. This is for a domain that has Atmos%is_moving_nest==.true., so it should be moving (and I do see the data shifting). Are the updated coordinate arrays available?

@theurich (Collaborator)

@BinLiu-NOAA - I think I figured it out. Under subroutine update_atmos_model_state() I added calls into atmosphere_grid_bdry() and atmosphere_grid_ctr(); those routines are needed to update the coord arrays in the Atmos structure.
I now see the expected moving nest in the output! I will do a bit of code clean-up, push this to a branch, and let you know. I will also point you to the ESMF library I am using on Hera to test this out, since we do not have an ESMF snapshot ready yet that contains the required changes.

@BinLiu-NOAA (Contributor, Author)

> Tiny issue in the description, there is a typo in ESMF that should be corrected. I don't think I am able to edit the description.

@theurich, Good catch! Fixed!

@BinLiu-NOAA (Contributor, Author)

> @BinLiu-NOAA - I think I figured it out. Under subroutine update_atmos_model_state() I added calls into atmosphere_grid_bdry() and atmosphere_grid_ctr(). I see that those routines are needed to update the coord arrays in the Atmos structure. I now see the expected moving nest in the output! I will do a bit of code clean-up and then push this to a branch and let you know. I will also point you to the ESMF library I am using on Hera to test this out... since we do not have an ESMF snapshot ready yet that contains the required changes.

@theurich, great that you figured it out, and thanks for the updates! Feel free to let us know once you have a version of ESMF and a code branch for us to test. Have a nice weekend!

@theurich (Collaborator)

@BinLiu-NOAA - I just pushed the initial moving-nest output changes to https://github.com/hafs-community/fv3atm/tree/feature/hafs_nest_cpl_moving. Running this currently requires the ESMF installation under /scratch1/NCEPDEV/nems/Gerhard.Theurich/Moving-Nests-HAFS-1/esmf/lib/libO/Linux.intel.64.intelmpi.default on Hera.

@junwang-noaa (Collaborator)

With the code changes there is an issue with GOCART: all the GOCART tests failed. Here is some of the conversation with the ESMF group:

From Jun:
I am looking at the MAPL error in HAFS PR #1104; it seems we define the forecast grid differently in fv3atm. The error message in the PR shows:

105: pe=00105 FAIL at line=02371 Base_Base_implementation.F90 <status=509>

The line 2371 in Base_Base_implementation.F90 is this:

https://github.com/GEOS-ESM/MAPL/blob/v2.11.0/base/Base/Base_Base_implementation.F90#L2371

The related code is:
call ESMF_GridGetCoord(grid,localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &
farrayPtr=corner, rc=status)
imc=size(corner,1)
jmc=size(corner,2)
allocate(ptr(0:imc+1,0:jmc+1),source=0.0d0,stat=status)
_VERIFY(status)
field = ESMF_FieldCreate(grid,ptr,staggerLoc=ESMF_STAGGERLOC_CORNER,totalLWidth=[1,1],totalUWidth=[1,1],rc=status)
_VERIFY(status)

Would the ESMF bundle definition require changes in the MAPL call? Thanks

From Gerhard/Rocky:

I think the HAFS issue probably has to do with the fact that, instead of creating separate allocations for the Grid coords, we now point directly to the allocations that the Atmos structure (not 100% sure of the name, but I think that is it) provides. This way, when the coord values in that structure change, the ESMF Grid coords automatically change as well, without the need to constantly copy updated values. And since we had already switched to internal radians (from the previous degrees), this was straightforward. I cannot remember whether we had to change the indexflag from DE_LOCAL to GLOBAL, or the other way around. Maybe that is the issue; I don't know, but I suspect something like that.

From Jun:
My understanding is that the issue is not related to the write grid component; rather, the forecast grid defined in the forecast grid component is shared by the GOCART grid comp. Also, the global forecast grid does not change, so allocation of the Atmos structure should not be an issue. I am more worried about the forecast grid definition change, as that may require changes on the GOCART side, since it accesses the same grid. Bin also started an email chain with Dan; let's move the discussion there. Thanks.
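The by-reference behavior Gerhard describes can be illustrated with a toy Python sketch; this is a model of the reference-vs-copy semantics only, not ESMF code, and all names in it are illustrative:

```python
# Toy model (not ESMF): attaching coordinates "by reference" versus copying
# them once. In the HAFS branch the ESMF Grid coords point directly at the
# Atmos arrays, so a nest move is visible immediately; a one-time copy at
# initialization would leave the Grid coords stale.

class Grid:
    """Stand-in for an ESMF Grid's coordinate storage."""
    def __init__(self):
        self.coords = None

atmos_lon = [10.0, 11.0, 12.0]        # stand-in for Atmos%lon

grid_by_ref = Grid()
grid_by_ref.coords = atmos_lon        # shares the same list object (pointer-like)

grid_by_copy = Grid()
grid_by_copy.coords = list(atmos_lon) # one-time copy at "initialization"

# The moving nest shifts: Atmos%lon is updated in place.
for i in range(len(atmos_lon)):
    atmos_lon[i] += 0.5

print(grid_by_ref.coords)   # [10.5, 11.5, 12.5] -- follows the move
print(grid_by_copy.coords)  # [10.0, 11.0, 12.0] -- stale
```

The trade-off discussed in this thread is that the by-reference approach requires the shared arrays to have exactly the dimensions ESMF expects for each stagger location.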

@junwang-noaa (Collaborator)

junwang-noaa commented Apr 4, 2022

@danrosen25 @theurich I was able to get the same error code on the fv3atm side when I made the same calls after the field creation. The following changes were added inside subroutine SetServicesNest after line 256:

https://github.com/hafs-community/fv3atm/blob/feature/hafs_nest_cpl/module_fcst_grid_comp.F90#L256

!jwtest:
call ESMF_GridGetCoord(grid,localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &
farrayPtr=corner, rc=rc)
print *,'in fcst grid comp, getcoord, rc=',rc
imc=size(corner,1)
jmc=size(corner,2)
allocate(ptr(0:imc+1,0:jmc+1),source=0.0d0,stat=rc)
print *,'in fcst grid comp, imc=',imc,jmc
testfield = ESMF_FieldCreate(grid,ptr,staggerLoc=ESMF_STAGGERLOC_CORNER,totalLWidth=[1,1],totalUWidth=[1,1],rc=rc)
print *,'in fcst grid comp, testfield, rc=',rc

The testfield = ESMF_FieldCreate(...) line also gave ESMF error code 509, the same error code shown in the MAPL Base_Base_implementation.F90.

@junwang-noaa (Collaborator)

With the current ufs-weather-model develop branch, the same code was added after the global forecast grid is defined:

https://github.com/NOAA-EMC/fv3atm/blob/develop/module_fcst_grid_comp.F90#L709

!jwtest:
if (n==1 .and. top_parent_is_global) then
call ESMF_GridGetCoord(fcstGrid(1),localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &
farrayPtr=corner, rc=rc)
print *,'in fcst grid comp, getcoord, rc=',rc
imc=size(corner,1)
jmc=size(corner,2)
allocate(ptr(0:imc+1,0:jmc+1),source=0.0d0,stat=rc)
print *,'in fcst grid comp, imc=',imc,jmc
testfield = ESMF_FieldCreate(fcstGrid(1),ptr,staggerLoc=ESMF_STAGGERLOC_CORNER,totalLWidth=[1,1],totalUWidth=[1,1],rc=rc)
print *,'in fcst grid comp, testfield, rc=',rc
endif

0: in fcst grid comp, getcoord, rc= 0
0: in fcst grid comp, imc= 32 12
0: in fcst grid comp, testfield, rc= 0

From the HAFS branch:
0: in fcst grid comp, getcoord, rc= 0
0: in fcst grid comp, imc= 33 13
0: in fcst grid comp, testfield, rc= 509

It looks to me like the forecast grid in the HAFS branch has different corner dimensions from the grid defined in the current fv3atm develop branch.

@danrosen25 (Collaborator)

@junwang-noaa @theurich
Is there a change to the ESMF version being used?

@junwang-noaa (Collaborator)

@danrosen25 We are using the latest ESMF 830bs09. The same version works with the current develop branch, including the GOCART tests. The difference is in the fv3 forecast grid: in the HAFS branch the grid coordinates are defined directly from the dycore arrays via ESMF_Array, while the develop branch gets the dimensions from the grid and fills the coords with the dycore array values using those dimensions. I am updating the code to use the develop-branch approach and will report back if it fixes the issue.

@junwang-noaa (Collaborator)

junwang-noaa commented Apr 4, 2022

@danrosen25 @theurich After changing the forecast grid definition to use the ESMF coord dimensions, the fv3-gocart test is working. I will run the HAFS tests to see if there is any impact. I am curious why we set up the coordinates via an ESMF array built on the dycore arrays, instead of getting the coord dimensions from the ESMF grid.

git diff module_fcst_grid_comp.F90
+    integer               :: tlb(2), tub(2)
+    real(ESMF_KIND_R8),dimension(:,:), pointer :: latPtr, lonPtr
..
-
-    call ESMF_GridGet(grid, staggerloc=ESMF_STAGGERLOC_CENTER, distgrid=distgrid, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    array = ESMF_ArrayCreate(distgrid, farray=Atmos%lon, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    call ESMF_GridSetCoord(grid, coordDim=1, staggerLoc=ESMF_STAGGERLOC_CENTER, array=array, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    array = ESMF_ArrayCreate(distgrid, farray=Atmos%lat, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    call ESMF_GridSetCoord(grid, coordDim=2, staggerLoc=ESMF_STAGGERLOC_CENTER, array=array, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    call ESMF_GridGet(grid, staggerloc=ESMF_STAGGERLOC_CORNER, distgrid=distgrid, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    array = ESMF_ArrayCreate(distgrid, farray=Atmos%lon_bnd, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    call ESMF_GridSetCoord(grid, coordDim=1, staggerLoc=ESMF_STAGGERLOC_CORNER, array=array, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    array = ESMF_ArrayCreate(distgrid, farray=Atmos%lat_bnd, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
-
-    call ESMF_GridSetCoord(grid, coordDim=2, staggerLoc=ESMF_STAGGERLOC_CORNER, array=array, rc=rc)
-    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return
+    call ESMF_GridAddCoord(grid, staggerLoc=ESMF_STAGGERLOC_CENTER, rc=rc); ESMF_ERR_ABORT(rc)
+    call ESMF_GridAddCoord(grid, staggerLoc=ESMF_STAGGERLOC_CORNER, rc=rc); ESMF_ERR_ABORT(rc)
+
+    ! define "center" coordinate values
+    call ESMF_GridGetCoord(grid, coordDim=1, staggerLoc=ESMF_STAGGERLOC_CENTER, &
+                           totalLBound=tlb, totalUBound=tub, &
+                           farrayPtr=lonPtr, rc=rc); ESMF_ERR_ABORT(rc)
+    lonPtr(tlb(1):tub(1),tlb(2):tub(2)) = Atmos%lon(tlb(1):tub(1),tlb(2):tub(2))
+
+    call ESMF_GridGetCoord(grid, coordDim=2, staggerLoc=ESMF_STAGGERLOC_CENTER, &
+                           totalLBound=tlb, totalUBound=tub, &
+                           farrayPtr=latPtr, rc=rc); ESMF_ERR_ABORT(rc)
+    latPtr(tlb(1):tub(1),tlb(2):tub(2)) = Atmos%lat(tlb(1):tub(1),tlb(2):tub(2))
+
+    ! define "corner" coordinate values
+    call ESMF_GridGetCoord(grid, coordDim=1, staggerLoc=ESMF_STAGGERLOC_CORNER, &
+                           totalLBound=tlb, totalUBound=tub, &
+                           farrayPtr=lonPtr, rc=rc); ESMF_ERR_ABORT(rc)
+    lonPtr(tlb(1):tub(1),tlb(2):tub(2)) = Atmos%lon_bnd(tlb(1):tub(1),tlb(2):tub(2))
+    print *,'in define corner,tlb=',tlb(1),tlb(2),'tub=',tub(1),tub(2)
+
+    call ESMF_GridGetCoord(grid, coordDim=2, staggerLoc=ESMF_STAGGERLOC_CORNER, &
+                           totalLBound=tlb, totalUBound=tub, &
+                           farrayPtr=latPtr, rc=rc); ESMF_ERR_ABORT(rc)
+    latPtr(tlb(1):tub(1),tlb(2):tub(2)) = Atmos%lat_bnd(tlb(1):tub(1),tlb(2):tub(2))

@junwang-noaa (Collaborator)

@BinLiu-NOAA I ran all the coupled tests, one of the global atm-only tests, one of the regional tests, and all the HAFS tests; they all finished successfully. Would you please add the changes above to module_fcst_grid_comp.F90 in your HAFS branch and rerun the full RT suite to see if there is any issue? You may also run some HAFS moving-nest cases to confirm it works as before. Thanks.

@BinLiu-NOAA (Contributor, Author)

@junwang-noaa, this is great news! Thanks for looking into this and providing a fix/workaround! If you would like, you can commit your fix back to the feature/hafs_nest_cpl branch, for which you should already have the correct permissions. I will then sync it up, conduct a full regression test on Hera, and make sure all the RTs go through. Will let you know how it goes later today or early tomorrow. Thanks!
P.S. Feel free to let me know if you would like me to commit your fix back to the feature/hafs_nest_cpl branch.

@BinLiu-NOAA (Contributor, Author)

BinLiu-NOAA commented Apr 5, 2022

@junwang-noaa, I have committed your fix into the feature/hafs_nest_cpl branch for PR #1104 and also synced it with the latest develop branch. After that, I conducted the standard regression tests with hera.intel, and all the RTs ran through. Besides, this confirmed that the PR only changes the baseline results for the related HAFS RTs and does not change results for other RTs.

@BinLiu-NOAA (Contributor, Author)

@junwang-noaa, I checked the moving-nest output netCDF files a little more carefully. It looks like your fix above runs through; however, the output grid lat/lon are not updated. This is because, in your fix, the distgrid lon/lat arrays are no longer created from (pointed at) the Atmos%lon/lat structure. For example:
https://github.com/hafs-community/fv3atm/blob/feature/hafs_nest_cpl/module_fcst_grid_comp.F90

    array = ESMF_ArrayCreate(distgrid, farray=Atmos%lon, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
    if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return

So, it looks like we need to do the same thing for the distgrid with lat/lon and lat/lon_bnd.

Thanks!

@junwang-noaa (Collaborator)

@BinLiu-NOAA Thanks for the testing. Another way is to move my code updates inside the global if branch and keep the original code in the nest-grid if branch, since the global domain coordinates do not change.

@BinLiu-NOAA (Contributor, Author)

> @BinLiu-NOAA Thanks for the testing. I think another way is that we move the code updates within the global if statement and get the original code to nest grid if statement as the global domain coordinates will not change.

Thanks, @junwang-noaa! This sounds better, and I am testing this as you suggested.

@BinLiu-NOAA (Contributor, Author)

@junwang-noaa, the above method (moving your code change into the global if-condition section while using the distgrid method in the else (nested/regional) section) worked properly for the moving-nest output grid, with the output grid lat/lon updated. I am conducting the full regression tests now to confirm it still works for all other RTs (including the GOCART coupling ones). Will keep you posted on how it goes. Thanks!

In case you would like to take a look at the latest related code changes, please see starting from:
https://github.com/hafs-community/fv3atm/blob/feature/hafs_nest_cpl/module_fcst_grid_comp.F90#L193

@DusanJovic-NOAA (Collaborator)

> @junwang-noaa, the above method (moving your code change into the global if condition section while using the distgrid method in the else (nested/regional) section) worked properly for the moving-nesting output grid with updated output grid lat/lon. I am conducting the full regression tests now to make sure and confirm it will still work for all other RTs (including the GOCART coupling ones). Will keep you posted on how it goes. Thanks!
>
> In case you would like to take a look at the latest related code changes, please see starting from: https://github.com/hafs-community/fv3atm/blob/feature/hafs_nest_cpl/module_fcst_grid_comp.F90#L193

I do not like the fact that we use two different methods of setting the grid coordinates for different domains (global vs. non-global). It's inconsistent and confusing and complicates the code. Are the grid coordinate array bounds incorrect only for the global domain when distgrid is used?

@junwang-noaa (Collaborator)

junwang-noaa commented Apr 5, 2022

@DusanJovic-NOAA I completely agree. I am puzzled how the HAFS code, which defines an ESMF array with different coordinate dimensions, can map onto the internal ESMF grid coordinates. Also, it looks to me that the previous way (the code I updated) sets up the ESMF grid coordinates only once, while with the ESMF array approach the coordinates change whenever the internal arrays (atmos%lon(lat)_bnd) change. Maybe a better way is to create two new arrays, lon_bndesmf/lat_bndesmf, in Atmos, with dimensions matching the ESMF corner grid coordinates, and update them from lon_bnd/lat_bnd at each time step. @theurich @danrosen25 FYI.
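Jun's re-copy alternative can be sketched as a toy Python model; lon_bndesmf is the array name proposed in the comment above, and everything else here (sizes, helper names) is illustrative, not the real fv3atm code:

```python
# Sketch of the proposed alternative (not ESMF/fv3atm code): instead of
# pointing the ESMF coords at Atmos%lon_bnd, keep a second buffer sized
# like the ESMF corner coordinate region and re-copy from lon_bnd at each
# time step. The (33, 13) and (32, 12) shapes come from the thread above.

def copy_into_corner_buffer(lon_bnd, corner_shape):
    """Copy the (possibly larger) lon_bnd into a buffer matching the ESMF
    corner coordinate dimensions (nested lists stand in for Fortran arrays)."""
    ni, nj = corner_shape
    return [row[:nj] for row in lon_bnd[:ni]]

# Atmos%lon_bnd stand-in: 33 x 13 on every PE.
lon_bnd = [[float(i * 100 + j) for j in range(13)] for i in range(33)]

# lon_bndesmf stand-in: re-filled each step with the ESMF corner shape.
lon_bndesmf = copy_into_corner_buffer(lon_bnd, (32, 12))
print(len(lon_bndesmf), len(lon_bndesmf[0]))  # 32 12
```

The cost of this design is one copy per time step per coordinate array, in exchange for the ESMF grid always seeing arrays of exactly the expected shape.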

@BinLiu-NOAA (Contributor, Author)

> @junwang-noaa, the above method (moving your code change into the global if condition section while using the distgrid method in the else (nested/regional) section) worked properly for the moving-nesting output grid with updated output grid lat/lon. I am conducting the full regression tests now to make sure and confirm it will still work for all other RTs (including the GOCART coupling ones). Will keep you posted on how it goes. Thanks!
>
> In case you would like to take a look at the latest related code changes, please see starting from: https://github.com/hafs-community/fv3atm/blob/feature/hafs_nest_cpl/module_fcst_grid_comp.F90#L193

Just a quick follow-up: the full RTs passed on wcoss_dell, confirming that this only changes results for the HAFS-related RTs and does not change other RTs' baseline results.

@BinLiu-NOAA (Contributor, Author)

BinLiu-NOAA commented Apr 5, 2022

> @DusanJovic-NOAA I completely agree. I am puzzled how the HAFS code of defining an ESMF array with different coordinate dimensions can map to the internal ESMF grid coordinates. Also it looks to me that the previous way (the code I updated) only sets up the ESMF grid coordinates once, while using ESMF array to set up the coordinates, the coordinates will change when the internal array (atmos%lon(lat)_bnd) is changed, maybe a better way is to create two new arrays lon_bndesmf/lat_bndesmf in atmos with dimensions matching with the ESMF corner grid coordinates and update with lon_bnd/lat_bnd at each time step. @theurich @danrosen25 FYI.

@junwang-noaa and @DusanJovic-NOAA, it looks to me that the reason it did not work properly for the moving-nest output is that, inside the wrtgridcomp, the moving nest domain coordinates are queried from distgrid instead of from grid; for example, see the code sections after this line:
https://github.com/hafs-community/fv3atm/blob/feature/hafs_nest_cpl/io/module_wrt_grid_comp.F90#L1818

So, with this, I think that if we attach/update both grid's and distgrid's lon/lat[_bnd] for both global and non-global grids on the fcstgridcomp side, it should work for both GOCART coupling and moving-nest output.

Maybe an alternative is to query the grid's (instead of the distgrid's) lon/lat coordinates on the wrtgridcomp side. But I am not sure whether that would work properly for the moving-nest output grid.

Thanks!

@DusanJovic-NOAA (Collaborator)

@BinLiu-NOAA I'm not sure I understand what you are proposing. A distgrid does not hold coordinates (lat/lon values); a grid holds coordinates.
I also do not see where this is done: "it queries the moving nest domain coordinates from distgrid instead of from grid". Where exactly does it query coordinates from distgrid? I see it gets the coordinates from the grid. distgrid and grid are two completely different things.

@DusanJovic-NOAA (Collaborator)

One of the tests that failed (cpld_debug_p8) is C96 running on a 3x8 layout, which means each task has 32x12 grid points, i.e., 32x12 center coordinates and 33x13 corner coordinates. This is consistent with the results from "the HAFS branch"; see this comment (#1053 (comment)).

@BinLiu-NOAA (Contributor, Author)

> @BinLiu-NOAA I'm not sure I understand what you are proposing. Distgrid does not hold coordinates (lat/lon values). Grid holds coordinates. I also do not see where this is done, "it queries the moving nest domain coordinates from distgrid instead of from grid". Where exactly it queries coordinates from distgrid? I see it gets the coordinates from the grid. distgrid and grid are two completely different things.

@DusanJovic-NOAA, I was referring to this code section (lines 1833 to 1842), which queries the distgrid's coordinate indices; these are then used to determine the moving-nest grid center.

Regarding Gerhard's distgrid method of attaching Atmos%lon/lat[_bnd] to fcstGrid vs. Jun's method of attaching Atmos%lon/lat[_bnd] directly to fcstGrid, I am not sure I fully understand their differences, nor why the direct attaching method does not work properly for the moving-nest output grid while the distgrid method works fine. But @theurich probably knows the reason. Thanks!

@junwang-noaa (Collaborator)

junwang-noaa commented Apr 5, 2022

@BinLiu-NOAA I do not see how this has anything to do with the write grid comp. The write grid comp only receives whatever forecast grids the forecast grid comp defines and uses the indices from the coordinates; it does not define the grid. From what I see, there is an inconsistency in the ESMF grid corner coordinate definitions:

  1. In module_fcst_grid_comp.F90 in the current develop branch, we have:
 ! define "corner" coordinate values
          call ESMF_GridGetCoord(fcstGrid(n), coordDim=1, staggerLoc=ESMF_STAGGERLOC_CORNER, &
                                 totalLBound=tlb, totalUBound=tub, &
                                 farrayPtr=glonPtr, rc=rc); ESMF_ERR_ABORT(rc)
          glonPtr(tlb(1):tub(1),tlb(2):tub(2)) = Atmos%lon_bnd(tlb(1):tub(1),tlb(2):tub(2))

the totalLBound and totalUBound shows: 1:32 and 1:12 for C96

  2. In module_fcst_grid_comp.F90 in your HAFS branch, we have:
array = ESMF_ArrayCreate(distgrid, farray=Atmos%lon_bnd, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
      if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return

      call ESMF_GridSetCoord(grid, coordDim=1, staggerLoc=ESMF_STAGGERLOC_CORNER, array=array, rc=rc)
      if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) return

The Atmos%lon_bnd has dimensions: 1:33 and 1:13.

So if we use 2) and we call the ESMF subroutine to get the coordinates with:

call ESMF_GridGetCoord(fcstGrid(1),localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &
farrayPtr=corner, rc=rc)

it gives dimensions 1:33 and 1:13, while with 1) we still get dimensions 1:32 and 1:12. With 2), if we create a new field with the domain dimensions (1:32, 1:12), error code 509 shows up in ESMF_FieldCreate, as seen in MAPL and in the test code in module_fcst_grid_comp.F90. We have no problem creating new fields with 1).
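The size mismatch behind the rc=509 failure can be checked with a back-of-the-envelope Python sketch. This models only the arithmetic, not ESMF itself; the C96 interior-PE shapes (32x12 corner region, 33x13 Atmos%lon_bnd) come from the comments above:

```python
# Sketch (not ESMF): why ESMF_FieldCreate with totalLWidth=[1,1] and
# totalUWidth=[1,1] fails for method 2) but not method 1). The supplied
# array must cover the exclusive corner region plus the halo widths, and
# MAPL sizes that array as ptr(0:imc+1, 0:jmc+1) from the corner
# coordinate pointer's shape (imc, jmc).

def required_field_shape(exclusive_corner, lwidth=1, uwidth=1):
    """Shape the field creation expects for the given total widths."""
    return tuple(n + lwidth + uwidth for n in exclusive_corner)

def supplied_field_shape(coord_ptr_shape, lwidth=1, uwidth=1):
    """Shape actually allocated from the corner coordinate pointer."""
    return tuple(n + lwidth + uwidth for n in coord_ptr_shape)

exclusive_corner = (32, 12)   # interior PE, C96 on a 3x8 layout

develop_coord_ptr = (32, 12)  # method 1): coord pointer matches the distgrid
hafs_coord_ptr = (33, 13)     # method 2): coord pointer sized like Atmos%lon_bnd

print(supplied_field_shape(develop_coord_ptr) == required_field_shape(exclusive_corner))  # True
print(supplied_field_shape(hafs_coord_ptr) == required_field_shape(exclusive_corner))     # False -> rc=509
```

In other words, a one-point-everywhere oversize in the coordinate pointer propagates into a field allocation that no longer matches the grid's corner distgrid on interior PEs.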

@DusanJovic-NOAA (Collaborator)

> @BinLiu-NOAA I'm not sure I understand what you are proposing. Distgrid does not hold coordinates (lat/lon values). Grid holds coordinates. I also do not see where this is done, "it queries the moving nest domain coordinates from distgrid instead of from grid". Where exactly it queries coordinates from distgrid? I see it gets the coordinates from the grid. distgrid and grid are two completely different things.
>
> @DusanJovic-NOAA, I was referring to this code section (lines 1833 to 1842), which tries to query the distgrid's coordinate indices, which will then be used to determine the moving nest grid center.
>
> Regarding Gerhard's distgrid method to attach Atmos%lon/lat[_bnd] to fcstGrid vs Jun's method to attach Atmos%lon/lat[_bnd] directly to fcstGrid, I am not sure I fully understand their differences. And not sure why the direct attaching method does not work properly for moving-nesting output grid, but the distgrid method works fine. But, anyway, @theurich probably knows the reason. Thanks!

Sorry, I still do not understand your point about the code in the write grid component and the distgrid's coordinates. A distgrid has nothing to do with the coordinates, i.e., the latitude and longitude values. That block of code looks at which task (PE) holds the center index and then looks up that index in the coordinate arrays, and those arrays are extracted from the grid, not the distgrid.

As for Jun's method of attaching Atmos%lon/lat[_bnd] directly to fcstGrid: it does not attach anything to fcstGrid; it just copies the values from Atmos%lon, and that is done only once, during initialization. That is why it does not work for the moving nest.

@junwang-noaa (Collaborator)

Yes. We need the ESMF group (@theurich or @danrosen25) to help us understand how Atmos%lat_bnd and Atmos%lon_bnd are mapped to the ESMF internal grid corner coordinates. Also, since the interface was working with GOCART/MAPL (I assume other GOCART host models have the same ESMF grid coordinates as the current fv3atm develop branch), I would suggest checking whether method 2) can be updated to match the ESMF grid coordinate dimensions.

@BinLiu-NOAA (Contributor, Author)

> Sorry, I still do not understand your point about the code in the write grid component and distgrid's coordinate. distgrid has nothing to do with the coordinates i.e, latitude and longitude values. That block of code is looking at which task (PE) is holding the center index and then looks at that index in coordinate arrays, and those arrays are extracted from the grid, not distgrid.
>
> Jun's method to attach Atmos%lon/lat[bnd] directly to fcstGrid, this method is not attaching anything to fcstGrid, it just copies the values from Atmos%lon, and that's done only once during the initialization, that's why it does not work for the moving nest.

@DusanJovic-NOAA, never mind. I was just trying to understand the difference(s) between the distgrid method and Jun's method of attaching Atmos%lon/lat[_bnd]. My understanding is that with the distgrid method, the array is created on the distgrid with a pointer (farray) pointing to Atmos%lon (e.g., lines 256 to 259), so the moving-nest lon/lat are automatically updated, passed along, and accessible on the wrtgridcomp side; Jun's direct attaching method might not work properly in this case. But this is just my current understanding/guess, which might turn out to be totally wrong. Thanks!

@DusanJovic-NOAA (Collaborator)

Here is what I think the issue is. The shape of Atmos%lon and Atmos%lat is (32,12), while the shape of Atmos%lon_bnd and Atmos%lat_bnd is (33,13), on all PEs. So on each PE the corner arrays in the Atmos type have one more element in each direction than the center arrays. This does not correspond to the distgrid for corner locations; for example, these are the sizes (computed as maxIndexPDe - minIndexPDe + 1) for the first 24 tasks (3x8 layout of the first tile):

center distgrid:
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12

corner distgrid:
32,13 32,13 33,13
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12

Only the PEs on the ending edge of the tile have the extra element. This, I think, is why creating an ESMF_Field at corner locations fails: the local array (ptr in the example above) is allocated based on the size of a farrayPtr extracted from the array added to the grid (the same size on every PE), which does not match the grid's distgrid sizes.
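The per-PE size tables above can be reproduced with a small Python sketch. This is a toy decomposition model (not ESMF code) under the assumptions stated in the thread: a 96x96 tile on a 3x8 layout, with the corner distgrid gaining one point only on the ending edge of the tile:

```python
# Toy model of the per-PE distgrid sizes for one cubed-sphere tile.
# Center locations: each PE gets an even chunk of the tile.
# Corner locations: one extra point only on the last PE in each direction,
# while the Atmos corner arrays are (33, 13) on *every* PE -- the mismatch
# described above.

def decompose(n, parts):
    """Split n points into `parts` contiguous chunks (n divisible by parts here)."""
    return [n // parts] * parts

def distgrid_sizes(nx, ny, layout_x, layout_y):
    xs = decompose(nx, layout_x)
    ys = decompose(ny, layout_y)
    center = [[(xs[i], ys[j]) for i in range(layout_x)] for j in range(layout_y)]
    corner = [[(xs[i] + (1 if i == layout_x - 1 else 0),
                ys[j] + (1 if j == layout_y - 1 else 0))
               for i in range(layout_x)] for j in range(layout_y)]
    return center, corner

center, corner = distgrid_sizes(96, 96, 3, 8)
print(center[0])   # [(32, 12), (32, 12), (32, 12)]
print(corner[0])   # row not on the ending y-edge: [(32, 12), (32, 12), (33, 12)]
print(corner[-1])  # row on the ending y-edge: [(32, 13), (32, 13), (33, 13)]
```

Only 1 of the 24 PEs ends up with the full (33, 13) corner region, which is why an array sized (33, 13) everywhere cannot back the corner distgrid.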

@junwang-noaa (Collaborator)

Yes, exactly. As you can see, ptr gets its dimensions from:

call ESMF_GridGetCoord(fcstGrid(1),localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &
farrayPtr=corner, rc=rc)

so the ESMF call above appears to give dimensions different from the corner distgrid dimensions.

@BinLiu-NOAA (Contributor, Author)

@junwang-noaa and @DusanJovic-NOAA,
My understanding might be wrong here, but I thought the ESMF_STAGGERLOC_CORNER stagger should actually have one extra row/column only on the last PE row/column (if defined that way). For example, what Dusan listed above looks reasonable to me.
center distgrid:
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
32,12 32,12 32,12
corner distgrid:
32,13 32,13 33,13
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12
32,12 32,12 33,12

In this case, I was thinking one should probably query the dimensions of the ESMF_STAGGERLOC_CENTER array and then allocate the corner array from them, which can in turn be used to create the ESMF_STAGGERLOC_CORNER field. For example, something like the code below, based on your original version. Do you think this might work without the rc=509 error?

!jwtest:
if (n==1 .and. top_parent_is_global) then
call ESMF_GridGetCoord(fcstGrid(1),localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CENTER, &
farrayPtr=center, rc=rc)
print *,'in fcst grid comp, getcoord, rc=',rc
imc=size(center,1)
jmc=size(center,2)
allocate(ptr(0:imc+1,0:jmc+1),source=0.0d0,stat=rc)
print *,'in fcst grid comp, imc=',imc,jmc
testfield = ESMF_FieldCreate(fcstGrid(1),ptr,staggerLoc=ESMF_STAGGERLOC_CORNER,totalLWidth=[1,1],totalUWidth=[1,1],rc=rc)
print *,'in fcst grid comp, testfield, rc=',rc
endif

Another, maybe naive, question: is it possible to create the corner-stagger field directly from fcstGrid (without first allocating the array) and let ESMF_FieldCreate allocate the array automatically/properly based on the staggerLoc=ESMF_STAGGERLOC_CORNER information?

Thanks!

@DusanJovic-NOAA (Collaborator)

DusanJovic-NOAA commented Apr 6, 2022

@junwang-noaa, please take a look at the changes in this branch, https://github.com/DusanJovic-NOAA/fv3atm/commits/feature/hafs_nest_cpl, specifically commit DusanJovic-NOAA/fv3atm@a2010f8.

@BinLiu-NOAA, can you take the above branch and run the full test suite? Thanks.

@junwang-noaa (Collaborator)

@DusanJovic-NOAA Yes, the code changes look good to me.

@BinLiu-NOAA (Contributor, Author)

> @BinLiu-NOAA can you take the above branch and run full test. Thanks.

Thanks, @DusanJovic-NOAA! Sure, I am starting to run the full regression tests with your branch now.

@BinLiu-NOAA (Contributor, Author)

BinLiu-NOAA commented Apr 6, 2022

> @BinLiu-NOAA can you take the above branch and run full test. Thanks.
>
> Thanks, @DusanJovic-NOAA! Sure, I am starting running the full regression tests with your branch now.

Just a quick follow-up: the full regression tests finished on Hera with hera.intel. Here is the RT run dir:
/scratch1/NCEPDEV/stmp2/Bin.Liu/FV3_RT/rt_25258/
Again, it does not change baseline results except for the related HAFS RTs, and the moving-nest output seems to work fine as well. Thanks!

@junwang-noaa (Collaborator)

@BinLiu-NOAA Where is your baseline? Can you check the results from the moving-nest output?

@BinLiu-NOAA (Contributor, Author)

BinLiu-NOAA commented Apr 6, 2022

@junwang-noaa, I was using the existing develop-branch baseline, BL_20220401, to confirm that this PR does not change the baseline results for any RTs other than the HAFS-related ones. For the HAFS moving-nest output, I manually/visually checked the output netCDF files, and they look reasonable, with the output grid lat/lon updated properly (following the moving nest domain). Thanks!

@junwang-noaa (Collaborator)

@BinLiu-NOAA Thanks for confirming that the HAFS tests work properly; let's get your PR back into the commit queue.

@BinLiu-NOAA (Contributor, Author)

Thanks, @junwang-noaa! @DusanJovic-NOAA, if it works for you, I will merge your https://github.com/DusanJovic-NOAA/fv3atm/tree/feature/hafs_nest_cpl branch back into the https://github.com/hafs-community/fv3atm/tree/feature/hafs_nest_cpl branch, and then sync with the latest develop branch to bring PR #1104 up to date. Thanks!

@DusanJovic-NOAA (Collaborator)

Merged in c2e6b22
