Expand write grid component to output moving nested domain properly #1053
Comments
Tiny issue in the description: there is a typo in ESMF that should be corrected. I don't think I am able to edit the description.
@BinLiu-NOAA - how can I access the updated grid coordinates for moving nests from the module_fcst_grid_comp.F90 level? I tried:
@BinLiu-NOAA - I think I figured it out. Under
@theurich, Good catch! Fixed!
@theurich, Great that you figured it out, and thanks for the updates as well! Feel free to let us know once you have a version of ESMF and a code branch for us to test. Have a nice weekend!
@BinLiu-NOAA - I just pushed the initial moving nest output changes to https://github.com/hafs-community/fv3atm/tree/feature/hafs_nest_cpl_moving. Running this currently requires use of an ESMF installation under
With the code changes, there is an issue with GOCART: all the GOCART tests failed. Here is some of the conversation with the ESMF group.

From Jun:

105: pe=00105 FAIL at line=02371 Base_Base_implementation.F90 <status=509>

Line 2371 in Base_Base_implementation.F90 is this: https://github.com/GEOS-ESM/MAPL/blob/v2.11.0/base/Base/Base_Base_implementation.F90#L2371 The related code is:

Would the ESMF bundle definition require changes in the MAPL call? Thanks

From Gerhard/Rocky:
From Jun:
@danrosen25 @theurich I am able to get the same error code on the fv3atm side when I make the same calls after the field creation. The following changes were added inside subroutine SetServicesNest after line 256: !jwtest: The line in bold also gave ESMF error code 509, the same error code shown in the MAPL Base_Base_implementation.F90.
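For context, here is a hypothetical reconstruction of the kind of test described above (the actual jwtest code is not shown in the thread); the subroutine and variable names are illustrative only:

```fortran
! Illustrative sketch, not the original jwtest code: query the grid's own
! corner-stagger coordinate pointer, then try to create a corner-stagger
! field from a user-shaped array such as Atmos%lon_bnd.
subroutine test_corner_field(fcstGrid, lon_bnd)
  use ESMF
  type(ESMF_Grid),            intent(in) :: fcstGrid
  real(ESMF_KIND_R8), target, intent(in) :: lon_bnd(:,:)   ! e.g. shape (33,13)

  type(ESMF_Field)            :: testField
  real(ESMF_KIND_R8), pointer :: cornerPtr(:,:)
  integer                     :: rc

  call ESMF_GridGetCoord(fcstGrid, coordDim=1, localDE=0, &
       staggerloc=ESMF_STAGGERLOC_CORNER, farrayPtr=cornerPtr, rc=rc)
  print *, 'grid corner coord bounds:', lbound(cornerPtr), ubound(cornerPtr)

  ! If the shape of lon_bnd does not match the grid's corner distgrid on this
  ! DE, this create fails with a non-success rc (the rc=509 seen in MAPL).
  testField = ESMF_FieldCreate(fcstGrid, farray=lon_bnd, &
       indexflag=ESMF_INDEX_DELOCAL, staggerloc=ESMF_STAGGERLOC_CORNER, rc=rc)
  print *, 'corner ESMF_FieldCreate rc =', rc
end subroutine test_corner_field
```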
With the current ufs-weather-model develop branch, the same code is added after the global forecast grid: https://github.com/NOAA-EMC/fv3atm/blob/develop/module_fcst_grid_comp.F90#L709

!jwtest: 0: in fcst grid comp, getcoord, rc= 0

From the HAFS branch:

It looks to me like the forecast grid in the HAFS branch has different corner dimensions from the grid defined in the current fv3atm develop branch.
@junwang-noaa @theurich
@danrosen25 We are using the latest ESMF 830bs09. The same version works with the current develop branch, including the GOCART tests. The differences are in the fv3 forecast grid: in the HAFS branch the grid is defined directly with the dycore arrays using ESMF_Array, while the develop branch gets the dimensions from the grid and sets up the coords with the values of the dycore arrays using those dimensions. I am updating the code to use the approach the develop branch uses and will report back if it fixes the issue.
@danrosen25 @theurich After changing the forecast grid definition to use the ESMF coord dimensions, the fv3-gocart test is working. I will run the HAFS tests to see if there is any impact. I am curious what the reason is that we set up the ESMF array from the dycore array instead of getting the coord dimensions from the ESMF grid.
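For reference, a minimal sketch (variable names assumed) of the develop-branch pattern described above, where the coordinate storage is owned by the grid and the dycore values are copied in:

```fortran
! Sketch of the develop-branch style: let the grid allocate its coordinate
! arrays, then copy the dycore values into them.  Note this is a one-time
! value copy, so later changes to lon/lat (e.g. a moving nest) are not seen
! by the grid unless the copy is repeated.
subroutine set_center_coords(fcstGrid, lon, lat)
  use ESMF
  type(ESMF_Grid),    intent(inout) :: fcstGrid
  real(ESMF_KIND_R8), intent(in)    :: lon(:,:), lat(:,:)   ! e.g. Atmos%lon/lat

  real(ESMF_KIND_R8), pointer :: glon(:,:), glat(:,:)
  integer :: rc

  call ESMF_GridAddCoord(fcstGrid, staggerloc=ESMF_STAGGERLOC_CENTER, rc=rc)
  call ESMF_GridGetCoord(fcstGrid, coordDim=1, localDE=0, &
       staggerloc=ESMF_STAGGERLOC_CENTER, farrayPtr=glon, rc=rc)
  call ESMF_GridGetCoord(fcstGrid, coordDim=2, localDE=0, &
       staggerloc=ESMF_STAGGERLOC_CENTER, farrayPtr=glat, rc=rc)
  glon = lon   ! value copy into grid-owned storage
  glat = lat
end subroutine set_center_coords
```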
@BinLiu-NOAA I ran all the coupled tests, one global atm-only test, one of the regional tests, and all the HAFS tests; these tests finished successfully. Would you please add the changes above to module_fcst_grid_comp.F90 in your hafs branch and rerun the full RT to see if there is any issue. You may also run some HAFS moving nest cases to confirm it works as before. Thanks
@junwang-noaa, this is great news! And thanks for looking into this and providing a fix/workaround for this! If you would like to, you can commit your fix back to the feature/hafs_nest_cpl branch, for which you should already have the correct permission. I will then sync it up and conduct a full regression test on Hera and make sure all the RTs can go through. Will let you know how it goes later today or early tomorrow. Thanks!
@junwang-noaa, I have committed your fix into the feature/hafs_nest_cpl branch for PR #1104 and also synced it with the latest develop branch. After that, I conducted the standard regression tests with hera.intel, and all the RTs ran through. Besides, it confirmed that this PR only changes the baseline results for the related HAFS RTs and does not change results for other RTs.
@junwang-noaa, I checked the moving-nest output netcdf files a little more carefully. It looks like your above fix can run through; however, the output grid lat/lon are not updated. This is because in your fix the distgrid lon/lat arrays are not created from / pointed to the Atmos%lon/lat structure. For example,
So, it looks like we need to do the same thing for the distgrid with lat/lon and lat/lon_bnd. Thanks!
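A hypothetical sketch (names assumed) of the distgrid-based pattern being referred to, where the ESMF arrays wrap the model's lon/lat storage by reference so that coordinate updates made when the nest moves remain visible:

```fortran
! Sketch only: create ESMF arrays on the grid's distgrid that reference (not
! copy) the model's coordinate storage, and set them as the grid coordinates.
! Because the arrays point at lon/lat, updates after a nest move are seen by
! anyone querying the grid's coordinates.
subroutine set_center_coords_by_reference(fcstGrid, lon, lat)
  use ESMF
  type(ESMF_Grid),            intent(inout) :: fcstGrid
  real(ESMF_KIND_R8), target, intent(in)    :: lon(:,:), lat(:,:)  ! e.g. Atmos%lon/lat

  type(ESMF_DistGrid) :: distgrid
  type(ESMF_Array)    :: lonArr, latArr
  integer             :: rc

  call ESMF_GridGet(fcstGrid, distgrid=distgrid, rc=rc)
  lonArr = ESMF_ArrayCreate(distgrid, lon, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
  latArr = ESMF_ArrayCreate(distgrid, lat, indexflag=ESMF_INDEX_DELOCAL, rc=rc)
  call ESMF_GridSetCoord(fcstGrid, coordDim=1, &
       staggerloc=ESMF_STAGGERLOC_CENTER, array=lonArr, rc=rc)
  call ESMF_GridSetCoord(fcstGrid, coordDim=2, &
       staggerloc=ESMF_STAGGERLOC_CENTER, array=latArr, rc=rc)
end subroutine set_center_coords_by_reference
```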
@BinLiu-NOAA Thanks for the testing. I think another way is to move the code updates inside the global if statement and move the original code to the nest grid if statement, since the global domain coordinates will not change.
Thanks, @junwang-noaa! This sounds better, and I am testing this as you suggested.
@junwang-noaa, the above method (moving your code change into the global if-condition section while using the distgrid method in the else (nested/regional) section) worked properly for the moving-nest output grid, with the output grid lat/lon updated. I am conducting the full regression tests now to make sure and confirm it will still work for all other RTs (including the GOCART coupling ones). Will keep you posted on how it goes. Thanks! In case you would like to take a look at the latest related code changes, please see starting from:
I do not like the fact that we use two different methods of setting the grid coordinates for different domains (global vs. non-global). It's inconsistent and confusing and complicates the code. Are the grid coordinate array bounds incorrect only for the global domain when distgrid is used?
@DusanJovic-NOAA I completely agree. I am puzzled how the HAFS code, which defines an ESMF array with different coordinate dimensions, can map to the internal ESMF grid coordinates. It also looks to me that the previous way (the code I updated) only sets up the ESMF grid coordinates once, while when using an ESMF array to set up the coordinates, the coordinates change whenever the internal array (atmos%lon(lat)_bnd) changes. Maybe a better way is to create two new arrays, lon_bndesmf/lat_bndesmf, in atmos with dimensions matching the ESMF corner grid coordinates and update them from lon_bnd/lat_bnd at each time step. @theurich @danrosen25 FYI.
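As a rough illustration of that suggestion, in a simplified form that writes into the grid-owned corner coordinates directly rather than adding new lon_bndesmf/lat_bndesmf arrays to atmos (all names assumed):

```fortran
! Sketch of keeping the ESMF corner coordinates in sync with the dycore
! corner arrays: the per-DE shape comes from the grid's own corner-stagger
! allocation, which may be smaller than lon_bnd/lat_bnd on interior DEs.
subroutine refresh_corner_coords(fcstGrid, lon_bnd, lat_bnd)
  use ESMF
  type(ESMF_Grid),    intent(inout) :: fcstGrid
  real(ESMF_KIND_R8), intent(in)    :: lon_bnd(:,:), lat_bnd(:,:)

  real(ESMF_KIND_R8), pointer :: clon(:,:), clat(:,:)
  integer :: rc

  call ESMF_GridGetCoord(fcstGrid, coordDim=1, localDE=0, &
       staggerloc=ESMF_STAGGERLOC_CORNER, farrayPtr=clon, rc=rc)
  call ESMF_GridGetCoord(fcstGrid, coordDim=2, localDE=0, &
       staggerloc=ESMF_STAGGERLOC_CORNER, farrayPtr=clat, rc=rc)

  ! Copy the leading section that this DE actually owns.
  clon = lon_bnd(1:size(clon,1), 1:size(clon,2))
  clat = lat_bnd(1:size(clat,1), 1:size(clat,2))
end subroutine refresh_corner_coords
```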
Just a quick follow-up: the full RTs passed on wcoss_dell, and it confirmed that this PR only changes results for HAFS-related RTs and does not change other RTs' baseline results.
@junwang-noaa and @DusanJovic-NOAA, it looks to me that the reason it did not work properly for the moving-nest output is that inside the wrtgridcomp it queries the moving nest domain coordinates from the distgrid instead of from the grid, for example the code sections after this line:

So, with this, I think if we attach/update both the grid's and the distgrid's lon/lat[_bnd] for both global and non-global grids from the fcstgridcomp side, it should work for both GOCART coupling and moving-nest outputting. Maybe another alternative is to query the grid's (instead of the distgrid's) lon/lat coordinates from the wrtgridcomp side, but I am not sure if that will work properly for the moving-nest output grid. Thanks!
@BinLiu-NOAA I'm not sure I understand what you are proposing. Distgrid does not hold coordinates (lat/lon values). Grid holds coordinates.
One of the tests that failed (cpld_debug_p8) is C96 running on a 3x8 layout, which means each task will have 32x12 grid points, i.e. 32x12 center coordinates and 33x13 corner coordinates. This is consistent with the results from "the HAFS branch"; see this comment (#1053 (comment)).
@DusanJovic-NOAA, I was referring to this code section (lines 1833 to 1842), which tries to query the distgrid's coordinate indices, which are then used to determine the moving nest grid center. Regarding Gerhard's distgrid method of attaching Atmos%lon/lat[_bnd] to fcstGrid vs. Jun's method of attaching Atmos%lon/lat[_bnd] directly to fcstGrid, I am not sure I fully understand their differences, and I am not sure why the direct attaching method does not work properly for the moving-nest output grid while the distgrid method works fine. But @theurich probably knows the reason. Thanks!
@BinLiu-NOAA I do not see anything to do with the write grid comp. The write grid comp only gets whatever forecast grid the forecast grid comp defines and uses the index from the coordinates; it does not define the grid. From what I see, there is an inconsistency in the ESMF grid corner coordinate definitions:

the totalLBound and totalUBound show 1:32 and 1:12 for C96, while Atmos%lon_bnd has dimensions 1:33 and 1:13. So if we use 2) and we call the ESMF subroutine to get the coordinates:

call ESMF_GridGetCoord(fcstGrid(1),localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &

it gives dimensions 1:33 and 1:13, while if we use 1) we still get dimensions 1:32 and 1:12. With 2), if we create a new field with the domain dimension, which is (1:32,1:12), error code 509 shows up in ESMF_FieldCreate, as seen in MAPL and in the test code in module_fcst_grid_comp.F90. We have no problem creating new fields with 1).
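For clarity, the corner-stagger query quoted above, with the continuation line completed and an assumed pointer name, has this form:

```fortran
! Completed form of the quoted call (pointer name assumed).  As described
! above, with grid definition 2) the returned bounds are 1:33 x 1:13, while
! with definition 1) they are 1:32 x 1:12 for this C96, 3x8-layout case.
real(ESMF_KIND_R8), pointer :: cornerLon(:,:)
integer :: rc

call ESMF_GridGetCoord(fcstGrid(1), localDE=0, coordDim=1, &
     staggerloc=ESMF_STAGGERLOC_CORNER, farrayPtr=cornerLon, rc=rc)
print *, 'corner coord bounds:', lbound(cornerLon), ubound(cornerLon)
```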
Sorry, I still do not understand your point about the code in the write grid component and the distgrid's coordinates. Distgrid has nothing to do with the coordinates, i.e., the latitude and longitude values. That block of code is looking at which task (PE) is holding the center index and then looks at that index in the coordinate arrays, and those arrays are extracted from the grid, not the distgrid. As for Jun's method of attaching Atmos%lon/lat[_bnd] directly to fcstGrid: this method is not attaching anything to fcstGrid, it just copies the values from Atmos%lon, and that is done only once during the initialization; that is why it does not work for the moving nest.
Yes. We need the ESMF group, @theurich or @danrosen25, to help us understand how atmos%lat_bnd and atmos%lon_bnd are mapped to the ESMF internal grid corner coordinates. Also, since the interface was working with GOCART/MAPL (I assume other GOCART host models have the same ESMF grid coordinates as we have in the current fv3atm develop branch), I would suggest checking whether method 2) can be updated to match the ESMF grid coordinate dimensions.
@DusanJovic-NOAA, never mind. I was just trying to understand the difference(s) between the distgrid method vs. Jun's method of attaching Atmos%lon/lat[_bnd]. My understanding is that with the distgrid method the array is created on the distgrid with a pointer (farray) pointing to Atmos%lon (e.g., lines 256 to 259), thus the moving nest lon/lat will be automatically updated, passed to, and accessible on the wrtgridcomp side, while Jun's direct attaching method might not work properly in this case. But anyway, this is just my current understanding/guess, which might turn out to be totally wrong. Thanks!
Here's what I think the issue is. The shape of Atmos%lon and Atmos%lat is (32,12), while the shape of Atmos%lon_bnd and Atmos%lat_bnd is (33,13) on all PEs. So, on each PE the corner arrays in the Atmos type have one more element in each direction compared to the center arrays. This does not correspond to the distgrid for corner locations. For example, this is the size (computed as maxIndexPDe - minIndexPDe + 1) for the first 24 tasks (3x8 layout of the first tile):

center distgrid:

corner distgrid:

Only the PEs on the ending edge of the tile have one extra element. This is why, I think, creating an ESMF_Field at corner locations fails: the local array (ptr in the above example) is allocated based on the size of a farrayPtr extracted from the array added to the grid (all the same size), which is not the same as the grid's distgrid sizes.
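For reference, a sketch (assumed names) of how those per-DE sizes can be printed, using the same maxIndexPDe - minIndexPDe + 1 calculation; it assumes the staggerloc form of ESMF_GridGet to reach the corner distgrid:

```fortran
! Sketch: query the corner-stagger distgrid and print each DE's extents,
! computed as maxIndexPDe - minIndexPDe + 1 as in the comment above.
subroutine print_corner_de_sizes(fcstGrid)
  use ESMF
  type(ESMF_Grid), intent(in) :: fcstGrid

  type(ESMF_DistGrid)  :: cornerDG
  integer              :: dimCount, deCount, de, rc
  integer, allocatable :: minPDe(:,:), maxPDe(:,:)

  call ESMF_GridGet(fcstGrid, staggerloc=ESMF_STAGGERLOC_CORNER, &
       distgrid=cornerDG, rc=rc)
  call ESMF_DistGridGet(cornerDG, dimCount=dimCount, deCount=deCount, rc=rc)
  allocate(minPDe(dimCount,deCount), maxPDe(dimCount,deCount))
  call ESMF_DistGridGet(cornerDG, minIndexPDe=minPDe, maxIndexPDe=maxPDe, rc=rc)
  do de = 1, deCount
    print *, 'DE', de-1, 'corner size:', maxPDe(:,de) - minPDe(:,de) + 1
  end do
end subroutine print_corner_de_sizes
```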
Yes, exactly. As you see, the ptr gets its dimensions from:

call ESMF_GridGetCoord(fcstGrid(1),localDE=0,coordDim=1,staggerloc=ESMF_STAGGERLOC_CORNER, &

so the ESMF call above seems to give different dimensions from the corner distgrid dimensions.
@junwang-noaa and @DusanJovic-NOAA, in this case I was thinking that one should probably query the ESMF_STAGGERLOC_CENTER array's dimensions, then add 1 when allocating the corner array, which can then be used to create the ESMF_STAGGERLOC_CORNER field. For example, something like below, based on your original version. Do you think this might work without the rc=509 error? !jwtest: Another maybe stupid question: is it possible to create the corner stagger field directly from the fcstGrid (without first allocating the array) and let ESMF_FieldCreate allocate the array automatically/properly based on the staggerLoc=ESMF_STAGGERLOC_CORNER information? Thanks!
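On the second question: ESMF can size and allocate the corner-stagger data itself when the field is created from the grid with a typekind instead of a user array; a minimal sketch (field name assumed) is below. Given the per-DE size analysis above, the "center dimensions + 1" allocation would presumably still mismatch the corner distgrid on interior DEs.

```fortran
! Sketch: let ESMF_FieldCreate allocate the corner-stagger data from the
! grid's own corner distgrid, rather than passing a pre-allocated array.
type(ESMF_Field) :: cornerField
integer          :: rc

cornerField = ESMF_FieldCreate(fcstGrid(1), typekind=ESMF_TYPEKIND_R8, &
     staggerloc=ESMF_STAGGERLOC_CORNER, name='lon_bnd_corner', rc=rc)
```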
@junwang-noaa please take a look at the changes in this branch https://github.com/DusanJovic-NOAA/fv3atm/commits/feature/hafs_nest_cpl, specifically this commit: DusanJovic-NOAA/fv3atm@a2010f8. @BinLiu-NOAA can you take the above branch and run the full test. Thanks.
@DusanJovic-NOAA Yes, the code changes look good to me.
Thanks, @DusanJovic-NOAA! Sure, I am starting to run the full regression tests with your branch now.
Just a quick follow-up: the full regression tests finished on Hera with hera.intel. Here is the RT run dir:
@BinLiu-NOAA Where is your baseline? Can you check the results from the moving nest output?
@junwang-noaa, I was using the existing develop branch baseline BL_20220401, just to confirm that this PR does not change the baseline results for all other RTs (except for the HAFS-related regression tests). For the HAFS moving-nest output, I manually/visually checked the output netcdf files, and they look reasonable, with the output grid lat/lon updated properly (following the moving nest domain). Thanks!
@BinLiu-NOAA Thanks for confirming the HAFS tests work properly; let's get your PR back to the commit queue.
Thanks, @junwang-noaa! @DusanJovic-NOAA, if it works for you, I will merge your https://github.com/DusanJovic-NOAA/fv3atm/tree/feature/hafs_nest_cpl branch back into the https://github.com/hafs-community/fv3atm/tree/feature/hafs_nest_cpl branch, and then go ahead and sync with the latest develop branches to get PR #1104 up to date. Thanks!
Merged in c2e6b22 |
Description
With the effort to resolve issue #555 through PR #1044, the write grid component within ufs-weather-model/FV3atm can now properly output both the parent and static nested domains/grids. However, when outputting a moving nest domain/grid, the output fields are at the wrong locations, because the forecast grid component's compute domain moves during the model integration (e.g., following a tropical cyclone).
Solution
Discussions among the ESMF team, the EMC EIB and hurricane project teams, and AOML/HRD collaborators have produced a strategy to expand the current write grid component capability to further support moving nest output. Work is ongoing by the ESMF team to implement this capability, with assistance from the other groups mentioned above.
Alternatives
N/A
Related to
Directly reference any issues or PRs in this or other repositories that this is related to, and describe how they are related.