-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Soil level clm surface input data for clm5.2.0 have missing values in large domains #2744
Comments
I am happy to provide additional details or more plots. @mvertens, @mvdebolskiy @rosiealice, here is my write-up of the soil/dust data issue. Perhaps you also know whether this needs to flagged for anyone in particular. |
Thanks for raising this issue, @maritsandstad. I wondered if this was an issue with the regridding or the dataset, so I tried plotting with uxarray, which seems to confirm your concern. We'll have to look at this for the CTSM5.3 datasets that @slevis-lmwg and @ekluzek are working on finalizing now. |
visually, it looks like the same issue is present in the 5.3 surface data that @olyson is using for the ne30 spinup |
I think this should be addressed in #2500 before we make the 5.3 tag. It's also likely influencing dust emissions, @dmleung, assuming your using the ne30 grid in your testing? Also pinging @dlawrenncar here too, as this will delay our release of 5.3 |
@wwieder, yeah I also just saw this for the soil variables, but I also saw the same for WATSAT from initial outputs from a testrun. |
@wwieder - can you look at any of the other unstructured surface datasets (e.g. ne30np4 - or any other refined grids) and see if the same problem arises. Is this just for ne30np4.pg3? |
It would also be very helpful to know if soil color shows this issue; has anyone checked that? |
@wwieder notes that it "doesn't look like an issue in older surface datasets created by |
Also worth checking to see if any structured-grid datasets are affected. |
I just spent a while on a call with @mvertens . In addition to the ideas noted above, we agreed that the best next step would be to output the regridded mapunit created as an intermediate step in creating these soil variables. I can make a branch this morning that adds that output. |
Good questions @samsrabin older datasets created with mksrfdata_map are OK @samsrabin a quick look at different structured grids at different resolutions look fine. @mvertens other unstructured grids also look like the one originally noted here, missing much of Africa an other regions around E Asia and W North America all the 5.3 datasets I'm looking at are here: |
@billsacks since this worked with mksurfdata_map which used ESMF mapping, but a different version of ESMF -- could this be an ESMF glitch for specific versions of ESMF? Or maybe there are different options we used for mapping of the soil data in mksurfdata_map? The soil dataset is different so it could be something about mapping from the new soil data grid to the model grid. |
Thank you @wwieder - this is very helpful. @ekluzek - @mvertens walked me through the regridding code for soil texture. The implementation is significantly different in mksurfdata_esmf compared with the old mksurfdata_map in some ways. But there are essentially two parts of what's going on: (1) regridding the mapunit and then (2) taking the regridded mapunit and from it creating and outputting the soil properties on the output grid. We felt that outputting the mapunit will let us narrow down the problem. It's especially interesting that soil color looks fine, because that uses a very similar (though not identical) algorithm to the regridding of mapunit for soil properties. Anyway, I'll move ahead with adding the outputting of mapunit. |
This plan makes sense @billsacks checking that the mapunits are regridding correctly seems like the next step. |
@wwieder - thanks for looking at the other resolutions. It looks like the problem has to do with a cubed sphere grid (both C96 and spectral element resolutions are on a cubed sphere grid). So this analysis is really helpful. |
@mvertens and @billsacks thanks so much for digging into this, we really appreciate it! This will take much less time for the two of you to solve. After you are done it might be good to show more about this to the CTSM SE so we can also get some knowledge on this area. And @maritsandstad thanks for pointing this out as well -- really helpful for the community to let us know about things like this. |
Oh, |
One other thing to think about in terms of implications for ongoing runs is whether or not this is affecting the km-scale simulations that @briandobbins is working on. He is getting 'close' to a production-ish run and it would be good to know if this is impacting those super high-resolution runs. |
@dlawrenncar this is looking like it's an issue with the cubed sphere grid and NOT the mpasa grids (as mpasa120 looks OK from Will's plot above). By extension that likely means that mpasa15 and mpasa3p75 are probably OK. But, still not bad to check. |
I checked an f09 surface dataset before and after the fix. Only 8 grid cells differ in their |
The fix is here: slevis-lmwg#9 |
(I also did a run with the ne30 resolution where I changed the code to set mapunit equal to |
I'll be curious if this also addresses #2502? |
Yes, good point, this is almost certainly the culprit that explains #2502 - because the nature of this bug means that there would likely be different answers with different processor counts / decompositions. |
If this bug is going to affect the MPAS grids as well, can someone work with @briandobbins to create a new ultra high-resolution dataset for the runs that he is working on. There is some urgency on this, as the runs need to be completed quite soon. Not urgent as in this needs to be done before the weekend, but early next week preferably. |
@dlawrenncar yes we will do that right away. When we recreate all the datasets the mpasa15 and mpasa3p75 are done at the same time. And rerunning all the datasets will be done by early next week, it may or may not happen over the weekend. So as long as @briandobbins needs one of those we'll get it for him. |
@wwieder, I was testing dust in the ne30 grid (#2732) using the same surface dataset yes. The new dust needs some tuning given all the model updates, but thus far I didn't see very significant impacts from the soil variables. The important variables that explicitly affect dust are PCT_CLAY and PCT_NAT_PFT etc. From the conversation here it seems PCT_CLAY is off, so I will need to double check on how this issue impacts my tests. Thanks for letting me know about this! |
@dmleung can you have a look at dust emissions from cases that use this surface dataset: @briandobbins there should also be 3 km mpas grid ( |
@wwieder Thanks, that's looking much better to me - here's the clipped new one: |
That looks more like planet Earth! |
@jrbuzan - yes, this issue would be expected to impact the BULK variable, too. |
Just to clarify, we're expecting incorrect BULK on 5.2 and older 5.3 surface datasets created with mksrfdata_esmf- but it should be corrected on the newest 5.3 datasets Sam created (or that @jrbuzan could create with corrections to mksrfdata_esmf that Bill made last week)? |
Yes, that's what I'd expect - thanks for stating that more clearly. |
Brief summary of bug
Input surface data for clm on spectral element grid ne30 (inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_ne30np4.pg3_hist_1850_78pfts_c240216.nc) at least for clm5.2.0 have weird/missing values for parts of the grid.
Here is an example showing organic carbon summed over the soil levels and similar patterns of missing data seem to be present in all the soil level data I've tried plotting including PCT_CLAY, PCT_SAND:
This pattern shows up similarly in output datasets
General bug information
Plotting and regridding input data from surface input data for spectral element grid ne30 for clm5.2.0 (inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_ne30np4.pg3_hist_1850_78pfts_c240216.nc) data for variables on soil levels seem to be looking weird and missing data for what looks like some of the spectral element domains.
Above is an example plot of what organic carbon summed over soil levels looks like, but the pattern of missing data seems to be the same for all the soil level data that I have plotted so far. Looking at the LMR picture framing from https://journals.ametsoc.org/view/journals/mwre/134/12/mwr3360.1.xml I can almost see at least some of the domains that are missing, here is an attempt at colouring in the missing parts:
CTSM version you are using: [output of
git describe
]clm5.2.0 ne30
Does this bug cause significantly incorrect results in the model's science? [Yes ]
Configurations affected: [Fill this in if known.]
Details of bug
See above
Important details of your setup / configuration so we can reproduce the bug
I regridded using the xesmf package following this excellent guide: https://ncar.github.io/esds/posts/2022/cam-se-regridding/
For reference here is my regridding code snippet
I used this file for mapping inputdata/lnd/clm2/mappingdata/maps/ne30pg3/map_ne30pg3_to_0.5x0.5_nomask_aave_da_c180515.nc
The regridding looked ok and completely sane for other variables that didn't have soil level, and the pattern persisted regardless of whether I regridded a variable with the soil level dimension, or if I summed over and thereby eliminated the soil level dimension before regridding. It also looked the same if I regridded the same output variables from the first history files from a run.
Important output or errors that show the problem
See above
The text was updated successfully, but these errors were encountered: