Create new surface datasets, CTSM5.2 branch #1903

Closed
8 tasks done
wwieder opened this issue Nov 15, 2022 · 13 comments · Fixed by #2372
Labels
size: large Large project that will take a few weeks or more

Comments

@wwieder
Contributor

wwieder commented Nov 15, 2022

Includes surface dataset projects #1873 and #1868, as well as significant changes to mksurfdata_esmf (e.g. #1791), gross unrepresented land use transitions #309, dynamic urban datasets #1157, and more...

  • Set up a makefile to generate all the datasets listed in this spreadsheet
  • Create datasets and copy them to /glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0
  • update namelist_defaults_ctsm.xml
  • bld/unit_testers/build-namelist_test.pl
  • Confirm that ctsm_sci covers all resolutions
  • Run aux_clm and ctsm_sci test-suites pointing to the new datasets IN PROGRESS
  • Generate the NEON fsurdat files (see Create all of the NEON surface dataset for CTSM5.2 #2319)

The next step waits for merge of #2318. Details in this card on the project board.

  • rimport all new datasets
@wwieder wwieder changed the title from "CTSM5.2 surface datasets #1868" to "Create new surface datasets, CTSM5.2 branch" Nov 15, 2022
@billsacks billsacks added the size: large Large project that will take a few weeks or more label Nov 15, 2022
@slevis-lmwg
Contributor

@ekluzek you asked about the space that the current list of files takes. I typed
du -h in /glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0 and got
727G

@wwieder
Contributor Author

wwieder commented May 31, 2023

Along with this, @slevis-lmwg or @olyson, were you going to prepare a list of all the grids, resolutions, and time periods we're supporting (along with their approximate file sizes)? I need some kind of list to bring to the co-chairs for a discussion on file creation and storage.

Second question: I thought part of the point of the new mksurfdata_esmf was that we'd be able to do mapping 'on the fly' or at runtime, especially for high-resolution or variable-resolution grids. Is this still a feature of the system that we should highlight?

@slevis-lmwg
Contributor

@wwieder @ekluzek @olyson

  1. I met with @ekluzek yesterday. We updated the list of grids/periods in this file, keeping in mind that this list is still incomplete and does not include file size info:
    /glade/scratch/slevis/CTSM/tools/mksurfdata_esmf/mksurfdata_jobscript_multi_master
    I can update this list into a table with file sizes that @wwieder can take to the co-chairs for discussion.
  2. I have generated the corresponding datasets from this list and moved them here:
    /glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0
  3. Today I met with @olyson and we discussed the logic of HIST versus SSP compsets. Our understanding @ekluzek is that currently SSP compsets by default start in 2015 rather than 1850. So Keith was suggesting that we generate one 1850-2000 timeseries rather than one per SSP. Is this default changing @ekluzek to allow for continuous simulations from 1850-2100? If so, then we do generate 1850-2100 for all the SSPs, PLUS @olyson suggested adding a separate 1850-2000 timeseries not associated with any SSP to avoid confusion.

@slevis-lmwg
Contributor

slevis-lmwg commented May 31, 2023

On the second question, the answer is no, such a feature does not exist at this time. My understanding is that for files and timeseries that take up huge amounts of disk space, we will not make the files ahead of time. Rather they will be generated on an as needed basis.

@slevis-lmwg
Contributor

slevis-lmwg commented May 31, 2023

Current ctsm5.2 list of surface and landuse datasets with file sizes:
/glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/ls_dash_lhq_star_nc_c230530.asc
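
A listing of this kind can also be produced with a short script. The following is a minimal sketch, not the command actually used to create the .asc file. The function names `human_size`, `list_netcdf_sizes`, and `format_listing` are hypothetical, and the only assumption about the data is that the datasets are `.nc` files in a single directory.

```python
from pathlib import Path


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, e.g. 1536 -> '1.5K'."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        # Stop at the first unit that fits; "T" is the largest unit used here.
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def list_netcdf_sizes(directory: str) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for every *.nc file, largest first."""
    entries = [(p.name, p.stat().st_size) for p in Path(directory).glob("*.nc")]
    return sorted(entries, key=lambda e: (-e[1], e[0]))


def format_listing(entries: list[tuple[str, int]]) -> str:
    """Render one line per file plus a final line with the total size."""
    lines = [f"{human_size(size):>8}  {name}" for name, size in entries]
    total = sum(size for _, size in entries)
    lines.append(f"{human_size(total):>8}  total")
    return "\n".join(lines)
```

For example, `print(format_listing(list_netcdf_sizes(".")))` prints the sizes of the `.nc` files in the current directory followed by their total, which is the kind of table that can be brought to a storage discussion.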

@ekluzek
Collaborator

ekluzek commented May 31, 2023

Let's talk about this in our software meeting tomorrow. It would be good to be on the same page about which datasets to create. Part of the reason to produce only a 1850-2100 landuse timeseries file for most resolutions is that you can use it for historical simulations, for future scenarios, AND for present-day simulations that need data after 2015. One file then serves all three purposes, and you don't need a different file for each.

@wwieder I think the answer for you is "yes": the mapping is done on the fly in mksurfdata_esmf, whereas mksurfdata_map required you to create mapping files as a separate step before running it. Hence, the CTSM5.2 branch removes the mkmapdata script from the tools directory. All that's needed to support a new grid for CTSM is a mesh file for that grid. Does that answer the question you are asking?

@slevis-lmwg
Contributor

> On the second question, the answer is no, such a feature does not exist at this time. My understanding is that for files and timeseries that take up huge amounts of disk space, we will not make the files ahead of time. Rather they will be generated on an as needed basis.

@wwieder my response differs from Erik's because I assumed you meant "at runtime" as in during a CTSM simulation.

@slevis-lmwg
Contributor

Update after this morning's SE meeting (please let me know if you catch errors):

  • Keep 1850-2100 (SSP) time series for 1-deg, 2-deg, and f10 (aka standard resolutions).
  • Add 1850-2015 (hist) time series for standard resolutions.
  • Keep 1850-2100 for SSP5-8.5 for certain additional resolutions.
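
The plan in the list above can be written down as a small enumeration. This is an illustrative sketch only: the function name `planned_timeseries` and the tuple layout are hypothetical, and because the list does not name the individual SSP scenarios or the "additional resolutions", both are left as parameters rather than filled in.

```python
# Standard resolutions as named in the meeting notes.
STANDARD_RESOLUTIONS = ["1-deg", "2-deg", "f10"]


def planned_timeseries(ssp_scenarios, additional_resolutions):
    """Enumerate (resolution, period, scenario) landuse time series to generate."""
    plan = []
    for res in STANDARD_RESOLUTIONS:
        # Historical-only time series for the standard resolutions.
        plan.append((res, "1850-2015", "hist"))
        # One 1850-2100 time series per SSP scenario (assumed interpretation).
        for ssp in ssp_scenarios:
            plan.append((res, "1850-2100", ssp))
    for res in additional_resolutions:
        # Additional resolutions get only the SSP5-8.5 time series.
        plan.append((res, "1850-2100", "SSP5-8.5"))
    return plan
```

Calling `planned_timeseries(["SSP5-8.5"], ["C96"])` (with `C96` as an example value only) yields two entries for each standard resolution and one for the additional resolution, which makes it easy to count files before generating them.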

New ctsm5.2 list of surface and landuse datasets with file sizes:
/glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/ls_dash_lhq_star_nc_c230601.asc

@slevis-lmwg
Contributor

slevis-lmwg commented Jun 1, 2023

@ekluzek

In the cheyenne test-suite, this test fails
ERI_D_Ld9.T31_g37.I2000Clm50Sp.cheyenne_intel.clm-SNICARFRC
because we do not generate fsurdat for T31. Is it ok with you if I change this test to run with f10 or one of our other grids?

Answer is YES to f10: I changed the grid in testlist_clm.xml for this test.

Regarding the 4x5 grid:

  • We have a testmods directory /FatesColdLandUse (category="fates") asking for a 4x5 landuse file. Could we test FATES at 10x15 in this case or is that just not done, in which case we also need a 4x5 landuse file?
  • Also, I hoped to stop generating the 78-pft versions of 4x5, but I see a couple of non-FATES tests using 4x5. Is it ok if I change them to 10x15, or would you rather not?

Answers:

  1. Keep 4x5 16-pft for FATES: I updated the testmods directory. I updated gen_mksurfdata_jobscript_multi.py and mksurfdata_jobscript_multi_master. I'm generating the 4x5 landuse file. I will need to move it to .../inputdata/...
  2. For non-FATES, it's ok to change to f10: I changed the grid in testlist_clm.xml for these tests.

This test fails
SMS_Ln9_P72x2.C96_C96_mg17.IHistClm50BgcCrop.cheyenne_intel.clm-clm50cam6LndTuningMode
because we do not generate a landuse file for C96. We generate C96 fsurdat for 1850 and 2000. Should I be generating the landuse file, too?

Answer is YES, generate landuse timeseries for C96: I updated gen_mksurfdata_jobscript_multi.py and mksurfdata_jobscript_multi_master. I'm generating the C96 landuse file. I will need to move it to .../inputdata/...

These tests fail
SMS_Lm13_PS.f19_g17.I2000Clm51BgcCrop.cheyenne_intel.clm-cropMonthOutput
PFS_Ld10_PS.f19_g17.I2000Clm50BgcCrop.cheyenne_intel
LII_D_Ld3_PS.f19_g17.I2000Clm50BgcCrop.cheyenne_intel.clm-default
because the number of landunits has changed from what is found in the clmi file.
Adding use_init_interp = .true. to the namelist should fix this. Is that acceptable, or do we need to generate a new clmi file?

Answer is YES, use_init_interp = .true. for now, and later we will generate new finidat files for these tests.
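
As a sketch of that interim fix, the setting can be appended to a namelist override file only when it is not already there. The thread says only "the namelist"; writing to a `user_nl_clm`-style file of `name = value` lines is an assumption here, and `ensure_namelist_setting` is a hypothetical helper rather than a CTSM tool.

```python
from pathlib import Path


def ensure_namelist_setting(nl_file: str, name: str, value: str) -> bool:
    """Append 'name = value' to nl_file unless the variable is already set.

    Returns True if a line was added, False if the variable was already present.
    """
    path = Path(nl_file)
    lines = path.read_text().splitlines() if path.exists() else []
    for line in lines:
        # Ignore anything after a Fortran-style '!' comment marker.
        stripped = line.split("!", 1)[0].strip()
        if "=" in stripped and stripped.split("=", 1)[0].strip() == name:
            return False
    lines.append(f"{name} = {value}")
    path.write_text("\n".join(lines) + "\n")
    return True
```

For example, `ensure_namelist_setting("user_nl_clm", "use_init_interp", ".true.")` adds the line once and leaves the file untouched on later calls, which keeps the change easy to revert when new finidat files become available.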

A couple more tests fail, but I will leave them for later.

@slevis-lmwg
Contributor

@ekluzek in today's meeting we discussed C96 for 1850-2015 because we have a test for that period, so I'm generating the landuse file. A follow-up question is whether I should generate C96 files for the SSPs, too, even though we do not have corresponding tests.

@ekluzek
Collaborator

ekluzek commented Jun 21, 2023

@slevis-lmwg I recommend that, for resolutions where we create historical landuse timeseries files, we create the SSP5-8.5 file and let users use that one file for both historical and SSP5-8.5 simulations.

We should do all of the SSP scenarios for only a limited set of resolutions: f10, 1-deg, and 2-deg for sure.

@slevis-lmwg
Contributor

@ekluzek I have now merged #2016, so I think you should be able to proceed with the makefile work. Let me know if you need anything from me.

@slevis-lmwg
Contributor

Last checkbox checked in this issue.

See https://github.com/ESCOMP/CTSM/projects/36 for project status.
See #2372 for ctsm5.2.mksurfdata branch / PR status.

5 participants