Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

driver: icedrv_restart: Added option for restart files to be read/written in NetCDF format #427

Merged
merged 30 commits into from
Mar 3, 2023

Conversation

davidclemenssewall
Copy link
Contributor

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • [X ] Short (1 sentence) summary of your PR:
    Added option for restart files to be read/written in NetCDF format (except for BGC tracers).
  • [X ] Developer(s):
    David Clemens-Sewall (building off code written by Chris Riedel)
  • [X ] Suggest PR reviewers from list in the column to the right.
  • Tony Craig and Dave Bailey
  • [X ] Please copy the PR test results link or provide a summary of testing completed below.
    I've added the test suite 'netcdf_nobgc_suite' to check that switching the restart files to NetCDF doesn't change any output. This suite contains all of the restart tests from Icepack's base_suite (except for the bgc tracers) with the setting 'ionetcdf' activated. The modified code passes all 64 tests. Also, Icepack base_suite and CICE quick suite both pass with 128/128 and 16/16 respectively.
  • How much do the PR code changes differ from the unmodified code?
    • [X ] bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on CICE or any other models?
    • Yes
    • [X ] No
  • Does this PR add any new test cases?
    • [X ] Yes (the test suite: netcdf_nobgc_suite. Which is using netcdf output on all of the restart tests in base_suite except for the bgc one).
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • [X ] No, does the documentation need to be updated at a later time?
      • Yes
      • [X ] No
  • [X ] Please provide any additional information or relevant details below:
    This PR doesn't implement NetCDF restart files for BGC tracers (which is outside the current scope of my project). It does add an additional namelist variable 'restart_format' to indicate whether the restart files should be read/written in binary format ('bin') or NetCDF ('nc'). The default is 'bin'. However, if the "ionetcdf" setting is set then I chose to have the restart_format namelist variable set to 'nc'. Which seemed consistent with the idea that if the user sets "ionetcdf" they probably want all of the input and output to be netCDF.

The start time for netCDF output was hardcoded to be at 00:00:00
and there was a bug in icedrv_calendar such that idate0 was fixed
at Jan. 1. Both bugs are fixed. NOTE: if dt and istep0 are set
such that the start time contains fractional seconds, they will
be rounded down in the NetCDF output to the nearest integer.
Start of netCDF time axis is fixed at Jan. 01 00:00:00 of
init_year and time value calculation is now fixed.
No functionality implemented yet, just checking it's added correctly.
…ad as NetCDF

Added the option for restart (aka initial condition) files to be in NetCDF
instead of binary for ease of setting custom inital conditions
@apcraig
Copy link
Contributor

apcraig commented Feb 1, 2023

This looks interesting. Several comments.

  • warnstr and icepack_warnings_add should not be in icepack_intfc.F90. Errors in the icepack driver should be written directly to the icepack driver output unit number (normally nu_diag).
  • I am a little concerned that bgc is not included here. If it's not possible to include that now, we should have a plan to update that soon.
  • This seems to include some of the changes to the netcdf time axis. We'll need to be careful not to redo/undo/conflict with driver: icedrv_history: fix start time for NetCDF history output #426. I assume you created this branch from that branch? We'll want to merge driver: icedrv_history: fix start time for NetCDF history output #426 first, then have you update your branch and fix/remove the time axis changes in this PR.
  • In icedrv_restart, the binary and netcdf writing of fields is not done the same way. For binary, subroutines are called like write_restart_age, write_restart_FY, etc. For netcdf, you have inlined the fields that are read/written in those other subroutines. This is dangerous as the "same" implementation is now split into two places and is more likely to diverge. The extra subroutines should be called in both cases and those subroutines need to be modified to support both binary and netcdf in the same way as aicen, vicen, etc.
  • I wonder if we should change history_cdf to history_format? Or maybe we should just have a logical restart_cdf. I think it would be nice if the namelist to control history and restart format be the same.
  • We should think about including the new test suite in io_suite.ts and I think we may want to have a test that has history=netcdf with restart=binary and history=binary with restart=netcdf.

Good to see CPP for USE_NETCDF and appropriate aborts.

@davidclemenssewall
Copy link
Contributor Author

Thanks @apcraig! Comments in line below.

This looks interesting. Several comments.

* warnstr and icepack_warnings_add should not be in icepack_intfc.F90.  Errors in the icepack driver should be written directly to the icepack driver output unit number (normally nu_diag).

Ah gotcha. Would you mind pointing me to a spot in the code where a warning is added like that? I.e., where we just provide a warning not an abort. Alternatively I'm open to this error causing an abort since the user did ask for a restart file that wasn't written (even if the code itself otherwise ran correctly).

* I am a little concerned that bgc is not included here.  If it's not possible to include that now, we should have a plan to update that soon.

I agree that it would be good to include bgc. I'm not sure that I can justify spending time on it to my PI because we don't need bgc for the project that's funding me but perhaps if it's quick to do.

* This seems to include some of the changes to the netcdf time axis.  We'll need to be careful not to redo/undo/conflict with [driver: icedrv_history: fix start time for NetCDF history output #426](https://github.com/CICE-Consortium/Icepack/pull/426).  I assume you created this branch from that branch?  We'll want to merge [driver: icedrv_history: fix start time for NetCDF history output #426](https://github.com/CICE-Consortium/Icepack/pull/426) first, then have you update your branch and fix/remove the time axis changes in this PR.

Correct. This is from the branch in pull request #426. I figured I would send this pull request now so we can start talking about what needs to be fixed in it. Then once #426 is merged into the consortium main I'll merge it into this branch (before taking car

* In icedrv_restart, the binary and netcdf writing of fields is not done the same way.  For binary, subroutines are called like write_restart_age, write_restart_FY, etc.  For netcdf, you have inlined the fields that are read/written in those other subroutines.  This is dangerous as the "same" implementation is now split into two places and is more likely to diverge.  The extra subroutines should be called in both cases and those subroutines need to be modified to support both binary and netcdf in the same way as aicen, vicen, etc.

Your concern makes sense. I'll take a look at how to do this efficiently.

* I wonder if we should change history_cdf to history_format?  Or maybe we should just have a logical restart_cdf.  I think it would be nice if the namelist to control history and restart format be the same.

I agree that it would be nice if they were the same. CICE uses restart_format and history_format so I would vote for changing history_cdf to history_format.

* We should think about including the new test suite in io_suite.ts and I think we may want to have a test that has history=netcdf with restart=binary and history=binary with restart=netcdf.

That sounds good to me. Do we already have tests that validate history output? Or would that test just verify that icepack completes a run with history output indicated?

Good to see CPP for USE_NETCDF and appropriate aborts.

@eclare108213
Copy link
Contributor

I would also vote for using the same namelist flags and options/values for history and restart as in CICE. Less confusing.

@apcraig
Copy link
Contributor

apcraig commented Feb 1, 2023

If you look in icedrv_init.F90, you'll see various examples of writing and/or aborting directly.

         call icedrv_system_abort(string=subname//'ERROR: open file '// &
            trim(nml_filename), &
            file=__FILE__, line=__LINE__)
      if (ncat == 1 .and. kitd == 1) then
         write (nu_diag,*) 'Remapping the ITD is not allowed for ncat=1.'
         write (nu_diag,*) 'Use kitd = 0 (delta function ITD) with kcatbound = 0'
         write (nu_diag,*) 'or for column configurations use kcatbound = -1'
         call icedrv_system_abort(file=__FILE__,line=__LINE__)
      endif
         write (nu_diag,*) 'WARNING: ITD required for ncat > 1'
         write (nu_diag,*) 'WARNING: Setting kitd and kcatbound to default values'

Depending where you are in the code and what the error is, you may want to write to either nu_diag or nu_diag+n-1. The latter is to write to each column's output file. The former is the general output file.

Lets change the namelist to history_format and restart_format, are you able to do that?

We do not have tests that formally validate history output. We just check the model completes fine.

Thanks.

@davidclemenssewall
Copy link
Contributor Author

Thank you for the examples. I've fixed the warnings, merged in the updated consortium: main, changed history_cdf to history_format, and updated the documentation. I'm about halfway through reorganizing icedrv_restart as you suggested. I should be able to finish that sometime tomorrow.

Do you want me to move the new test suite into io_suite.ts and add in the mixed history/restart format options?

@dabail10
Copy link
Contributor

dabail10 commented Feb 3, 2023

The compile is failing here. The variable nchar needs to be declared in the subroutine restartfile.

@dabail10
Copy link
Contributor

dabail10 commented Feb 3, 2023

Actually, it's in several other subroutines. It should probably be a module variable.

@apcraig
Copy link
Contributor

apcraig commented Feb 3, 2023

With regard to testing, this is what I'd do. set_env.ionetcdf can stay as is. Rename set_nml.ionetcdf to set_nml.histcdf and it would be just "history_format = 'nc'". Then add set_nml.restcdf which would be just "restart_format = 'nc'".

The I would remove netcdf_nobgc_suite.ts and then add the following to io_suite.ts,

restart        col     1x1        debug,ionetcdf,histcdf
smoke          col     1x1        run1year,diag1,ionetcdf,histcdf
restart        col     1x1        debug,ionetcdf,restcdf
smoke          col     1x1        run1year,diag1,ionetcdf,restcdf
restart        col     1x1        debug,ionetcdf,histcdf,restcdf
smoke          col     1x1        run1year,diag1,ionetcdf,histcdf,restcdf

I think this is adequate but if you also want to add a couple other cases to test exact restart of other physics (like netcdf_nobgc_suite.ts does), that's fine too. In general, we can't test every option with each other as we'd end up with millions of tests. So in this case, we're testing the default physics with various combinations of binary and netcdf history and restart to cover this feature.

By the way, noticed in the current PR, you still need to add restart_format to ug_case_settings.rst and icepack_index.rst. Didn't do a thorough review yet, just noticed that so I'll mention it. Thanks!

@davidclemenssewall
Copy link
Contributor Author

Sounds good. I have finished reorganizing the code, changing the namelists and tests, etc. This most recent version passes all tests in base_suite, io_suite, the nobgc netcdf restart suite (which I've now removed), and the CICE quick_suite. I think that I've updated the documentation in the correct places, but let me know if I've missed anything there. I'll be out of the office for the next two weeks so if any further fixes are needed I'll get to them after then. Thanks!

@davidclemenssewall
Copy link
Contributor Author

The automated testing is failing with this error:
./icepack.setup: ERROR, /Users/runner/icepack/configuration/scripts/options/set_[nml,env].restcdf# not found

Which is confusing to me because set_nml.restcdf is present (and worked in my ubuntu container).

@apcraig
Copy link
Contributor

apcraig commented Feb 6, 2023

I think the CI is failing because you may need a return character at the end of the io_suite.ts file. It looks like the CI env is adding a # there for reasons I don't understand. But I suspect returning to a new line will fix that.

The other thing I suggest is setting the default value of history_format to "none". Also, please add checks in icedrv_init.F90 to check for valid values for history_format and restart_format. Those valid values should be "['none','cdf'] and ['bin','cdf'] at this point I believe. Finally, the documentation of these two namelist variables in doc/source/user_guide/ug_case_settings.rst should follow the style for namelist with limited options,

   "``history_format``", "``cdf``", "history file output in netcdf format", "``none``"
   "","``none``","no history output",""

   "``restart_format``", "``bin``", "restart file output in binary format", "``bin``"
   "","``cdf``","restart file output in netcdf format",""

I will run some tests now just to check things are working as expected too. Thanks!

  - icedrv_init write format for history_format
  - dims allocation in icedrv_restart
  - cleaned up some intent lines (old)
  - add return at end of io_suite.ts
  - update namelist documentation of history_format, restart_format
Update to current main trunk
@apcraig
Copy link
Contributor

apcraig commented Feb 7, 2023

I created a PR, davidclemenssewall#1, based on testing on cheyenne.

Check history_format and restart_format input values
@apcraig
Copy link
Contributor

apcraig commented Mar 1, 2023

@davidclemenssewall, have you had a chance to review my proposed changes? It would be good to get this PR merged. Thanks.

Fix minor issues found during testing
@davidclemenssewall
Copy link
Contributor Author

@apcraig sorry for my delay. I'm still digging out from stuff that built up during the mosaic conference and the polar climate working group. All of the changes look good and thank you for fixing my errors (I just started learning fortran recently). I also checked that the base_suite and io_suite passed all tests on Ubuntu.

@apcraig
Copy link
Contributor

apcraig commented Mar 1, 2023

@davidclemenssewall, no problem. just trying to keep things moving along. I will run a final test myself then will merge. Thanks for all your efforts, great to have new people contributing.

Copy link
Contributor

@apcraig apcraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the Icepack test suite on cheyenne and all passes as expected. I will merge tomorrow unless someone squeaks.

@apcraig apcraig merged commit 37e215b into CICE-Consortium:main Mar 3, 2023
@davidclemenssewall davidclemenssewall deleted the restart_netcdf branch March 6, 2023 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants