2020 MARBL Dev team meetings

Michael Levy edited this page Sep 7, 2020 · 56 revisions

September 8, 2020

General discussion

  1. Hi-res run
    • Keith and I decided to put run output in /glade/campaign/cesm/development/bgcwg/projects/hi-res_JRA
    • Still need to reach out to Gary S about best way to move / reshape / compress data
    • 004 is through August 0005, but we're back to waiting in the queue
    • Update on analysis tools? I'm a little behind
  2. pop-tools
    • Lots of discussion on zulip re: budget
    • I could reach out to Riley and Anna-Lena, but haven't yet

MOM Software Updates

  1. I opened PRs for both NCAR/MOM6 and ESCOMP/MOM_interface
    • Former is to show Andrew S where things stand, latter is to keep the NCAR MOM6 devs in the loop
    • Neither is ready to be merged yet
  2. Driver progress:
    • Almost done with the first pass of loading surface flux forcing fields
    • Still need to add saved state to restart files
    • Second pass clean-up (I think I'll do this after getting the call to interior_tendency_compute() in place)
      • call surface_flux_compute() for multiple columns instead of one at a time
      • back-up options for forcing fields (some can come from namelist or coupler, others from coupler or file; hard-coding in primary option first)
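The back-up options for forcing fields can be pictured as an ordered list of sources per field, tried in turn. The sketch below is only an illustration in Python, not the Fortran driver: the helper names (`resolve_forcing`, `from_mapping`), the choice of which field uses which sources, and all numeric values are assumptions made up for the example.

```python
from typing import Callable, Dict, List, Optional

# A "source" is anything that can be asked for a field by name and
# returns None when it does not provide that field (hypothetical interface).
Source = Callable[[str], Optional[float]]


def from_mapping(values: Dict[str, float]) -> Source:
    """Build a source backed by a plain dict (stand-in for namelist / coupler / file)."""
    return values.get


def resolve_forcing(name: str, sources: List[Source]) -> float:
    """Return the value from the first source in the list that provides `name`."""
    for source in sources:
        value = source(name)
        if value is not None:
            return value
    raise KeyError(f"no source provides forcing field {name!r}")


# Illustrative stand-ins; field assignments and values are assumptions
namelist = from_mapping({"xco2": 284.7})
coupler = from_mapping({"xco2": 400.0, "u10_sqr": 2.5e5})
forcing_file = from_mapping({"u10_sqr": 1.0e5})

# Primary option first, back-up option second
priority = {
    "xco2": [namelist, coupler],         # namelist or coupler
    "u10_sqr": [coupler, forcing_file],  # coupler or file
}

resolved = {name: resolve_forcing(name, srcs) for name, srcs in priority.items()}
print(resolved)
```

Hard-coding the primary option first corresponds to giving each field a one-element list; adding a back-up later only means appending a second source.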

August 25, 2020

General discussion

  1. Hi-res runs

    1. Progress report

      • Increased node count, getting 3 months in a little over 8 hours of wallclock (should I push my luck and go for 4 months / 12 hours?)
      • 003 is through June 0002, 004 is through May 0002
    2. Permanent location for output?

      • Each run is using 11 TB of scratch space; ~5 TB for history (including CICE) and the rest are restarts

        • POP history files reach 200 TB total
        • CICE history files will be another 28 TB
        • January 1st restarts are 350 GB, 1st of other months are 429 GB (due to POP annual stream; once we add 5-day output that'll affect some months as well)
      • I only have 20 TB free on scratch

      • Does /glade/campaign/cesm make sense for it?

          Space                                      Used       Quota    % Full      # Files
        --------------------------------------- ----------- ----------- --------- -----------
        /glade/campaign/collections/cmip/CMIP6   3016.69 TB  4096.00 TB   73.65 %     5871031
        /glade/campaign/cesm                     4300.58 TB  5120.00 TB   84.00 %     8123020
        /glade/campaign/cgd/oce                   444.54 TB   550.00 TB   80.82 %     1141773
    3. Diagnostics

      • I submitted a PR to improve testing: encourages users to setup pre-commit to run black; adds Github Actions for black and pytest
      • Keith is working on a PR to add more plots: he's pointed out that the notebooks are getting extremely large, maybe I should get papermill running to break up the notebooks?
  2. pop-tools

    • Kevin P asked me to give an update on this repo in next week's Xdev meeting (this came up while I was on the MOM call, so I'm not totally sure what he's expecting the update to look like :)
    • There's an Xdev mini-hack session to tackle low-hanging fruit tomorrow afternoon; I was going to try to fix #45 (get_grid() does not return something that can be written to netCDF)
    • After helping Frank get his new fill tool merged and then updating the tests, I'm starting to find my way around the code... hoping to keep that momentum going by trying to tackle the occasional issue ticket or PR
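For the storage question under "Permanent location for output?" above, a quick way to compare the quoted history totals against the free space in each campaign area is sketched below. The used / quota numbers are copied from the quota table; treating 200 TB + 28 TB as the full requirement is an assumption, since restart files are not included.

```python
# (used TB, quota TB) copied from the quota table in the notes
spaces = {
    "/glade/campaign/collections/cmip/CMIP6": (3016.69, 4096.00),
    "/glade/campaign/cesm": (4300.58, 5120.00),
    "/glade/campaign/cgd/oce": (444.54, 550.00),
}

# History totals quoted in the notes: POP + CICE (restarts NOT included)
needed_tb = 200 + 28

# Free space per area, and whether the history output would fit
free_tb = {path: round(quota - used, 2) for path, (used, quota) in spaces.items()}
fits = {path: free >= needed_tb for path, free in free_tb.items()}

for path in spaces:
    print(f"{path:42s} free: {free_tb[path]:8.2f} TB  fits {needed_tb} TB: {fits[path]}")
```

By this estimate the history files would fit under /glade/campaign/cesm but not under /glade/campaign/cgd/oce.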

MOM Software Updates

  1. I've emailed Andrew S to try to set up a meeting later this week or next week to answer some questions
    • I'd like to finish up the surface_flux_compute() call, which still needs
      • Read some forcing fields from files
      • Apply computed surface fluxes in MOM
    • Once those questions are answered, I think interior_tendency_compute() will get implemented much faster
    • I also need to update the call to surface_flux_compute() so it's done once per task rather than column by column (don't need Andrew's help for this)
  2. Should I also use my branch to set up CESM-MOM to build / run cobalt? I was thinking this could be useful for the FEISTY work

August 11, 2020

General discussion

  1. Hi-res runs
    1. Slow progress
      • 1 mo per 7 wallclock hours is tedious (003 and 004 just finished Nov 0001), should we increase PE count?
      • Long queue waits are terrible; spending days in the queue to get 7 hours on the machine
    2. Diagnostics
      • Python package for development
      • I'm doing lots of infrastructure, Keith has started making plots
      • Current issue: binning ocn.log output by model day

No Meeting

  • July 28, 2020: Matt out of town

July 14, 2020

POP Software Updates

  1. High-res run
    • Have one month with two different output sets (one with a 5-day stream, one with most of those fields in monthly stream instead)
      • Can I do anything to help analyze this output? It's available on the CGD machine in /project/oce02/mlevy/high-res_BGC_1mo/
    • Looks like 0.68 SYPD including output, which is 124 simulated days per twelve hours [max cheyenne walltime]
      • rather than push the limits with a 4-month run, I'm thinking 3-month runs with 10 hour walltime?
      • 3-month runs means 264 job submissions to get through 66 years, which is 2x 5-year with different initial conditions then continuing one of them for the last 56 years
      • Any possibility of getting an extension on the computer allocation? Even with a dedicated chunk of the machine there's not enough time to finish before September 30 (75-ish days remaining once Cheyenne maintenance period ends)
  2. Release update
    • Kristen's tuning updates are on master
    • For the high-res compset, I need to
      1. Move inputdata to correct location (currently in a tmp/ directory)
      2. Run aux_pop and aux_pop_MARBL
      3. Question: the 1-month test is using settings_latest+cocco.yaml; do we need a 1-month run with settings_latest.yaml before creating the compset?
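The throughput numbers in the high-res run item above can be reproduced with a few lines of arithmetic. This is a sketch that assumes a 365-day (no-leap) model year and an average 3-month segment of a quarter year; the variable names are just for illustration.

```python
sypd = 0.68              # simulated years per wallclock day, including output
days_per_year = 365      # assumption: no-leap calendar
max_walltime_hours = 12  # max cheyenne walltime quoted in the notes

# Simulated days that fit in one maximum-length job
days_per_max_job = sypd * days_per_year * max_walltime_hours / 24

# 2 x 5-year runs with different initial conditions, then 56 more years of one
total_years = 2 * 5 + 56
months_per_job = 3
submissions = total_years * 12 // months_per_job

# Wallclock hours needed for an average 3-month segment (a quarter of a year)
hours_per_segment = 0.25 / sypd * 24

print(f"simulated days per 12 h job: {days_per_max_job:.1f}")
print(f"job submissions for {total_years} years: {submissions}")
print(f"wallclock hours per 3-month segment: {hours_per_segment:.1f}")
```

The segment estimate comes out a little under 9 hours, which is why a 10 hour walltime leaves some margin for 3-month runs.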

General discussion

  1. Xdev update: we're trying to highlight issues in the backlog queue that should be easy fixes; two issues from pop-tools appear to fit the mold. Would it be useful if xdev tackled these in a hackathon next week:
    • #45: get_grid returns a file that cannot be written to netCDF (would use Keith's proposal from the most recent comment)
    • #49: non-default tol value not propagated through fill call tree
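Issue #49 describes a common pattern: a keyword argument accepted at the top of a call tree but not forwarded to the routines underneath. The sketch below illustrates that pattern generically; the function names and the default value are hypothetical and are not taken from pop-tools.

```python
DEFAULT_TOL = 1.0e-4  # hypothetical default, for illustration only


def _fill_inner(values, tol=DEFAULT_TOL):
    """Innermost routine; returns the tolerance it actually used."""
    return tol


def fill_buggy(values, tol=DEFAULT_TOL):
    # Bug pattern: `tol` is accepted here but never passed down,
    # so the inner routine silently falls back to its own default.
    return _fill_inner(values)


def fill_fixed(values, tol=DEFAULT_TOL):
    # Fix: forward the keyword explicitly at every level of the call tree.
    return _fill_inner(values, tol=tol)


used_buggy = fill_buggy([1.0, 2.0], tol=1.0e-8)
used_fixed = fill_fixed([1.0, 2.0], tol=1.0e-8)
print(used_buggy, used_fixed)
```

A regression test that calls the top-level function with a non-default value and checks what the innermost routine received would catch this kind of bug.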

MARBL Software Updates

  1. marbl0.39.0 contains latest tunings (Kristen's 005 run)
  2. Nothing else in the pipeline for the CESM 2.2.0 release

MOM Software Updates

  1. I don't think I have much progress to report (with glade down I can't log in to see where things stand, but I've been focused on the CESM 2.2.0 freeze)

June 30, 2020

POP Software Updates

  1. Upcoming CESM 2.2 freeze: need to figure out order of POP tags
    1. New tunings
    2. JRA / BGC high-res run
    3. Qing's entrainment update
  2. Added complication: the entrainment update may be more than just round-off level changes
  3. My preferred path forward
    1. Qing's entrainment update but keep default scheme in place (langmuir_opt = 'vr12-ma')
    2. New tunings (need to verify that above PR doesn't require re-tuning: another cycle? Shorter run?)
    3. JRA / BGC high-res run
      • Hoping to run a couple of 1-month simulations (one using new compset out of the box, other configured for our experiment)
      • Do we need to figure out langmuir_opt first, or is that only going to affect the 1 degree?
    4. If langmuir_opt = 'lf17' is back on the table, it should be re-tested after the tuning update
      • Listed last only because I'm uncertain if it's necessary; can be done ahead of high-res compset (that may actually be preferable)
  4. CESM 2.1.4 release still needs a few things from POP
    1. Update namelist defaults to use cdf5 files rather than netcdf-4 files
    2. Update dt_count for SSP extension compsets
    3. I'm hoping to avoid thinking about these until after my MOM6 webinar talk in August (the first is actually ready to be merged, but the second may need a little more testing); I think the 2.1.4 code freeze won't happen until after the 2.2.0 release, but I'm not 100% certain about that.
    4. new ndep datasets for CAM (i.e., non-WACCM) SSP extension compsets (KL in charge of this)

MARBL Software Updates

  1. POP PR for updated tunings is waiting on corresponding MARBL PR
  2. Will also need to update the stable branch

MOM Software Updates

  1. Can run a full month with reasonable surface forcings (includes using T and S from the model physics) with correct surface values of tracers
  2. Starting to put together talk for MOM6 webinar, but mostly focused on CESM 2.2 release

General discussion


No Meeting

  • June 16, 2020: CESM Workshop

June 2, 2020

POP Software Updates

  1. Status for CESM 2.2 release

    • Plans call for several new tags

      1. WW3 entrainment update (Alper's responsibility)
      2. iron flux forcing bug (see below)
      3. new tunings for BGC (Kristen is waiting on iron flux forcing bug fix)
      4. new compset for high-res w/ BGC (see below; will include new tunings from Kristen but I have other aspects to attend to as well)
      5. bug in selecting dt_count default (POP issue #28)
    • Open PR: update iron flux forcing (tied to PR in marbl-forcing)

    • high-res compset

      • Current definition:

        <compset>
          <!-- latest JRA forcing, ecosys, high-res -->
          <alias>GIAFECO_JRA_HR</alias>
          <lname>2000_DATM%JRA-1p4-2018_SLND_CICE%CICE4_POP2%ECO_DROF%JRA-1p4-2018_SGLC_SWAV</lname>
        </compset>
      • The existing eco + interannual forcing compset is

        <compset>
          <!-- latest JRA forcing -->
          <alias>G1850ECOIAF_JRA</alias>
          <lname>1850_DATM%JRA-1p4-2018_SLND_CICE_POP2%ECO_DROF%JRA-1p4-2018_SGLC_WW3</lname>
        </compset>
        
      • I think that means we really want our compset to be

        <compset>
          <!-- latest JRA forcing, ecosys, high-res -->
          <alias>G1850ECOIAF_JRA_HR</alias>
          <lname>1850_DATM%JRA-1p4-2018_SLND_CICE%CICE4_POP2%ECO_DROF%JRA-1p4-2018_SGLC_SWAV</lname>
        </compset>
  2. JRA_HR w/ MARBL

    • using three autotrophs (no coccolithophores yet), seeing 0.77 SYPD (260k pe-hrs / simulated_year) on largest task count Alper provided:

      <decomp nproc="7507" res="tx0.1v3" >
        <maxblocks >1</maxblocks>
        <bsize_x   >25</bsize_x>
        <bsize_y   >32</bsize_y>
        <decomptype>spacecurve</decomptype>
      </decomp>
    • Number above is a single day run with no output (I also ran for two days to verify initialization isn't included).

    • Just got text file from him outlining other task counts to try

    • Do we have a target SYPD? CICE is running at 1.7 SYPD (23 nodes); would need to increase task count there as well to get any faster
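The relationship between the two throughput numbers quoted above (0.77 SYPD and 260k pe-hrs per simulated year) can be checked as follows. This is a back-of-the-envelope sketch: it assumes pe-hrs means processing elements times wallclock hours, and the variable names are illustrative.

```python
sypd = 0.77                      # simulated years per wallclock day
pe_hours_per_sim_year = 260_000  # cost quoted in the notes
pop_tasks = 7507                 # nproc from the decomposition above

# Wallclock hours needed to simulate one year at this throughput
wallclock_hours_per_sim_year = 24 / sypd

# Total PE count implied by the quoted cost (assumes cost = PEs x wallclock hours)
implied_pes = pe_hours_per_sim_year / wallclock_hours_per_sim_year

# PEs left over for everything other than POP (CICE, coupler, data models, ...)
non_pop_pes = implied_pes - pop_tasks

print(f"wallclock hours per simulated year: {wallclock_hours_per_sim_year:.1f}")
print(f"implied total PEs: {implied_pes:.0f}")
print(f"implied non-POP PEs: {non_pop_pes:.0f}")
```

Under these assumptions the quoted cost implies roughly 8,300 PEs in total, of which POP accounts for the large majority, so any faster target SYPD would mostly be paid for in POP task count.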

MOM Software Updates

  1. Still struggling with two issues in surface flux forcing

    1. With surface tracer values set to 0, I tried to set forcing fields to the following (unlisted forcings set to 0):

      u10_sqr = 2.5e5
      atm_press = 1
      xco2 = 284.7
      xco2_alt_co2 = 284.7
      sss = 35

      But the run crashes during day 5; setting sss = 0 instead lets me run for a full month.

    2. Setting surface tracer values to "true" values (CS%tr(i,j,1,m)) causes run to crash in day 7 (assuming sss = 0)

General discussion

MARBL Software Updates


No Meeting

  • May 19, 2020: Matt unavailable

May 5, 2020

General discussion

  1. JRA high-res
    • Trying to track progress in real-time on Zulip
    • CICE and CIME pull requests handle a few minor issues in those components
    • Run is throwing errors in POP
      1. NaN in tracer tendencies was traced to bad copy of DZT with partial bottom cells
      2. something in PFTs? NaN in diazChl but really small values in all PFT fields... (see #50) -- few possibilities for fixing this to discuss:
        • Keith's suggestion (if so, can we remove PAR_threshold?)
        • Move PAR_threshold to settings file, and make it resolution-dependent?

MARBL Software Updates

POP Software Updates

MOM Software Updates

  1. Cleaned up configuration of MARBL per call with GFDL folks

  2. Pulled MARBL out of submodules

    • This caused Travis CI failures (building without access to MARBL), so I added the _USE_MARBL_TRACERS cpp
    • Currently, CESM interface always builds with -D_USE_MARBL_TRACERS; will put in logic to be smarter about that (and to only build MARBL itself) towards end of project
  3. Have registered diagnostics with the model, though I may need to be smarter about pointing out 2D variables?

    "ocean_model", "ECOSYS_IFRAC"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: xh:mean yh:mean zl:mean area:mean
    "ocean_model", "ECOSYS_IFRAC_xyave"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: zl:mean
    "ocean_model_z", "ECOSYS_IFRAC"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: xh:mean yh:mean z_l:mean area:mean
    "ocean_model_z", "ECOSYS_IFRAC_xyave"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: z_l:mean
    "ocean_model_rho2", "ECOSYS_IFRAC"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: xh:mean yh:mean rho2_l:mean area:mean
    "ocean_model_rho2", "ECOSYS_IFRAC_xyave"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: rho2_l:mean
    "ocean_model_d2", "ECOSYS_IFRAC"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: xh:mean yh:mean zl:mean area:mean
    "ocean_model_d2", "ECOSYS_IFRAC_xyave"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: zl:mean
    "ocean_model_z_d2", "ECOSYS_IFRAC"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: xh:mean yh:mean z_l:mean area:mean
    "ocean_model_z_d2", "ECOSYS_IFRAC_xyave"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: z_l:mean
    "ocean_model_rho2_d2", "ECOSYS_IFRAC"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction
        ! cell_methods: xh:mean yh:mean rho2_l:mean area:mean
    "ocean_model_rho2_d2", "ECOSYS_IFRAC_xyave"  [Unused]
        ! long_name: Ice Fraction for ecosys fluxes
        ! units: fraction

April 21, 2020

General discussion

  1. JRA high-res
    • Division of labor for forcing / initial condition files? I'm happy to take some pre-existing scripts for generating x1 files and modify them for use w/ 0.1 degree, but don't want to duplicate labor if others are already on it
  2. More glade corruption?
    • Mike Mills was running into file-system troubles that seemed reminiscent of errors I saw last month when testing merge of single-column MARBL branch in CESM
    • CISL has re-opened my original ticket (if that link doesn't work, perhaps this one will)

MARBL Software Updates

POP Software Updates

MOM Software Updates

  1. Tracking progress via github project board
    • To-do: break down these big tasks into many smaller tasks. E.g., instead of "add MARBL output to history file", create issues for
      1. running MARBL python script to generate diag list
      2. adding MARBL list to diag_table
      3. modifying fortran in driver to accumulate desired diagnostics correctly
  2. Emailed Alistair and Andrew about bringing in MARBL as git submodule rather than using generic tracer

April 7, 2020

General discussion

  1. Hi-res + BGC
    • #25 has been merged (brings Keith's CESM 2.1 updates for JRA to POP master) and the tag is in the plans for cesm2_2_alpha04g; this will let us start out of cesm2_2_beta04 (need to add _HR compset, put together emissions dataset, etc)
    • Keith asked to discuss output for the project

MARBL Software Updates

  1. #338 has been merged.

    • Yay!

    • Three issues that were waiting for single-column test:

      1. #53: migrate k loop (mostly done, three or four more function calls)
      2. #176: loop to kmt instead of k
      3. #336: clean up stand-alone timer results

      I could see spending some time on #53 and / or #176, though my feeling is that it's a low priority. #336 is a wishlist item, not something that needs attention right now.

POP Software Updates

  1. Not BGC related: I was asked to help create a new compset for extending SSP runs

    • Not too much work to put together PR #27
    • Testing uncovered issue #28

    I don't want to fall down this rabbit hole, though I'm probably now in a better spot to clean this up than Alper

MOM Software Updates

  1. I did generate initial conditions for tracers on the MOM grid last summer, and updated the slides from previous meeting accordingly
  2. Not much progress to report, but I really want to stop putting out small fires and attack the big fire

No Meetings

  • March 24, 2020
  • March 10, 2020 (No urgent need to meet, other projects taking precedence)
  • February 25, 2020 (CGD Town Hall)

February 11, 2020

General discussion

MARBL Software Updates

  1. #338 is the stand-alone test of the compute() functions, just needs more documentation
    • Read through user guide, make sure it is all up-to-date with examples from the stand-alone driver (I think I finished this section last fall)
    • Link to a page with details of POP's saved state implementation from the general saved state page?
    • unit testing: need more detail on what the tests are doing
    • Write up regression testing page

POP Software Updates

MOM Software Updates

  1. Putting together slides on the process
    • Still waiting on #338 before making more progress

January 28, 2020

General discussion

  1. More talk about CESM2 papers
    • Updated nutrient plots (using different region mask for zonal means)
    • xpersist: caching data in /glade/p/cgd/oce/projects/cesm2-marbl
  2. Geocat hack-a-thon

MARBL Software Updates

POP Software Updates

MOM Software Updates


January 14, 2020

General discussion

  1. Progress on tables / plots for CESM2 papers
    • cesm2-marbl
      1. Using xpersist to store time series of global averages in flux table (will also be used for time series plots)
      2. Nutrient plots need to use the pop-tools region mask
    • Keith update?
  2. Working with Precious to get him using intake-esm for LENS study
  3. Been talking to Matt about how to turn the cesm2-marbl repository into a more general analysis package (or packages)

MARBL Software Updates

POP Software Updates

MOM Software Updates