# 2020 MARBL Dev team meetings
- Hi-res run
  - Keith and I decided to put run output in `/glade/campaign/cesm/development/bgcwg/projects/hi-res_JRA`
    - Still need to reach out to Gary S about best way to move / reshape / compress data
  - `004` is through August 0005, but we're back to waiting in the queue
  - Update on analysis tools? I'm a little behind
- `pop-tools`
  - Lots of discussion on Zulip re: budget
    - I could reach out to Riley and Anna-Lena, but haven't yet
- I opened PRs for both NCAR/MOM6 and ESCOMP/MOM_interface
  - Former is to show Andrew S where things stand, latter is to keep the NCAR MOM6 devs in the loop
  - Neither is ready to be merged yet
- Driver progress:
  - Almost done with first pass of loading surface flux forcing fields
  - Still need to add saved state to restart files
  - Second pass clean-up (I think I'll do this after getting the call to `interior_tendency_compute()`)
  - call `surface_flux_compute()` for multiple columns instead of one at a time
  - back-up options for forcing fields (some can come from namelist or coupler, others from coupler or file; hard-coding in primary option first)
- Hi-res runs
  - Progress report
    - Increased node count, getting 3 months in a little over 8 hours of wallclock (should I push my luck and go for 4 months / 12 hours?)
    - `003` is through June 0002, `004` is through May 0002
  - Permanent location for output?
    - Each run is using 11 TB of scratch space; ~5 TB for history (including CICE) and the rest are restarts (a per-year back-of-envelope follows this list)
      - POP history files reach 200 TB total
      - CICE history files will be another 28 TB
      - January 1st restarts are 350 GB, 1st of other months are 429 GB (due to POP annual stream; once we add 5-day output that'll affect some months as well)
    - I only have 20 TB free on scratch
    - Does `/glade/campaign/cesm` make sense for it?

      ```
      Space                                    Used         Quota        % Full     # Files
      ---------------------------------------  -----------  -----------  ---------  -----------
      /glade/campaign/collections/cmip/CMIP6   3016.69 TB   4096.00 TB   73.65 %      5871031
      /glade/campaign/cesm                     4300.58 TB   5120.00 TB   84.00 %      8123020
      /glade/campaign/cgd/oce                   444.54 TB    550.00 TB   80.82 %      1141773
      ```
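For scale, a back-of-envelope conversion of the projected totals above into per-simulated-year rates; the 66-year campaign length is taken from the July 28 notes further down this page:

```python
# Back-of-envelope storage rates implied by the projected totals above,
# assuming the full 66-year campaign (see the July 28 notes below)
years = 66
pop_history_tb = 200    # projected POP history total (TB)
cice_history_tb = 28    # projected CICE history total (TB)

print(f"POP history:  {pop_history_tb / years:.2f} TB / simulated year")
print(f"CICE history: {cice_history_tb / years:.2f} TB / simulated year")
# roughly 3.0 + 0.4 TB of history per simulated year, before restarts
```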
- Diagnostics
  - I submitted a PR to improve testing: encourages users to set up `pre-commit` to run `black`; adds GitHub Actions for `black` and `pytest`
  - Keith is working on a PR to add more plots: he's pointed out that the notebooks are getting extremely large, maybe I should get `papermill` running to break up the notebooks? (see the sketch after this list)
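A minimal sketch of the papermill idea: keep one parameterized template notebook and execute a small copy per variable rather than maintaining one enormous notebook. The template name and `varname` parameter are hypothetical.

```python
import papermill as pm

# Run the hypothetical template once per variable; each executed copy
# becomes its own (much smaller) notebook.
for varname in ["PO4", "NO3", "SiO3"]:
    pm.execute_notebook(
        "plot_template.ipynb",            # template with a "parameters" cell
        f"plot_{varname}.ipynb",          # executed per-variable copy
        parameters={"varname": varname},
    )
```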
- `pop-tools`
  - Kevin P asked me to give an update on this repo in next week's Xdev meeting (this came up while I was on the MOM call, so I'm not totally sure what he's expecting the update to look like :)
  - There's an Xdev mini-hack session to tackle low-hanging fruit tomorrow afternoon; I was going to try to fix #45 (`get_grid()` does not return something that can be written to netCDF; the intended round trip is sketched below)
  - After helping Frank get his new fill tool merged and then updating the tests, I'm starting to find my way around the code... hoping to keep that momentum going by trying to tackle the occasional issue ticket or PR
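For context, the round trip #45 should make work; `POP_gx1v7` is one of the grid names pop-tools ships, and per the issue it's the write step that currently fails:

```python
import pop_tools

grid = pop_tools.get_grid("POP_gx1v7")   # returns an xarray Dataset
grid.to_netcdf("gx1v7_grid.nc")          # per #45 this currently raises;
                                         # the fix should make it succeed
```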
- I've emailed Andrew S to try to set up a meeting later this week or next week to answer some questions
  - I'd like to finish up the `surface_flux_compute()` call, which still needs to
    - Read some forcing fields from files
    - Apply computed surface fluxes in MOM
  - Once those questions are answered, I think `interior_tendency_compute()` will get implemented much faster
  - I also need to update the call to `surface_flux_compute()` so it's done once per task rather than column by column (don't need Andrew's help for this)
- Should I also use my branch to set up CESM-MOM to build / run COBALT? I was thinking this could be useful for the FEISTY work
- Hi-res runs
  - Slow progress
    - 1 month per 7 wallclock hours is tedious (`003` and `004` just finished Nov 0001); should I increase the PE count?
    - Long queue waits are terrible; spending days in the queue to get 7 hours on the machine
- Diagnostics
  - Python package for development
    - I'm doing lots of infrastructure, Keith has started making plots
    - Current issue: binning `ocn.log` output by model day (a parsing sketch follows this list)
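A sketch of the `ocn.log` binning task; the date-stamp regex is an assumption about the log format, not the actual one:

```python
import re
from collections import defaultdict

# Assumed format: diagnostic lines follow a "model date = YYYYMMDD" stamp
date_stamp = re.compile(r"model date =\s*(\d{8})")

by_day = defaultdict(list)
current_day = None
with open("ocn.log") as f:
    for line in f:
        m = date_stamp.search(line)
        if m:
            current_day = m.group(1)
        elif current_day is not None:
            by_day[current_day].append(line.rstrip())
```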
- July 28, 2020: Matt out of town
- High-res run
  - Have one month with two different output sets (one with a 5-day stream, one with most of those fields in monthly stream instead)
    - Can I do anything to help analyze this output? It's available on the CGD machine in `/project/oce02/mlevy/high-res_BGC_1mo/`
  - Looks like 0.68 SYPD including output, which is 124 simulated days per twelve hours [max Cheyenne walltime] (a quick arithmetic check follows this list)
    - rather than push the limits with a 4-month run, I'm thinking 3-month runs with 10 hour walltime?
    - 3-month runs means 264 job submissions to get through 66 years, which is 2x 5-year runs with different initial conditions and then continuing one of them for the last 56 years
  - Any possibility of getting an extension on the computer allocation? Even with a dedicated chunk of the machine there's not enough time to finish before September 30 (75-ish days remaining once the Cheyenne maintenance period ends)
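A quick check that the throughput numbers above are self-consistent:

```python
# Throughput arithmetic from the bullets above
sypd = 0.68                           # simulated years per wallclock day
sim_days_per_wall_day = sypd * 365    # ~248 simulated days per wallclock day
print(sim_days_per_wall_day / 2)      # ~124 simulated days per 12-hour job

# 66 simulated years in 3-month chunks
print(66 * 4)                         # 264 job submissions
```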
- Release update
  - Kristen's tuning updates are on `master`
  - For the high-res compset, I need to
    - Move inputdata to correct location (currently in a `tmp/` directory)
    - Run `aux_pop` and `aux_pop_MARBL`
    - Question: the 1-month test is using `settings_latest+cocco.yaml`; do we need a 1-month run with `settings_latest.yaml` before creating the compset?
- Xdev update: we're trying to highlight issues in the backlog queue that should be easy fixes; two issues from `pop-tools` appear to fit the mold. Would it be useful if Xdev tackled these in a hackathon next week?
  - #45: `get_grid` returns a Dataset that cannot be written to netCDF (would use Keith's proposal from the most recent comment)
  - #49: non-default `tol` value not propagated through fill call tree (the bug pattern is illustrated below)
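For reference, the class of bug behind #49, with hypothetical function names (not pop-tools' actual internals): a keyword accepted at the public entry point must be forwarded through every internal call, or the caller's value is silently replaced by an internal default.

```python
def lateral_fill(da, tol=1.0e-4):
    # The fix: forward tol explicitly. If this call dropped tol=tol,
    # _iterative_fill would silently use its own default instead of
    # the caller's non-default value -- the #49 symptom.
    return _iterative_fill(da, tol=tol)

def _iterative_fill(da, tol=1.0e-4):
    # ... iterate the fill until the largest update falls below tol ...
    return da
```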
- marbl0.39.0 contains latest tunings (Kristen's `005` run)
- Nothing else in the pipeline for the CESM 2.2.0 release
- I don't think I have much progress to report (with `glade` down I can't log in to see where things stand, but I've been focused on the CESM 2.2.0 freeze)
- Upcoming CESM 2.2 freeze: need to figure out order of POP tags
  - Added complication: the entrainment update may be more than just round-off level changes
  - My preferred path forward
    - Qing's entrainment update but keep default scheme in place (`langmuir_opt = 'vr12-ma'`)
    - New tunings (need to verify that above PR doesn't require re-tuning: another cycle? Shorter run?)
    - JRA / BGC high-res run
      - Hoping to run a couple of 1-month simulations (one using new compset out of the box, other configured for our experiment)
      - Do we need to figure out `langmuir_opt` first, or is that only going to affect the 1 degree?
    - If `langmuir_opt = 'lf17'` is back on the table, it should be re-tested after the tuning update
      - Listed last only because I'm uncertain if it's necessary; can be done ahead of high-res compset (that may actually be preferable)
- CESM 2.1.4 release still needs a few things from POP
  - Update namelist defaults to use cdf5 files rather than netcdf-4 files
  - Update `dt_count` for SSP extension compsets
  - I'm hoping to avoid thinking about these until after my MOM6 webinar talk in August (the first is actually ready to be merged, but the second may need a little more testing); I think the 2.1.4 code freeze won't happen until after the 2.2.0 release, but I'm not 100% certain about that.
  - new ndep datasets for CAM (i.e., non-WACCM) SSP extension compsets (KL in charge of this)
- POP PR for updated tunings is waiting on corresponding MARBL PR
  - Will also need to update the stable branch
- Can run a full month with reasonable surface forcings (includes using `T` and `S` from the model physics) with correct surface values of tracers
- Starting to put together talk for MOM6 webinar, but mostly focused on CESM 2.2 release
- June 16, 2020: CESM Workshop
- Status for CESM 2.2 release
  - Plans call for several new tags
    - WW3 entrainment update (Alper's responsibility)
    - iron flux forcing bug (see below)
    - new tunings for BGC (Kristen is waiting on iron flux forcing bug fix)
    - new compset for high-res w/ BGC (see below; will include new tunings from Kristen but I have other aspects to attend to as well)
    - bug in selecting `dt_count` default (POP issue #28)
  - Open PR: update iron flux forcing (tied to PR in marbl-forcing)
  - Current definition:

    ```xml
    <compset> <!-- latest JRA forcing, ecosys, high-res -->
      <alias>GIAFECO_JRA_HR</alias>
      <lname>2000_DATM%JRA-1p4-2018_SLND_CICE%CICE4_POP2%ECO_DROF%JRA-1p4-2018_SGLC_SWAV</lname>
    </compset>
    ```

  - The existing eco + interannual forcing compset is

    ```xml
    <compset> <!-- latest JRA forcing -->
      <alias>G1850ECOIAF_JRA</alias>
      <lname>1850_DATM%JRA-1p4-2018_SLND_CICE_POP2%ECO_DROF%JRA-1p4-2018_SGLC_WW3</lname>
    </compset>
    ```

  - I think that means we really want our compset to be

    ```xml
    <compset> <!-- latest JRA forcing, ecosys, high-res -->
      <alias>G1850ECOIAF_JRA_HR</alias>
      <lname>1850_DATM%JRA-1p4-2018_SLND_CICE%CICE4_POP2%ECO_DROF%JRA-1p4-2018_SGLC_SWAV</lname>
    </compset>
    ```
- `JRA_HR` w/ MARBL
  - using three autotrophs (no coccolithophores yet), seeing 0.77 SYPD (260k pe-hrs / simulated year) on the largest task count Alper provided:

    ```xml
    <decomp nproc="7507" res="tx0.1v3">
      <maxblocks>1</maxblocks>
      <bsize_x>25</bsize_x>
      <bsize_y>32</bsize_y>
      <decomptype>spacecurve</decomptype>
    </decomp>
    ```

  - Number above is from a single-day run with no output (I also ran for two days to verify initialization isn't included).
  - Just got a text file from him outlining other task counts to try
  - Do we have a target SYPD? CICE is running at 1.7 SYPD (23 nodes); would need to increase task count there as well to get any faster (a rough pe-hrs check follows this list)
- Still struggling with two issues in surface flux forcing
  - With surface tracer values set to 0, I tried to set forcing fields to the following (unlisted forcings set to 0):

    ```
    u10_sqr = 2.5e5
    atm_press = 1
    xco2 = 284.7
    xco2_alt_co2 = 284.7
    sss = 35
    ```

    But the run crashes during day 5; setting `sss = 0` instead lets me run for a full month.
  - Setting surface tracer values to "true" values (`CS%tr(i,j,1,m)`) causes the run to crash in day 7 (assuming `sss = 0`)
- May 19, 2020: Matt unavailable
- JRA high-res
  - Trying to track progress in real-time on Zulip
  - CICE and CIME pull requests handle a few minor issues in those components
  - Run is throwing errors in POP
    - `NaN` in tracer tendencies was traced to bad copy of `DZT` with partial bottom cells
    - something in PFTs? `NaN` in `diazChl` but really small values in all PFT fields... (see #50) -- a few possibilities for fixing this to discuss:
      - Keith's suggestion (if so, can we remove `PAR_threshold`?)
      - Move `PAR_threshold` to settings file, and make it resolution-dependent?
- Cleaned up configuration of MARBL per call with GFDL folks
  - Pulled MARBL out of submodules
    - This caused Travis CI failures (building without access to MARBL), so I added the `_USE_MARBL_TRACERS` cpp
    - Currently, CESM interface always builds with `-D_USE_MARBL_TRACERS`; will put in logic to be smarter about that (and to only build MARBL itself) towards end of project
  - Have registered diagnostics with the model, though I may need to be smarter about pointing out 2D variables?

    ```
    "ocean_model", "ECOSYS_IFRAC"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: xh:mean yh:mean zl:mean area:mean
    "ocean_model", "ECOSYS_IFRAC_xyave"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: zl:mean
    "ocean_model_z", "ECOSYS_IFRAC"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: xh:mean yh:mean z_l:mean area:mean
    "ocean_model_z", "ECOSYS_IFRAC_xyave"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: z_l:mean
    "ocean_model_rho2", "ECOSYS_IFRAC"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: xh:mean yh:mean rho2_l:mean area:mean
    "ocean_model_rho2", "ECOSYS_IFRAC_xyave"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: rho2_l:mean
    "ocean_model_d2", "ECOSYS_IFRAC"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: xh:mean yh:mean zl:mean area:mean
    "ocean_model_d2", "ECOSYS_IFRAC_xyave"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: zl:mean
    "ocean_model_z_d2", "ECOSYS_IFRAC"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: xh:mean yh:mean z_l:mean area:mean
    "ocean_model_z_d2", "ECOSYS_IFRAC_xyave"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: z_l:mean
    "ocean_model_rho2_d2", "ECOSYS_IFRAC"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
      ! cell_methods: xh:mean yh:mean rho2_l:mean area:mean
    "ocean_model_rho2_d2", "ECOSYS_IFRAC_xyave"  [Unused]
      ! long_name: Ice Fraction for ecosys fluxes
      ! units: fraction
    ```
- JRA high-res
  - Division of labor for forcing / initial condition files? I'm happy to take some pre-existing scripts for generating x1 files and modify them for use w/ 0.1 degree, but don't want to duplicate labor if others are already on it
- More glade corruption?
  - Mike Mills was running into file-system troubles that seemed reminiscent of errors I saw last month when testing merge of single-column MARBL branch in CESM
  - CISL has re-opened my original ticket (if that link doesn't work, perhaps this one will)
- Tracking progress via GitHub project board
  - To-do: break down these big tasks into many smaller tasks. E.g. instead of "add MARBL output to history file", create issues for
    - running MARBL python script to generate diag list
    - adding MARBL list to diag_table
    - modifying Fortran in driver to accumulate desired diagnostics correctly
- Emailed Alistair and Andrew about bringing in MARBL as git submodule rather than using generic tracer
- Hi-res + BGC
  - #25 has been merged (brings Keith's CESM 2.1 updates for JRA to POP master) and the tag is in the plans for `cesm2_2_alpha04g`; this will let us start out of `cesm2_2_beta04` (need to add `_HR` compset, put together emissions dataset, etc)
  - Keith asked to discuss output for the project
- #338 has been merged.
  - Yay!
  - Three issues that were waiting for single-column test:
    - #53: migrate `k` loop (mostly done, three or four more function calls)
    - #176: loop to `kmt` instead of `k`
    - #336: clean up stand-alone timer results
  - I could see spending some time on #53 and / or #176, though my feeling is that it's a low priority. #336 is a wishlist item, not something that needs attention right now
- Not BGC related: I was asked to help create a new compset for extending SSP runs. I don't want to fall down this rabbit hole, though I'm probably now in a better spot to clean this up than Alper
- I did generate initial conditions for tracers on the MOM grid last summer, and updated the slides from the previous meeting accordingly
- Not much progress to report, but I really want to stop putting out small fires and attack the big fire
- March 24, 2020
- March 10, 2020 (No urgent need to meet, other projects taking precedence)
- February 25, 2020 (CGD Town Hall)
- #338 is the stand-alone test of the `compute()` functions, just needs more documentation
  - Read through user guide, make sure it is all up-to-date with examples from the stand-alone driver (I think I finished this section last fall)
  - Link to a page with details of POP's saved state implementation from the general saved state page?
  - unit testing: need more detail on what the tests are doing
  - Write up regression testing page
- More talk about CESM2 papers
  - Updated nutrient plots (using different region mask for zonal means)
  - `xpersist`: caching data in `/glade/p/cgd/oce/projects/cesm2-marbl` (the caching pattern is sketched after this list)
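A generic sketch of the pattern xpersist automates; this is the idea, not xpersist's actual API: compute once, cache the result to disk, and reopen the cached copy on later runs.

```python
import os
import xarray as xr

CACHE_DIR = "/glade/p/cgd/oce/projects/cesm2-marbl"

def cached(name, compute):
    """Return the Dataset for `name`, computing and caching it on first use."""
    path = os.path.join(CACHE_DIR, f"{name}.nc")
    if not os.path.exists(path):
        compute().to_netcdf(path)   # expensive step runs only once
    return xr.open_dataset(path)
```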
- Geocat hack-a-thon
  - Produce python notebooks to mimic popular examples from http://ncl.ucar.edu
  - Lots of plots here (code in NCAR/GeoCAT-examples)
- Progress on tables / plots for CESM2 papers
  - cesm2-marbl
    - Using `xpersist` to store time series of global averages in flux table (will also be used for time series plots); a sketch of one such global average follows this list
    - Nutrient plots need to use the `pop-tools` region mask
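A hedged sketch of one such global average, area-weighting with `TAREA` from the pop-tools grid; the history file name and variable choice are hypothetical:

```python
import pop_tools
import xarray as xr

grid = pop_tools.get_grid("POP_gx1v7")
ds = xr.open_dataset("case.pop.h.FG_CO2.nc")   # hypothetical history file

# area-weight over ocean points only (KMT > 0), leaving a time series
wgt = grid["TAREA"].where(grid["KMT"] > 0)
fg_co2_global = (ds["FG_CO2"] * wgt).sum(["nlat", "nlon"]) / wgt.sum()
```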
- Keith update?
  - cesm2-marbl
    - Working with Precious to get him using `intake-esm` for LENS study (a minimal example follows this list)
    - Been talking to Matt about how to turn the cesm2-marbl repository into a more general analysis package (or packages)