History Refactor Wish-list #630
Also: all history diagnostics written by FATES should have a "FATES_" namespace prefix on the variable. No output variables should have a "COL" or "PA" to indicate whether they are patch or column, because the refactor will remove patch-level averaging; everything is implicitly a calculation on fates columns. EDIT (9/20): As far as suffixes go, all variables will be implicitly "site" level, and thus for consistency's sake we should either include a "_si" suffix for all variables or for none. If a variable has more than one dimension, such as size-class for instance, we would append si onto the existing suffix, e.g. _sc -> _sisc, or _scpf -> _siscpf. I'm in favor of just dropping the "_si" since it is now implied.
@rgknox per your comment here about the _pa history variables: about half of these are fire-related. I am going through some of the fire history variables right now and will add to that effort the task of moving all of those currently indexed directly by patch to the standard way we handle site-level variables.
Noting here that #640 and #642 have taken care of this particular item.
To do: review the variable list in both E3SM and CTSM tests. In CTSM we have a test called "AllVars" which sets the flag hist_empty_htapes = .false. When this flag is set to false, all variables in the history that are defined with use_default='active' are included. But we have many variables that are set 'inactive' because they have a large second dimension (e.g. everything with _scpf). This list of inactive variables needs to be updated (periodically) in testmods_dirs/clm/FatesAllVars/user_nl_clm.
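For concreteness, that testmod boils down to a namelist along these lines (a sketch only; the hist_fincl1 entries here are placeholders, not the actual inactive-variable list):

```
! sketch of testmods_dirs/clm/FatesAllVars/user_nl_clm:
! keep all default-active variables, then add the inactive
! large-second-dimension variables explicitly (placeholder names)
hist_empty_htapes = .false.
hist_fincl1 = 'FATES_EXAMPLE_SCPF', 'FATES_ANOTHER_EXAMPLE_SCPF'
```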
If we do not fix the FATES namespacing of history variables, we at least have to make sure we are not using the same diagnostic names that CLM/ELM are using. "NEP", for instance, is part of the standard output for ELM/CLM.
Units on several logging history variables are wrong or incorrectly constructed. See here: the cohort-level mortality terms are fractions that died that day, but the diagnostic is given a temporal unit of per year. See here: logging mortality terms (fractions that died that day) are being multiplied by seconds_per_year. Same as the previous issue; the time units need to be corrected.
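To illustrate the size of that discrepancy, a tiny standalone sketch (invented names, not FATES code):

```fortran
! Hypothetical sketch: a daily mortality fraction needs days-per-year,
! not seconds-per-year, to become an annual rate; using seconds inflates
! the value by a factor of 86400.
program mort_units_demo
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)
  real(r8), parameter :: days_per_year    = 365.0_r8
  real(r8), parameter :: seconds_per_year = 365.0_r8 * 86400.0_r8
  real(r8) :: frac_died_today = 1.0e-4_r8   ! fraction of cohort dying today

  print *, 'correct   [yr-1]:', frac_died_today * days_per_year
  print *, 'incorrect [yr-1]:', frac_died_today * seconds_per_year
end program mort_units_demo
```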
@KatieMurenbeeld tagging you to take note of these issues with logging output
Just a note: I wanted to try to put all of our current history variables into a spreadsheet to better assess how to make them more uniform in terms of units and namespacing, and also to see what is there and not there, which ones are on by default, etc. After some emacs-wrangling, I turned the relevant Fortran into this; it should be useful as a reference and starting point for a more systematic overhaul.
Just commenting here to note that, per a conversation at a CTSM SE meeting and a follow-up conversation with @rgknox, I am tackling some of these history variable update requests. Right now I am going through and updating all the output variable names to have 'FATES_' as a prefix, and also renaming them to have suffixes according to their scale (see below). We also discussed changing the flushval to be the hlm_hio_ignore_val everywhere, so @rgknox and I discussed needing to go through where the history variables are actually counted/summed/etc. and making sure that things are zeroed where they need to be. Suffixes:
Also, in going through this I am noticing that it is fairly tedious to update all these calls to set_history_var, and there are a few typos/incorrect units/etc. that I am already encountering. I was thinking it might be easier to have all of these in some sort of easily read and manipulated input file (csv, etc.?) that we would then parse inside the define_history_vars subroutine. We could then just call set_history_var in a loop over the list/array of these input arguments (i.e. variable name, units, longname, etc.) that get read in, e.g.:
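A minimal sketch of that idea, assuming a three-column csv of name, units, and long name; the file name, layout, and parsing here are hypothetical, not existing FATES code:

```fortran
program read_hvar_defs
  implicit none
  character(len=256) :: line, vname, units, longname
  integer :: unit_no, ios, c1, c2

  open(newunit=unit_no, file='fates_history_vars.csv', &
       status='old', action='read')
  do
     read(unit_no, '(A)', iostat=ios) line
     if (ios /= 0) exit                 ! stop at end of file
     c1 = index(line, ',')              ! first comma: name | units
     c2 = index(line(c1+1:), ',') + c1  ! second comma: units | long name
     vname    = adjustl(line(:c1-1))
     units    = adjustl(line(c1+1:c2-1))
     longname = adjustl(line(c2+1:))
     ! the real code would register the variable here, along the lines of
     ! call this%set_history_var(vname=trim(vname), units=trim(units), ...)
     print *, trim(vname), ' [', trim(units), ']: ', trim(longname)
  end do
  close(unit_no)
end program read_hvar_defs
```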
It may work to have some kind of hash map so we can still index the variable locations. Do we think this would be more manageable for future changes and further variable updates?
@adrifoster thanks for doing this. It's an interesting idea, and certainly the current code has a lot of repetition. I guess I see there being some tradeoffs with the proposal of moving this to some sort of csv file:

1. We would no longer have the metadata and data (or, more specifically, the calculation of each variable) together in one file as we do now, which would make it trickier to search within that one file to see what a given variable actually is and how it is calculated.
2. If you were to simultaneously do both the updates that you are making and a switch to some other file, then we'd lose the git continuity and provenance needed to keep track of things like changing units and to see what all the changes are.
3. Possibly the biggest one is related to @rgknox's point 3 in #630 (comment): ideally we want to go more in the direction of how CLM & ELM handle this stuff and try to move the history writes from the current monolithic block of code, all called at the end of the timestep, to a set of more modular history update subroutines that would be called at different times. Pulling out the variable definitions seems like it would be in conflict with that goal.
@adrifoster the idea of having an external file to read in for history variables is something that I've thought about as well and felt would be a good way to go for CTSM. So I totally agree with this approach, and if you adopt it, I hope it'll make it easier to bring into CTSM. I was thinking it should be something like XML, as I think csv doesn't handle enough complexity. You could also keep it in XML, for example, and have build scripts that convert it to a different format as needed by whatever needs to read it in. The XML format would, for example, help with our master history list. But for the Fortran it might be best to have that simple csv file. I might have some notes on this idea in a CTSM issue, so I'll look for that.
@adrifoster I'm trying to envision the code that connects the text string to the actual named integer that is used as the index for each variable. Since the named integers (e.g. ih_npp_leaf_si) live in the module itself, won't it get messy trying to maintain these definitions in two places, in the module and in a csv file? Also, I'm just not sure how you can take a string and use it to choose a named variable (I suppose it can be done, I'm just not sure how). The alternative is to do away with the named integers altogether and use the variable's text name to identify the variable's location in memory; however, that would slow down the model, because we would need to do text searches every time we want to fill a variable.
I'm also thinking that we don't really need the following 3 arguments to all of those set_history_var() calls:
keep?:
Another to consider:
@rgknox Yes, I think we would need to redesign the code a bit. I would suggest using a hash map so that the current index would turn into the key. Though perhaps this would introduce more fragility, because we would need to have the correct key string in the code.
@ckoven that makes sense. I am forging ahead with making these variable name/units updates, and we can decide if we want to just push that first and come back to the potential redesign later, or do it all at once.
@rgknox I don't know of a way to turn the variable name into a string, unless you have some code preprocessing happen, which could be done. We do that in some cases with files named *.F90.in. But I actually don't think that's the way to go here. What I think you do is separate the setting of the variable pointer from the metadata surrounding it (the long name, units, etc.). So in one place you read the metadata from the input file, keyed on the history fieldname; elsewhere you assign the history fieldname to the array that contains the data. I'll illustrate how this would work for CTSM, since I'm more familiar with it. A current history call looks like this:

```fortran
call hist_addfld(fname='GRAINC_TO_FOOD', units='gC/m^2/s', &
     avgflag='A', long_name='grain C to food', &
     ptr_patch=this%grainc_to_food_patch)
```

Changing it would split things up so that the call to assign the pointer would look like this:

```fortran
call hist_addfld(fname='GRAINC_TO_FOOD', ptr_patch=this%grainc_to_food_patch)
```

There would be another line that reads in the metadata file containing the attributes for GRAINC_TO_FOOD. You would also have error checking to make sure all fields that have a hist_addfld call also have metadata associated with them, and vice versa. I don't know that a hash helps here. You do need the list of fieldnames to be unique, and I think it's OK to use character comparisons to go through the list of fields. There are hundreds of fields, not thousands or millions where having a hash would be critically important. @adrifoster does that picture work for the kind of thing you are envisioning?
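A minimal standalone sketch of that character-comparison lookup, with the error check for a field that has no metadata entry (all names here are invented, not CTSM code):

```fortran
program fieldname_lookup_demo
  implicit none
  character(len=32) :: fieldnames(3)
  integer :: idx

  fieldnames = [character(len=32) :: 'GRAINC_TO_FOOD', 'TLAI', 'GPP']
  idx = find_field(fieldnames, 'GRAINC_TO_FOOD')
  if (idx < 0) then
     stop 'ERROR: hist_addfld call has no matching metadata entry'
  end if
  print *, 'metadata entry for GRAINC_TO_FOOD is at index ', idx

contains

  ! plain linear search; idx = -1 flags a missing metadata entry
  integer function find_field(names, fname) result(idx)
    character(len=*), intent(in) :: names(:)
    character(len=*), intent(in) :: fname
    integer :: i
    idx = -1
    do i = 1, size(names)
       if (trim(names(i)) == trim(fname)) then
          idx = i
          return
       end if
    end do
  end function find_field

end program fieldname_lookup_demo
```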
That makes sense @ekluzek. I don't know if I'm a fan of splitting it up though. I like the original method because everything that describes the variable (name, units, and data pointer) is in one place. I think I'm more in favor of us trying to streamline the way we currently do things, like removing unnecessary arguments and clutter.
Back to a point about whether or not to maintain averaging options: I would definitely support maintaining that capability, which it seems like is the way you are going. Additionally, for those that are not aware, in CTSM we have brought in a new capability to get data (and averages) for a particular local solar time (like leaf water potential at 8am); see ESCOMP/CTSM#1374. This may not fit into this particular rework of FATES history output, but it seems like a feature that would also be quite useful for FATES.
This is great @adrifoster, thanks. I've also noticed quite a bit of inconsistency in units, both in how they are labeled (the same units have different character strings for different variables) and in the units themselves for the same variable types; e.g. fluxes can be in per second or per day, and both fluxes and pools can be in per m2 or per ha. Maybe something to bring some consistency to as well, if you're taking the time to overhaul the history code.
@walkeranthonyp that is a really good point and one of the things we had discussed a while ago. I had started to put all the current variables together in the spreadsheet that I linked to earlier in the thread (#630 (comment)) for precisely this purpose, but ran into a conceptual roadblock that I didn't know how to deal with. Ideally we should be using the standard CMIP conventions for units (so everything in kg, m, s, etc., and just the number 1 for dimensionless quantities like LAI). But I think that some of the more forestry-oriented or population-focused variables are in conflict with that goal. E.g. there is no SI unit for individual trees, so do we report plant populations in units of "m-2"? Or report basal area in units of "1"? We probably need some discussion on this point.
My sense is that people who think about dimensions and their standardization for a living may give me a slap on the wrist for the following, but... the dimensions appear to cancel out; should they really? It is m2 of one quantity (leaf) per m2 of another quantity (ground), and the same concept applies to basal area. It seems most appropriate to describe these as m2/m2.
I see what you're saying, Ryan. I'm also happy with dimensionless units but don't feel strongly one way or another, just so long as we're consistent. My main concern is that "per meter squared" is currently represented in at least three ways.
I agree with @rgknox on the basal area front; normally (in the forestry etc. literature) I see those reported in units of cm2/m2 or m2/ha, and since I think we do want to standardize our per-area to m2, I think it is fine to say the units are m2/m2. @ckoven I think for numbers most forestry applications use "stems/m2", but I have been writing "nindivs/m2" since that seemed to be used most often in the code already. @walkeranthonyp, @rgknox and I had also discussed standardizing the units themselves (area always in m2, mass always in g, etc.). I will be doing this as well as I go through the file. As for the exponent notation etc., I also agree that should be standardized. So far I have been changing everything to …
I think there are two slightly separate unit issues here: one is consistency and standardization in how we report units, and the other is consistency and standardization in how we calculate things. On the second one: right now we have area in both m2 and ha, time in sec, day, and year, and mass in g and kg. On the first one, as Anthony says, we report things inconsistently and in general make use of nonstandard conventions; e.g. we report individuals variously as N, indiv, and plants, when in reality none of those are meaningful units from a convention standpoint, and if we adhere to those standards then we should omit any reference to population in the units at all (the closest example I could find is aerosol number densities, which are just in m-2 here: https://cfconventions.org/Data/cf-standard-names/60/build/cf-standard-name-table.html). So I see a few possibilities:
N.b. in looking quickly through the CF conventions in the link above, I already see they do some things differently than we do. E.g. they want burned area in actual area (which requires knowing the size of the gridcell), or alternatively in burned fraction but without per-time units, which means we'd need to integrate burned fraction over the history interval rather than averaging in time as we do now...
PS: the burned fraction solution is easy if we want to follow the standard; we just switch the avgflag to sum instead of average...
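To spell out the arithmetic (assuming the quantity recorded each timestep is the fraction burned during that step, which is my reading of the above rather than anything the CF document states): over a history interval of $N$ timesteps,

$$F_{\mathrm{interval}} = \sum_{i=1}^{N} f_i = N\,\bar{f},$$

so a sum over the interval yields the dimensionless CF-style burned fraction directly, and it differs from the current time average $\bar{f}$ only by the factor $N$.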
If it's easy to get the sum, then it makes more sense to report burned area fraction. Otherwise it could be confused with the fire-front ROS (rate of spread).
On the FATES call today we had a good discussion of this topic and made some decisions via Zoom polls about how to proceed on a few issues:
Thanks to all who participated!
Adding another suffix:
Hey all, I want to confirm that these ED_ variables can be deleted if we already have a FATES_ version of them. See fates/main/FatesHistoryInterfaceMod.F90, line 2285, at 70a3e13.
Yes, I think you are right.
I have to look it up every time, so I support moving away from the numeric indexing.
@adrifoster agree that giving these mortality terms human readable names would be great.
Yes!
A question about some variables for @ckoven, @rgknox, @jkshuman, @rosiealice, or anyone else: This variable, which is called …
It's calculated as … in both cases here …
However, we have another …
However, this is calculated without the multiplication by the …
I understand the units for the …
Hi @adrifoster, the intention here is just to be able to weight the trim output variable by size and canopy layers, since we expect it to definitely vary strongly between canopy and understory, and possibly within the canopy by size (and really by PFT too, but this doesn't even get into that). So the idea is to then divide by … In practice the …
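For what it's worth, a small standalone sketch of the number-weighting being described (all names and values invented, not FATES code):

```fortran
! accumulate trim weighted by plant number, then divide by the summed
! number to recover the density-weighted mean trim for a bin
program weighted_trim_demo
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)
  real(r8) :: cohort_trim(2)   = [0.9_r8, 0.4_r8]   ! canopy vs understory
  real(r8) :: cohort_nplant(2) = [0.02_r8, 0.10_r8] ! plants per m2
  real(r8) :: trim_x_nplant, nplant

  trim_x_nplant = sum(cohort_trim * cohort_nplant)  ! weighted accumulation
  nplant        = sum(cohort_nplant)
  print *, 'density-weighted mean trim = ', trim_x_nplant / nplant
end program weighted_trim_demo
```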
@ckoven okay, that makes sense. How should we define the long name for those? 'Trimming term weighted by plant density'?
Another question about units: the variables …
But it seems like they should actually be dimensionless (calculated here and here)?
And more questions (nearing the end here...)
I'm not sure what these should be converted to...
So @billsacks brought up the conflict that CTSM/CLM has its mass units in grams, whereas with the switch to the CF table that we voted on at the FATES call on September 16, FATES will be going with kilograms for our output variables. I am working on making these updates here, and am currently in the testing phase. However, given what @billsacks and @ekluzek were pointing out, I want to confirm that we will move forward with this switch to the CF conventions table for FATES output. Please note that these changes are only for the output and will not affect any internal variables within FATES or the HLM. Tagging people here in the issue so we can capture discussion on this change:
Thanks for pointing out this discrepancy, @adrifoster. I know there's been some movement in CESM component models to move towards greater CF compliance. Having FATES help us move towards this goal seems appropriate. I'm not sure how much time we want to spend doing a parallel effort for analogous output from the big-leaf model, especially if FATES is eventually going to replace the older (non-CF compliant) scheme. I guess we want to avoid confusing users who may be using both FATES and big-leaf models in the short/medium term.
@wwieder and others: to clarify, I don't think this units question is a question of CF compliance. CF gives canonical units (which are SI units), but I believe doesn't dictate what units you use to be CF compliant. To be CF compliant, you just need to explicitly specify your units in a particular format. I'm not positive about this, but that is the understanding of @ekluzek and myself.
Thanks for clarifying @billsacks. I am not a CF expert, but from the FATES conversation I took away that units of kg, m^2, and s were preferred for mass, area, and time, respectively. @ckoven may have a more informed perspective here. Separately, @jenniferholm, I'd assume this is potentially an issue for E3SM too?
Right @billsacks, CF specifies "canonical units", which are also the units that must be used when reporting to MIPs such as CMIP, TRENDY, etc. Certainly it is in principle fine to continue doing what CESM has always done, which is to use units that are convertible to, but different from, the canonical ones in the model history fields, and then just convert the units during the CMOR-ization process. But since (a) standardization of units is a headache, particularly for MIPs with less infrastructural support than CMIP, and (b) FATES hasn't been keeping to any specific convention, but we want to do so before everyone's scripts get baked in, we decided that we'd standardize to the canonical CF units, which has the benefit that it removes the unit-conversion step when we participate in MIPs.
Thanks for the clarification @ckoven. That makes sense; so this is more about removing the CMOR-ization step (or similar standardization for other inter-model comparisons) rather than CMIP compliance per se.
For cross-reference with a relevant CTSM issue, I just remembered this one that @ekluzek started last year: ESCOMP/CTSM#973
There are a few things that bother me about history output:

1. My position is that if the output variable is on the fates side of the code, the diagnostic is inherently relevant only to naturally vegetated columns within FATES, and all other columns should be ignored. If we want a diagnostic that uses zeros from other columns, or real information from other columns, it should be defined on the host-model side, and FATES should pass the boundary conditions necessary to make the calculation.
2. Get rid of these patch-scale diagnostic types (i.e. _PA). These are confusing because it looks like we can have patch-scale output, but what is really happening is that we are just asking CLM/ELM to upscale from patches to grid for us. We already do this, and having two ways of going about the same thing is confusing.
3. The filling of all history diagnostics happens in one central location, FatesHistoryInterfaceMod. This makes things complicated, particularly because the daily calls happen after all of our dynamics completes. If we want a diagnostic on growth or mortality rates (for example), we have to upscale the rates and save them to a site-level intermediary variable so that the information does not get messed up during disturbance. Another example: if a cohort is terminated, we can't diagnose its rates because it ceases to exist. Long story short, if we were allowed to update the history pointer arrays defined in FatesHistoryInterfaceMod immediately, when the salient variable is meaningfully updated, we would be able to remove all of these site-level intermediary arrays (see the sketch after this list).
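For illustration, a minimal sketch of that last idea, with invented names and a made-up diagnostic; the point is only that the dynamics code writes into the history buffer at the moment the event happens:

```fortran
! Hypothetical sketch: the termination code itself accumulates into the
! history buffer, so the cohort's contribution is captured before it
! ceases to exist and no site-level intermediary array is needed.
module hist_update_sketch
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)
  real(r8), allocatable :: hio_mort_term_si(:)  ! history buffer [plants m-2]
contains
  subroutine record_termination(io_si, nplant, site_area)
    integer,  intent(in) :: io_si      ! site index into the history buffer
    real(r8), intent(in) :: nplant     ! plants in the terminated cohort
    real(r8), intent(in) :: site_area  ! site area [m2]
    hio_mort_term_si(io_si) = hio_mort_term_si(io_si) + nplant / site_area
  end subroutine record_termination
end module hist_update_sketch
```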