Remove accumulation fields' PERIOD from the restart file #1802

billsacks · 2022-07-11T23:31:01Z

Description of changes

This doesn't need to be on the restart file, and having it read from the
restart file causes wrong behavior when changing the model time step.

See #1789 for details

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):
Resolves #1789

Are answers expected to change (and if so in what way)? YES - for any case where the time step differs from the one originally used to create the initial conditions file. This typically means changing answers for any case with a time step other than 1/2 hour.

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any: None yet; about to run test suite

This doesn't need to be on the restart file, and having it read from the restart file causes wrong behavior when changing the model time step. See ESCOMP#1789 for details Resolves ESCOMP#1789

billsacks · 2022-07-13T04:10:43Z

I have spent more time looking at the relevant code, with a particular focus on whether new problems / inconsistencies will arise from my proposed fix of removing the accumulation fields' period from the restart file (e.g., will there be problems because now we're using a newly-initialized period value, but still reading each accumulation field's nsteps from the restart file?). My overall conclusion is that there could be some weirdness at the start of a run, but at least for a startup or hybrid run, that weirdness should work itself out within about the first averaging period. A branch or restart run could have some longer-term potential weirdness, so for now I think we should recommend that people NOT change the time step on a branch or restart run. With (significant?) additional work, we could probably avoid this additional weirdness, but my feeling is that it isn't worth the effort right now. In any case, I feel like my proposed fix will bring things much closer to being correct than they currently are when changing the time step. More details below:

In extract_accum_field_timeavg, there is:

    if (mod(nstep,this%period) == 0) then

In a branch run, where nstep continues to accumulate, that is going to be weird when changing time steps, but I don't think it will be hugely problematic: I think things will work themselves out after one averaging period. The weirdest thing that might happen is that the averaging periods may no longer be aligned with day boundaries. I guess this argues for not changing the time step in a branch run, but using a hybrid or startup run instead.

There is similar potential weirdness in update_accum_field_timeavg. That includes some additional potential weirdness due to the use of this%nsteps:

    if (mod(nstep,this%period) == 0) then
       do k = begi,endi
          if (this%active(k)) then
             this%val(k,level) = this%val(k,level) / this%nsteps(k,level)
          end if
       end do
    end if

But it looks like this%nsteps should be reset to 0 (earlier in that subroutine) when nstep is 1, so as long as we're in a startup or hybrid case, I think this will be okay. If we're in a branch case (where nstep has continued from before), then there will be weirdness until we finish out the current accumulation period.

There is also weirdness of a different kind for runmean accumulation (in update_accum_field_runmean):

    do k = begi,endi
       if (this%active(k)) then
          kf = k - begi + 1
          this%nsteps(k,level) = this%nsteps(k,level) + 1
          ! Cap nsteps at 'period' - partly to avoid overflow, but also because it doesn't
          ! help us to accumulate nsteps beyond a value of 'period' (because nsteps is
          ! just used to determine accper, and accper needs to be capped at 'period'). A
          ! side-benefit of this capping of nsteps is that accper (the accumulation
          ! period) is always equal to nsteps.
          this%nsteps(k,level) = min(this%nsteps(k,level), this%period)
          accper = this%nsteps(k,level)
          this%val(k,level) = &
               ((accper-1)*this%val(k,level) + field(kf)) / accper
       end if
    end do

I think this will be okay when going to longer time steps (i.e., decreasing this%period), but when moving to shorter time steps (increasing this%period), the first part of the run will see too-small this%nsteps. This isn't a huge problem, but until we have gotten up to this%nsteps = this%period, each new time step's value will have a bigger influence on the running mean than it really should. It looks like this problem will exist in startup and hybrid runs as well as branch runs.

We could probably fix this problem by allowing this%nsteps to get really big rather than capping it at this%period (and then capping the local accper instead), but I'm not sure it's worth making that change.

%nsteps will also be weird at first for a runaccum field. However, it looks like %nsteps is irrelevant for a runaccum field (i.e., it's never used for anything), so that shouldn't be an issue.

Fix accumulation variables when changing model time step Accumulation variables (e.g., 1-day or 10-day averages) were writing and reading their accumulation period (expressed in time steps) to the restart file. This caused incorrect behavior when changing the model time step relative to what was used to create the initial conditions file (typically a 30-minute time step). So, for example, if you are using a 15-minute time step with an initial conditions file that originated from a run with a 30-minute time step (at some point in its history), then an average that was supposed to be 10-day instead becomes 5-day; an average that was supposed to be 1-day becomes 12-hour, etc. (The issue is that the number of time steps in the averaging period was staying fixed rather than the actual amount of time staying fixed.) For our out-of-the-box initial conditions files, this only impacts runs that use something other than a 30-minute time step. Typically this situation arises in configurations with an active atmospheric model that is running at higher resolution than approximately 1 degree. It appears that the biggest impacts are on VOC emissions and in BGC runs; we expect the impact to be small (but still non-zero) in prescribed phenology (SP) runs that don't use VOC emissions. This tag fixes this issue by no longer writing or reading accumulation variables' PERIOD to / from the restart file: this isn't actually needed on the restart file. See some discussion in #1789 for more details, and see #1802 (comment) for some discussion of outstanding weirdness that can result for accumulation variables when changing the model time step. The summary of that comment is: There could be some weirdness at the start of a run, but at least for a startup or hybrid run, that weirdness should work itself out within about the first averaging period. A branch or restart run could have some longer-term potential weirdness, so for now I think we should recommend that people NOT change the time step on a branch or restart run. With (significant?) additional work, we could probably avoid this additional weirdness, but my feeling is that it isn't worth the effort right now. In any case, I feel like my proposed fix will bring things much closer to being correct than they currently are when changing the time step. Resolves #1789 (A bug in calculating accumulated fields (24/240 hours averaged) when using a smaller timestep)

Fix accumulation variables when changing model time step Accumulation variables (e.g., 1-day or 10-day averages) were writing and reading their accumulation period (expressed in time steps) to the restart file. This caused incorrect behavior when changing the model time step relative to what was used to create the initial conditions file (typically a 30-minute time step). So, for example, if you are using a 15-minute time step with an initial conditions file that originated from a run with a 30-minute time step (at some point in its history), then an average that was supposed to be 10-day instead becomes 5-day; an average that was supposed to be 1-day becomes 12-hour, etc. (The issue is that the number of time steps in the averaging period was staying fixed rather than the actual amount of time staying fixed.) For our out-of-the-box initial conditions files, this only impacts runs that use something other than a 30-minute time step. Typically this situation arises in configurations with an active atmospheric model that is running at higher resolution than approximately 1 degree. It appears that the biggest impacts are on VOC emissions and in BGC runs; we expect the impact to be small (but still non-zero) in prescribed phenology (SP) runs that don't use VOC emissions. This tag fixes this issue by no longer writing or reading accumulation variables' PERIOD to / from the restart file: this isn't actually needed on the restart file. See some discussion in ESCOMP#1789 for more details, and see ESCOMP#1802 (comment) for some discussion of outstanding weirdness that can result for accumulation variables when changing the model time step. The summary of that comment is: There could be some weirdness at the start of a run, but at least for a startup or hybrid run, that weirdness should work itself out within about the first averaging period. A branch or restart run could have some longer-term potential weirdness, so for now I think we should recommend that people NOT change the time step on a branch or restart run. With (significant?) additional work, we could probably avoid this additional weirdness, but my feeling is that it isn't worth the effort right now. In any case, I feel like my proposed fix will bring things much closer to being correct than they currently are when changing the time step. Resolved conflicts: doc/ChangeLog doc/ChangeSum python/conda_env_ctsm_py.txt python/ctsm/modify_fsurdat/fsurdat_modifier.py python/ctsm/modify_fsurdat/modify_fsurdat.py

Fix accumulation variables when changing model time step Accumulation variables (e.g., 1-day or 10-day averages) were writing and reading their accumulation period (expressed in time steps) to the restart file. This caused incorrect behavior when changing the model time step relative to what was used to create the initial conditions file (typically a 30-minute time step). So, for example, if you are using a 15-minute time step with an initial conditions file that originated from a run with a 30-minute time step (at some point in its history), then an average that was supposed to be 10-day instead becomes 5-day; an average that was supposed to be 1-day becomes 12-hour, etc. (The issue is that the number of time steps in the averaging period was staying fixed rather than the actual amount of time staying fixed.) For our out-of-the-box initial conditions files, this only impacts runs that use something other than a 30-minute time step. Typically this situation arises in configurations with an active atmospheric model that is running at higher resolution than approximately 1 degree. It appears that the biggest impacts are on VOC emissions and in BGC runs; we expect the impact to be small (but still non-zero) in prescribed phenology (SP) runs that don't use VOC emissions. This tag fixes this issue by no longer writing or reading accumulation variables' PERIOD to / from the restart file: this isn't actually needed on the restart file. See some discussion in ESCOMP#1789 for more details, and see ESCOMP#1802 (comment) for some discussion of outstanding weirdness that can result for accumulation variables when changing the model time step. The summary of that comment is: There could be some weirdness at the start of a run, but at least for a startup or hybrid run, that weirdness should work itself out within about the first averaging period. A branch or restart run could have some longer-term potential weirdness, so for now I think we should recommend that people NOT change the time step on a branch or restart run. With (significant?) additional work, we could probably avoid this additional weirdness, but my feeling is that it isn't worth the effort right now. In any case, I feel like my proposed fix will bring things much closer to being correct than they currently are when changing the time step. Resolved conflict: python/ctsm/modify_fsurdat/fsurdat_modifier.py

Fix accumulation variables when changing model time step Accumulation variables (e.g., 1-day or 10-day averages) were writing and reading their accumulation period (expressed in time steps) to the restart file. This caused incorrect behavior when changing the model time step relative to what was used to create the initial conditions file (typically a 30-minute time step). So, for example, if you are using a 15-minute time step with an initial conditions file that originated from a run with a 30-minute time step (at some point in its history), then an average that was supposed to be 10-day instead becomes 5-day; an average that was supposed to be 1-day becomes 12-hour, etc. (The issue is that the number of time steps in the averaging period was staying fixed rather than the actual amount of time staying fixed.) For our out-of-the-box initial conditions files, this only impacts runs that use something other than a 30-minute time step. Typically this situation arises in configurations with an active atmospheric model that is running at higher resolution than approximately 1 degree. It appears that the biggest impacts are on VOC emissions and in BGC runs; we expect the impact to be small (but still non-zero) in prescribed phenology (SP) runs that don't use VOC emissions. This tag fixes this issue by no longer writing or reading accumulation variables' PERIOD to / from the restart file: this isn't actually needed on the restart file. See some discussion in ESCOMP#1789 for more details, and see ESCOMP#1802 (comment) for some discussion of outstanding weirdness that can result for accumulation variables when changing the model time step. The summary of that comment is: There could be some weirdness at the start of a run, but at least for a startup or hybrid run, that weirdness should work itself out within about the first averaging period. A branch or restart run could have some longer-term potential weirdness, so for now I think we should recommend that people NOT change the time step on a branch or restart run. With (significant?) additional work, we could probably avoid this additional weirdness, but my feeling is that it isn't worth the effort right now. In any case, I feel like my proposed fix will bring things much closer to being correct than they currently are when changing the time step.

Remove accumulation fields' PERIOD from the restart file

666ddb5

This doesn't need to be on the restart file, and having it read from the restart file causes wrong behavior when changing the model time step. See ESCOMP#1789 for details Resolves ESCOMP#1789

billsacks self-assigned this Jul 11, 2022

cacraigucar mentioned this pull request Jul 12, 2022

Bring in CTSM bug fix tag when it is ready ESCOMP/CAM#616

Closed

billsacks merged commit 666ddb5 into ESCOMP:master Jul 13, 2022

billsacks deleted the fix_accum_for_changing_timestep branch July 13, 2022 05:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove accumulation fields' PERIOD from the restart file #1802

Remove accumulation fields' PERIOD from the restart file #1802

billsacks commented Jul 11, 2022

billsacks commented Jul 13, 2022

Remove accumulation fields' PERIOD from the restart file #1802

Remove accumulation fields' PERIOD from the restart file #1802

Conversation

billsacks commented Jul 11, 2022

Description of changes

Specific notes

billsacks commented Jul 13, 2022