Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove accumulation fields' PERIOD from the restart file #1802

Merged
merged 1 commit into from
Jul 13, 2022

Conversation

billsacks
Copy link
Member

Description of changes

This doesn't need to be on the restart file, and having it read from the
restart file causes wrong behavior when changing the model time step.

See #1789 for details

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):
Resolves #1789

Are answers expected to change (and if so in what way)? YES - for any case where the time step differs from the one originally used to create the initial conditions file. This typically means changing answers for any case with a time step other than 1/2 hour.

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any: None yet; about to run test suite

This doesn't need to be on the restart file, and having it read from the
restart file causes wrong behavior when changing the model time step.

See ESCOMP#1789 for details

Resolves ESCOMP#1789
@billsacks
Copy link
Member Author

I have spent more time looking at the relevant code, with a particular focus on whether new problems / inconsistencies will arise from my proposed fix of removing the accumulation fields' period from the restart file (e.g., will there be problems because now we're using a newly-initialized period value, but still reading each accumulation field's nsteps from the restart file?). My overall conclusion is that there could be some weirdness at the start of a run, but at least for a startup or hybrid run, that weirdness should work itself out within about the first averaging period. A branch or restart run could have some longer-term potential weirdness, so for now I think we should recommend that people NOT change the time step on a branch or restart run. With (significant?) additional work, we could probably avoid this additional weirdness, but my feeling is that it isn't worth the effort right now. In any case, I feel like my proposed fix will bring things much closer to being correct than they currently are when changing the time step. More details below:

In extract_accum_field_timeavg, there is:

    if (mod(nstep,this%period) == 0) then

In a branch run, where nstep continues to accumulate, that is going to be weird when changing time steps, but I don't think it will be hugely problematic: I think things will work themselves out after one averaging period. The weirdest thing that might happen is that the averaging periods may no longer be aligned with day boundaries. I guess this argues for not changing the time step in a branch run, but using a hybrid or startup run instead.

There is similar potential weirdness in update_accum_field_timeavg. That includes some additional potential weirdness due to the use of this%nsteps:

    if (mod(nstep,this%period) == 0) then
       do k = begi,endi
          if (this%active(k)) then
             this%val(k,level) = this%val(k,level) / this%nsteps(k,level)
          end if
       end do
    end if

But it looks like this%nsteps should be reset to 0 (earlier in that subroutine) when nstep is 1, so as long as we're in a startup or hybrid case, I think this will be okay. If we're in a branch case (where nstep has continued from before), then there will be weirdness until we finish out the current accumulation period.

There is also weirdness of a different kind for runmean accumulation (in update_accum_field_runmean):

    do k = begi,endi
       if (this%active(k)) then
          kf = k - begi + 1
          this%nsteps(k,level) = this%nsteps(k,level) + 1
          ! Cap nsteps at 'period' - partly to avoid overflow, but also because it doesn't
          ! help us to accumulate nsteps beyond a value of 'period' (because nsteps is
          ! just used to determine accper, and accper needs to be capped at 'period'). A
          ! side-benefit of this capping of nsteps is that accper (the accumulation
          ! period) is always equal to nsteps.
          this%nsteps(k,level) = min(this%nsteps(k,level), this%period)
          accper = this%nsteps(k,level)
          this%val(k,level) = &
               ((accper-1)*this%val(k,level) + field(kf)) / accper
       end if
    end do

I think this will be okay when going to longer time steps (i.e., decreasing this%period), but when moving to shorter time steps (increasing this%period), the first part of the run will see too-small this%nsteps. This isn't a huge problem, but until we have gotten up to this%nsteps = this%period, each new time step's value will have a bigger influence on the running mean than it really should. It looks like this problem will exist in startup and hybrid runs as well as branch runs.

  • We could probably fix this problem by allowing this%nsteps to get really big rather than capping it at this%period (and then capping the local accper instead), but I'm not sure it's worth making that change.

%nsteps will also be weird at first for a runaccum field. However, it looks like %nsteps is irrelevant for a runaccum field (i.e., it's never used for anything), so that shouldn't be an issue.

@billsacks billsacks merged commit 666ddb5 into ESCOMP:master Jul 13, 2022
billsacks added a commit that referenced this pull request Jul 13, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in #1789 for
more details, and see
#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.

Resolves #1789 (A bug in calculating accumulated fields
(24/240 hours averaged) when using a smaller timestep)
@billsacks billsacks deleted the fix_accum_for_changing_timestep branch July 13, 2022 05:02
slevis-lmwg added a commit to slevis-lmwg/ctsm that referenced this pull request Jul 14, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in ESCOMP#1789 for
more details, and see
ESCOMP#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.

Resolved conflicts:
doc/ChangeLog
doc/ChangeSum
python/conda_env_ctsm_py.txt
python/ctsm/modify_fsurdat/fsurdat_modifier.py
python/ctsm/modify_fsurdat/modify_fsurdat.py
slevis-lmwg added a commit to slevis-lmwg/ctsm that referenced this pull request Jul 14, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in ESCOMP#1789 for
more details, and see
ESCOMP#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.

Resolved conflict:
python/ctsm/modify_fsurdat/fsurdat_modifier.py
ekluzek added a commit to ekluzek/CTSM that referenced this pull request Jul 14, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in ESCOMP#1789 for
more details, and see
ESCOMP#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.
ekluzek added a commit to ekluzek/CTSM that referenced this pull request Jul 14, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in ESCOMP#1789 for
more details, and see
ESCOMP#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.
glemieux added a commit to glemieux/ctsm that referenced this pull request Jul 15, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in ESCOMP#1789 for
more details, and see
ESCOMP#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A bug in calculating accumulated fields (24/240 hours averaged) when using a smaller timestep
1 participant