
support additive forcing perturbations #30

Open
aekiss opened this issue Sep 15, 2019 · 70 comments

@aekiss
Contributor

aekiss commented Sep 15, 2019

Branching off from #25 as a separate issue.

Implement this via a new offset_filename key, so that we can have both additive and multiplicative perturbations to any forcing field, which would be a useful extension to our current scaling-only perturbation capability discussed in the wiki.

An example forcing.json entry:

    {
      "filename": "/g/data/qv56/replicas/input4MIPs/CMIP6/OMIP/MRI/MRI-JRA55-do-1-4-0/land/day/friver/gr/v20190429/friver/friver_input4MIPs_atmosphericState_OMIP_MRI-JRA55-do-1-4-0_gr_{{year}}0101-{{year}}1231.nc",
      "fieldname": "friver",
      "offset_filename": "river_offset_{{year}}0101-{{year}}1231.nc",
      "cname": "runof_ai"
    },
@rmholmes
Copy link

@aekiss @aidanheerdegen I am resurrecting this issue as there are a few experiments that various people would like to run that would be made significantly easier with such an addition.

As an example, Qian is going to be running an experiment with a linearly increasing air temperature modelled on CMIP6 projections, added to the RYF9091 JRA55 forcing. I.e. she would like:

Tair(x,y,z,t) = TairRYF(x,y,z,t) + Tair_CMIP_pattern(x,y) * Tair_CMIP_timeseries(t)

where TairRYF is the RYF9091 field, Tair_CMIP_pattern(x,y) is a spatial pattern of anomalies and Tair_CMIP_timeseries(t) is a time series of the amplitude of that pattern (a linear function in Qian's case).
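The separable form above can be sketched with numpy (hypothetical array shapes and values, purely to illustrate the broadcasting; this is not YATM code):

```python
import numpy as np

# Hypothetical sizes: nt forcing time steps on an ny x nx grid.
nt, ny, nx = 4, 3, 5
tair_ryf = np.full((nt, ny, nx), 280.0)     # TairRYF(x,y,t), in K
pattern = np.ones((ny, nx))                 # Tair_CMIP_pattern(x,y)
timeseries = np.linspace(0.0, 3.0, nt)      # Tair_CMIP_timeseries(t), a linear ramp

# Tair = TairRYF + pattern * timeseries, broadcast over the grid, so only
# the small pattern and time-series files need to be stored, not full copies
# of the forcing for every simulated year.
tair = tair_ryf + timeseries[:, None, None] * pattern
```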

Such an anomaly is difficult to do in the current system as it requires making a separate copy of the JRA55 Tair files for every year of the simulation. If you're doing a 100-year experiment, then you either have to create 100 copies of the JRA55 data (and they are big files) or you have to manually change the files every year.

I also have some ideas for experiments using simple time-dependent, spatially-independent (Tair_CMIP_pattern = 1) forcing field anomalies that would benefit from a similar approach.

So I wanted to start a discussion on this. I see two possible (high-level) approaches to make this easier:

  1. Implementing an additive scaling offset as described in this issue. However, it would need to be generalized to have separate inputs for the spatial pattern (Tair_CMIP_pattern) and the time series (Tair_CMIP_timeseries) or you're not really doing any better than copying the full forcing files. This may be difficult.
  2. Implementing a post-processing script that is called at the end of every year (for the 1/10th-degree it would only need to be called every four 3-month submits, when the RYF9091 forcing fields cycle over) that updates the forcing files before the next year is simulated.

Both of these approaches have issues. I am more attracted to no. 2 above because I think it is more general (e.g. it permits more complicated changes, such as keeping the relative humidity constant as the temperature anomalies are applied). I would be interested in everyone's thoughts.

@AndyHoggANU

Hi @rmholmes - sorry we didn't get a chance to discuss this in today's MOM meeting. I think option 2 is a nice idea, as it's more flexible and more able to be deployed for a wider array of uses. I guess the key issue is whether payu can get enough information out of the run to supply the script making the forcing files. If the information required was only time (i.e. other functions of x,y were hardwired into the script) then I think we could probably do this.

The key is that, with a small addition to payu, we could do just about anything. @aidanheerdegen - I guess we are relying on you here to have a think and let us know if a payu addition is feasible for this. If yes, maybe a quick online discussion on Monday could be in order?

@rmholmes

I think a zoom chat about this would be a good idea, perhaps after @aidanheerdegen has had a chance to think about what might be possible.

One more potential spanner in the works here that is worth considering: After chatting with Qian and Matt yesterday I think Qian's experiment would also benefit from maintaining the atmospheric relative humidity constant as the air temperature anomalies are applied. This is a bit more difficult to fit into the framework as the specific humidity perturbation is then an (x,y,t) perturbation that is not obviously separable into (x,y) and (t). I would want to do the same for the other experiment/s I have in mind.

However, perhaps a completely different way to get around this problem was if we could feed access-om2 relative humidity rather than specific humidity as an input field?

@aidanheerdegen
Contributor

@nichannah has assigned himself to this, so I wouldn't suggest spending any time on this until he can let us know the progress on this. He might already have a solution, or perhaps some well thought through plans for implementation.

Nic?

@AndyHoggANU

I think Nic assigned himself this 18 months ago, then completed the work and the issue was closed -- Ryan then reopened it with his new ideas... But I am of course happy if Nic wants to take this one on?

@aidanheerdegen
Contributor

Oh right, my misunderstanding. Soz.

There are no tagged PRs or commits, which makes it harder to track the related changes, see what has already been done, and work out how new functionality might be incorporated.

@nichannah
Contributor

nichannah commented Mar 1, 2021

Yes, I'm happy to do it at the libaccessom2 level if that is easier than via Payu. Presently we have the ability to do scaling by either a file with a time dimension or a single constant. In the simplest case we could replicate this but use addition instead of multiplication - using a syntax like @aekiss proposed above. Would that be suitable? On the other hand it might be worth thinking about something more general, such as a function of the forcing variables from two files, e.g.

    {
      "filename": "/g/data/ua8/JRA55-do/RYF/v1-3/RYF.rsds.1990_1991.nc",
      "perturbation_filename": "../test_data/scaling.RYF.rsds.1990_1991.nc",
      "perturbation_function": "rsds = rsds_A + rsds_B",
      "fieldname": "rsds",
      "cname": "swfld_ai"
    },

This could help: https://github.com/jacobwilliams/fortran_function_parser

@rmholmes

rmholmes commented Mar 1, 2021

Thanks @nichannah.

So just to confirm - when you say "scaling by either a file with a time dimension or a single constant" - I think you mean that the scaling file (currently multiplicative - but easy to implement additive) is a single file that contains both spatial and temporal information?

Is it possible to separate the spatial and temporal scaling information? This would really help with applying the perturbations that we commonly want that have a fixed spatial pattern (possibly even spatially-constant) but a long-term temporal variation (e.g. a linear trend in Qian's case, or a sinusoidal oscillation with an interannual to decadal period, as I would like to implement). Ideally we want this temporal variation to be smooth across each JRA55 input time step - meaning that to do it properly without a system such as we are discussing here requires copying the JRA55 forcing files at full temporal resolution for the whole time period of the simulation (50+ years in Qian's case).

If such were possible then it would cover the cases I am thinking of, with the below significant caveat.

The caveat - relative humidity: In both Qian and my experiments it would be best to keep the relative humidity constant while the air temperature is changed. To do this while specific humidity is the input field requires changing the specific humidity through a complicated function of unperturbed and perturbed air temperature, unperturbed specific humidity and air pressure (see this script that Claire put together). This caveat could be easily solved if we could change the coupling system to allow relative humidity as an input field rather than specific humidity. E.g. in atmosphere/forcing.json we would want to replace,

"filename": "INPUT/RYF.q_10.1990_1991.nc",
"fieldname": "huss_10m",
"cname": "qair_ai"

with

"filename": "INPUT/RYF.r_10.1990_1991.nc",
"fieldname": "rhuss_10m",
"cname": "rair_ai"

where rair is the relative humidity calculated just once from the other JRA55 forcing fields. Is this possible?

@russfiedler
Contributor

I've had a bit more of a think about using relative humidity as a forcing field. Since this is about physics I'm now of the opinion that it should be treated in CICE, and it can be easily done in the subroutine from_atm. We can flag if rhair_i (or whatever we call it in CICE) is supplied, and then after all the fields are read

if (do_rh2q) then
   where (tair0 /= 0.0) qair0 = rh2q(tair0, rhair0, press0)
end if

where rh2q is a simple elemental function. I think I'd put it in cpl_forcing_handler.F90.
We don't need to do anything else to the code as the relative humidity has done its job and is never used again.

How's that sound?

@rmholmes

rmholmes commented Mar 4, 2021

That sounds pretty good to me @russfiedler. However, I can't say that I know much about CICE.

We would then just need to create new RYF input files that contained relative humidity (rhair0) rather than specific humidity, do the substitutions in atmosphere/forcing.json that I indicated above and activate this flag so that CICE knows to take rhair0 as an input rather than qair0 as an input.

I would have expected that a relative - specific humidity conversion would already occur somewhere in the code. Again, I know little about CICE, but I did find that in source/ice_forcing.F90 there is already a relative humidity variable, rh, in the subroutine oned_data. I'm not sure if that is relevant?

I can volunteer to do some testing in ACCESS-OM2-1 once these changes are made.

@russfiedler
Contributor

@rmholmes We're using the gfdl form for fluxes so I'd probably use the lookup tables for the saturated vapor pressure in sat_vapor_pres_mod.F90. We should have q=0.622*(rh/100)*e_sat(T)/(p-0.378*(rh/100)*e_sat(T)) if rh is supplied as a percentage.
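That conversion can be sketched in Python; the Magnus approximation below is just a stand-in for the GFDL sat_vapor_pres_mod lookup tables, and the constants are illustrative rather than taken from the model:

```python
import math

def e_sat(t_k):
    """Saturation vapour pressure (Pa) over water, Magnus approximation.
    A stand-in for the GFDL sat_vapor_pres_mod lookup tables."""
    t_c = t_k - 273.15
    return 610.94 * math.exp(17.625 * t_c / (t_c + 243.04))

def rh2q(t_k, rh_percent, p_pa):
    """Specific humidity (kg/kg) from relative humidity (%), using
    q = 0.622*(rh/100)*e_sat(T) / (p - 0.378*(rh/100)*e_sat(T))."""
    r_es = (rh_percent / 100.0) * e_sat(t_k)
    return 0.622 * r_es / (p_pa - 0.378 * r_es)

q = rh2q(288.15, 70.0, 101325.0)  # roughly 0.007 kg/kg at 15 degC, 70% RH
```

At fixed temperature and pressure, q increases monotonically with rh, which is what makes holding relative humidity constant under a warming perturbation a well-defined operation.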

@aekiss
Contributor Author

aekiss commented Mar 4, 2021

@rmholmes - Nic was referring to this: #31 - ie support for spatially uniform and temporally constant scalings specified by a scale_constant as an alternative to the spatiotemporally varying scalings specified by scaling_filename and described here.

@nichannah fortran_function_parser looks interesting but it appears to have been active for just a few weeks, 4 years ago, and then abandoned, so if we used it we may need to maintain it ourselves (but first find out why it was abandoned).

Here's a straw-man: If we are happy to support just scaling and offsets of any input field f of the form
scaling(x,y,t)*f + offset(x,y,t)
we could have entries like this in forcing.json:

    {
      "filename": "/g/data/ua8/JRA55-do/RYF/v1-3/RYF.u_10.1984_1985.nc",
      "scaling_filename": "/whatever/RYF.u_10.1984_1985_scale.nc",
      "offset_filename": "/whatever/RYF.u_10.1984_1985_offset.nc",
      "temporal_scaling_filename": "/whatever/RYF.u_10.1984_1985_temporal_scaling.nc",
      "temporal_offset_filename": "/whatever/RYF.u_10.1984_1985_temporal_offset.nc",
      "fieldname": "uas_10m",
      "cname": "uwnd_ai"
    },

where the 4 scaling and offset entries are all optional. If present, they would all contain a field matching fieldname. Any or none of them could contain a time axis. The temporal_* entries contain no spatial axes, but the others do.

scaling in the formula above would be given by scaling_filename * temporal_scaling_filename; if either is absent, it is replaced by 1; if neither has a time axis then the scaling will be unchanging in time. This way we can specify arbitrary spatiotemporal scaling, or separable spatial*temporal, or spatial-only, or temporal-only.

Similarly offset in the formula above would be given by offset_filename * temporal_offset_filename; if either is absent, the one that is present is used on its own, and if both are absent the offset is 0; if neither has a time axis then the offset will be unchanging in time. This way we can specify arbitrary spatiotemporal offsets, or separable spatial*temporal, or spatial-only, or temporal-only.

If scaling and offset files are both present, they are both applied as in the formula.

Would that cover all the use cases we can think of? (other than the humidity issue, which it looks like we can deal with via a code change in CICE)

@nichannah does this seem straightforward to implement?

@russfiedler this proposal would make scale_constant redundant, as this could be achieved by temporal_scaling_filename with no time axis.
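The defaulting rules in this straw-man can be sketched in Python (a hypothetical helper, not libaccessom2 code; it assumes an absent factor simply drops out of the product, with scaling falling back to 1 and offset to 0 when both parts are absent):

```python
import numpy as np

def combine(spatial, temporal, fallback):
    """Product of optional spatial (y, x) and temporal (t,) parts.
    An absent factor drops out of the product; if both are absent,
    return `fallback` (1 for scaling, 0 for offset)."""
    if spatial is None and temporal is None:
        return fallback
    term = 1.0
    if spatial is not None:
        term = term * np.asarray(spatial)
    if temporal is not None:
        term = term * np.asarray(temporal)[:, None, None]
    return term

def apply_perturbation(field, spatial_scale=None, temporal_scale=None,
                       spatial_offset=None, temporal_offset=None):
    """scaling(x,y,t)*field + offset(x,y,t) for a (t, y, x) field."""
    scaling = combine(spatial_scale, temporal_scale, 1.0)
    offset = combine(spatial_offset, temporal_offset, 0.0)
    return scaling * field + offset

# Temporal-only offset: a two-step ramp added uniformly over the grid.
field = np.full((2, 3, 4), 10.0)
out = apply_perturbation(field, temporal_offset=np.array([0.0, 1.5]))
```

With no perturbation arguments at all, the function returns the field unchanged, matching the "missing scaling is 1, missing offset is 0" convention.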

@aekiss
Contributor Author

aekiss commented Mar 4, 2021

Also, where time axes are present, the scalings/offsets would only be applied at times covered by those axes (which is how it works currently, allowing a brief perturbation to damp down an individual storm).

@aekiss
Contributor Author

aekiss commented Mar 4, 2021

Further possible extensions (probably over the top):

  1. support complex values in any perturbation field, allowing concise representation of propagating perturbations (only the real part of the product scaling_filename * temporal_scaling_filename or offset_filename * temporal_offset_filename would be used in scaling(x,y,t)*f + offset(x,y,t))
  2. allow any of the perturbations to be a list of files rather than just one, and combine them pairwise. This would allow (say) an EOF reconstruction. This would make 1. redundant, since quadrature components could be used in place of complex values.
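Extension 1 can be illustrated with numpy: a complex spatial pattern times a complex time series whose real part is a propagating wave (toy wavenumber and frequency, for illustration only):

```python
import numpy as np

nx, nt = 8, 4
x = np.arange(nx)
t = np.arange(nt)
k = 2 * np.pi / nx          # one wavelength across the domain
omega = 2 * np.pi / nt      # one period over the time axis

# Re(e^{-i*omega*t} * e^{i*k*x}) = cos(k*x - omega*t): a wave moving in +x,
# encoded concisely by one complex pattern and one complex time series.
pattern = np.exp(1j * k * x)
series = np.exp(-1j * omega * t)
offset = np.real(series[:, None] * pattern[None, :])
```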

@aekiss
Contributor Author

aekiss commented Mar 10, 2021

Following a discussion on Tuesday, here's an updated straw-man proposal replacing the above. It's long and detailed because I've tried to be explicit about the semantics and corner cases, which in most cases is simply spelling out what you'd expect. Apologies for the boring read!

It's unclear to me how much work would be needed to implement something like this, but if we decide to implement only a subset it would be good to bear this bigger picture in mind to allow for a possibility such as this in the future.

Objective

To support the scaling and offsetting of any input field f of the form

scaling(x,y,t)*f + offset(x,y,t)

and allow efficient representation when scaling and/or offset have:

  1. arbitrary spatiotemporal variation, or
  2. a sum of (arbitrary spatial patterns multiplied by arbitrary temporal variations), e.g. an EOF reconstruction, or
  3. arbitrary spatial variation (temporally constant), or
  4. arbitrary temporal variation (spatially constant), or
  5. constant in both space and time

where the temporal variation may be either

  • referenced to the experiment calendar, allowing progressive changes spanning multiple RYF years (e.g. a ramp over several RYF years), or
  • referenced to the forcing calendar, and therefore repeating in each RYF year (e.g. to damp out a storm)

Notation

forcing.json would have a format like this:

{
  "description": "JRA55-do V1.3 RYF 1984-85 forcing",
  "inputs": [
    {
      "filename": "/g/data/ua8/JRA55-do/RYF/v1-3/RYF.u_10.1984_1985.nc",
      "fieldname": "uas_10m",
      "cname": "uwnd_ai",
      "scaling": [
          {"spatial": "/path/to/nc/file", "temporal": "/path/to/nc/file", "calendar": "experiment"},
          {"spatiotemporal": "/path/to/nc/file", "calendar": "forcing"}, 
          ...
          ],
      "offset": [
          {"spatial": "/path/to/nc/file", "calendar": "forcing"},
          {"temporal": "/path/to/nc/file", "calendar": "experiment"}, 
          {"spatial": "/path/to/nc/file", "temporal": "/path/to/nc/file"},
          ... 
          ]
    },
    ...
  ]
}

Semantics

  • scaling and offset are optional arrays of any length. Their lengths may differ. Either, both, or neither may be present; missing scaling is treated as 1 in the formula above; missing offset is treated as 0.
  • Elements in scaling and offset arrays are dictionaries, each dictionary defining a term. These terms are summed to form scaling(x,y,t) and offset(x,y,t) (respectively) in the formula above. Recognized keywords are:
    • spatiotemporal: a string, the path to an .nc file containing a field (whose name matches fieldname) with both spatial and time axes, which is the spatiotemporal scaling or offset data
    • spatial: a string, the path to an .nc file containing a field (whose name matches fieldname) with spatial axes only, which is the spatial part of the scaling or offset data
    • temporal: a string, the path to an .nc file containing a field (whose name matches fieldname) with time axis only, which is the temporal part of the scaling or offset data
    • constant: a number, defining a spatially and temporally constant value
    • calendar: a string, either "experiment" or "forcing", specifying whether time is referenced to the experiment or forcing calendar for this component. Default is "forcing"
    • comment: an arbitrary string, not used by libaccessom2 but intended to be used to document what this element of the array represents.
  • At least one of spatiotemporal, spatial, temporal and constant must be present in each dictionary.
    • If spatiotemporal is present, spatial, temporal and constant must be absent. This defines a spatiotemporally arbitrary scaling or offset (case 1 above).
    • If constant is present, spatiotemporal, spatial and temporal must be absent. This defines a spatially and temporally constant scaling or offset (case 5 above).
    • Either or both of spatial and temporal may be present if spatiotemporal and constant are absent. If temporal is absent, then spatial defines a spatial pattern that's present at all times (case 3 above). If spatial is absent, then temporal defines the time dependence of a spatially-uniform pattern (case 4 above). If both spatial and temporal are present, they are multiplied together to define a spatiotemporal field which is a component of the sum in case 2 above.
  • Spatial axes in the spatiotemporal and spatial .nc files must be identical in name, dimension, and order to those of fieldname in filename. The coordinate values are not used or checked.
  • The time axis in the spatiotemporal and temporal .nc files must have the same name as that for fieldname in filename, but it need not have the same values.
  • If a component contains a time axis (i.e. if spatiotemporal or temporal are present) the component is omitted from the sum at all forcing time points except
    • when the forcing time with the year replaced by the experiment year coincides with a time in the .nc file's time axis, if "calendar": "experiment" - so in this case the time axis typically has the same resolution as the forcing but has dates that may cover any or all experiment year(s)
    • when the forcing time coincides with a time in the time axis, if "calendar": "forcing" - so in this case the time axis typically has the same resolution and year range as the forcing (but may differ in duration)
  • Any of the .nc files may contain complex data, but only the real part of the sum of the components is used in scaling(x,y,t) and offset(x,y,t).
  • Any {{year}} and {{year+1}} components in file paths are replaced by year and year+1 according to calendar for that component.

Comments

  • I'm now proposing using arrays of dictionaries, since this avoids the possibility of having differing length arrays of spatial and temporal components.
  • The 5 cases are now explicit in the forcing.json notation via the presence/absence of spatiotemporal, spatial, temporal and constant, rather than implicit in the presence or otherwise of a time axis in the .nc files. This seems much less error-prone and also provides better documentation of the configuration setup. However there is some redundancy in these (e.g. spatial-only covers the constant case), so perhaps it would be better to streamline the options for ease of implementation?
  • This forcing.json notation is a bit more general than is needed to cover the objectives listed above - e.g. each component has its own calendar keyword. This allows e.g. damping out a storm in each RYF year while also applying an interannual perturbation.
  • Although this forcing.json notation is not backward-compatible with the previous one, it is more general, and the use of dictionaries makes it extendable to cover other possibilities in future by implementing new keywords.
  • This will only support perturbations that are a linear function of the original field f at each grid point and timestep. A more general (i.e. nonlinear) function of f would require something else, e.g. a function parser as Nic suggested.

@rmholmes

Thanks @aekiss for the comprehensive plan. It looks good to me, and I like the proposed dictionary format for forcing.json.

One question on point 6 of semantics: What do you mean in the "calendar": "experiment" case by "the time axis typically has the same resolution as the model"? Does this mean for the calendar we have to define the time axis in the .nc file at every model time step? I would have thought that we only need to define these for each forcing time step?

@aekiss
Contributor Author

aekiss commented Mar 10, 2021

Good catch @rmholmes, that also occurred to me after posting. I agree it would be much better to have it on the forcing timestep in the "calendar": "experiment" case, otherwise it won't work if the model timestep is changed. The difficulty is that the model and forcing calendars can differ by an arbitrary number of years. I'll edit the post above to try to fix it.

@aekiss
Contributor Author

aekiss commented Mar 10, 2021

ok how does that look now @rmholmes ?

@rmholmes

rmholmes commented Mar 11, 2021

Thanks @aekiss - looks good.

Is there any chance that leap years/days will cause problems with "calendar": "experiment"? I.e., the time vector we use in the "temporal" .nc file may have to be defined for years/days that match the RYF year but actually don't exist in a real calendar? I suspect this is not an issue as things work fine now with the current experiment time. However, it's probably something that we need to be aware of when creating the forcing perturbation files.

@aekiss
Contributor Author

aekiss commented Mar 11, 2021

Hmm, good question @rmholmes.

As I understand it, the forcing and experiment calendars can only differ in the year number - see below (in fact I've assumed this in my description of the "calendar": "experiment" case), so if the forcing dataset has no leap years (as is the case with the standard RYF years), neither will the experiment calendar.
For example, despite being a leap year in a normal calendar, 1952 has no 29 Feb in /g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091/output204/ocean/ocean_daily.nc. So the RYF runs effectively use a noleap calendar.

How do leap years work for your repeat-decades Ryan?

@nichannah if you can clarify or correct me, please do!

@mauricehuguenin

mauricehuguenin commented Mar 11, 2021

@aekiss Our repeat decade forcing (RDF) is 1962-1971. Both 1964 and 1968 are leap years.
We only output monthly and annual data, so I do not know if the experiment calendar in our RDF is running with a leap year calendar as it should. Is there a way to check that?

@rmholmes

Thanks @mauricehuguenin. I suspect the RDF experiment time is like the RYF one - the leap years appear when they appear in the forcing (1964 and 1968), regardless of the experiment year. When you run the IAF off the end I think you do it like an ensemble OMIP experiment with a fresh restart.

@mauricehuguenin

mauricehuguenin commented Mar 11, 2021

Yes, that is what I do Ryan. I reset the experiment date back to 1972 when I branch off my IAF run over 1972-2017 from the control.

I just checked in ocean_solo.res and can confirm that our RDF experiment runs with a Gregorian calendar (that has leap years).

@aekiss
Contributor Author

aekiss commented Mar 11, 2021

So does it look like this? If so, it's not a Gregorian calendar...

forcing 62 63 64 65 66 67 68 69 70 71 62 63 64 65 66 67 68 69 70 71
model 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81

model years 64 and 68 are correct leap years
model years 74 and 78 are model leap years that should be non-leap years in the Gregorian calendar
model years 72, 76 and 80 are model non-leap years that should be leap years in the Gregorian calendar

@mauricehuguenin

How would I be able to check this? The ocean_solo.res file shows
3 (Calendar: no_calendar=0, thirty_day_months=1, julian=2, gregorian=3, noleap=4)

@aekiss
Contributor Author

aekiss commented Mar 12, 2021

grep cur_.*_date output*/atmosphere/log/matmxx.pe00000.log | grep -- -02-29T00
will show you all the forcing and model leap years

@mauricehuguenin

mauricehuguenin commented Mar 12, 2021

Thanks, unfortunately the grep command did not give me an output but the log files show this:

{ "cur_exp-datetime" :  "1972-02-28T21:00:00" }
{ "cur_forcing-datetime" : "1962-02-28T21:00:00" }

followed by

{ "cur_exp-datetime" :  "1972-02-29T00:00:00" }
{ "cur_forcing-datetime" : "1962-02-28T00:00:00" }

So the experiment calendar has correct leap years. During these days, the forcing for the 28th of February is applied twice.
Our spin-up is 2000 years so this should not be an issue.

@nichannah
Contributor

Yes, that should work. However I think I'd like to change it so that the comment is necessary. The reason for this is that I think it's good to be able to detect typos.

nichannah added a commit to COSIMA/1deg_jra55_iaf that referenced this issue Jun 7, 2021
nichannah added a commit to COSIMA/1deg_jra55_ryf that referenced this issue Jun 7, 2021
nichannah added a commit to COSIMA/025deg_jra55_iaf that referenced this issue Jun 7, 2021
nichannah added a commit to COSIMA/025deg_jra55_ryf that referenced this issue Jun 7, 2021
nichannah added a commit to COSIMA/01deg_jra55_iaf that referenced this issue Jun 7, 2021
nichannah added a commit to COSIMA/01deg_jra55_ryf that referenced this issue Jun 7, 2021
nichannah added a commit to COSIMA/access-om2 that referenced this issue Jun 7, 2021
@aidanheerdegen
Contributor

Is there any way to specify that the time dimension on the perturbation can be MODULO like the CORE-NYF forcings?

         double TIME(TIME) ;
                TIME:units = "days since 1900-01-01 00:00:00" ;
                TIME:axis = "T" ;
                TIME:time_origin = "1-JAN-1900" ;
                TIME:calendar = "NOLEAP" ;
                TIME:modulo = " " ;

That way if users have a 10 year cycle of forcing they don't have to create a file with the same data repeating, and don't have to know how long the run has to be a priori.
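A modulo time axis could be honoured by wrapping the experiment time into the perturbation cycle before the record lookup; a sketch (hypothetical helper, with time in days and a sorted axis starting at 0):

```python
def modulo_index(model_time, axis_times, cycle_length):
    """Index of the perturbation record in effect at `model_time`.

    The model time is wrapped into [0, cycle_length) and matched to the
    latest axis entry not after it, so e.g. a 10-year perturbation file
    repeats for as long as the run continues, without knowing the run
    length a priori.
    """
    wrapped = model_time % cycle_length
    idx = 0
    for i, t in enumerate(axis_times):
        if t <= wrapped:
            idx = i
    return idx

# Three yearly records in a 3-year (1095-day, noleap) cycle:
axis = [0.0, 365.0, 730.0]
```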

@aidanheerdegen
Contributor

ping @nichannah @aekiss

@russfiedler
Contributor

This was originally a feature for the mighty Ferret. The time unit was assumed to be a year. So setting modulo=10 would be the go. It's not a CF thing.

In netCDF files, the modulo attribute is specified as follows:

  1. Specify the modulo length of the axis with the attribute modulo = <length>, e.g. var:modulo=100;

  2. The modulo attribute from previous netCDF files remains unchanged: modulo = " " sets a modulo axis of standard length, which is 360 degrees for a longitude axis, or one year for a climatological time axis.

  3. The attribute value modulo = "FALSE", modulo = "NO", modulo="OFF" tells Ferret that the axis is not to be treated as modulo

@mpudig

mpudig commented Aug 5, 2021

Hi @nichannah,

I have come across a potential bug when using the "perturbations" feature in forcing.json files.

I have run an experiment where using the "perturbations" feature I perform a constant in space and time perturbation to the rlds field by 8 (integer) and the tas field by 1.5 (float).

The results were surprising, in particular in the evolution of global average temperature. To investigate, I performed the equivalent experiment where I manually added the equivalent perturbations to the tas and rlds nc files, and there was indeed a difference.

ave_temp_compare

When using "perturbations", the warming is less than when I manually perturbed the forcing files, even though they should be exactly the same.

Interestingly, I have also run experiments with the equivalent doubling (16 to rlds and 8 to tas), using both "perturbations" and manually perturbed forcing files, and the average temperature evolution was the same for both (i.e., "perturbations" and manually perturbed forcing files gave the same results).

Might this divergence be due to some unwanted rounding of the '1.5' float value? What do you think?

The runs (i.e. 8 to rlds and 1.5 to tas) can be found at /home/561/mp2135/access-om2/1deg_jra55_ryf_rcp45warm_x1_spinup500_test (which uses "perturbations") and /home/561/mp2135/access-om2/1deg_jra55_ryf_rcp45warm_x1_spinup500_test1 (which uses the manually changed forcing files).

@rmholmes

rmholmes commented Aug 5, 2021

Just to add to Matt's comment here, I performed simulations with the same config files but from cold starts (rather than starting from a spun-up control simulation) and found the same differences. Those runs are at
/scratch/e14/rmh561/access-om2/archive/1deg_jra55v14_ryf_warmtest_file/
and
/scratch/e14/rmh561/access-om2/archive/1deg_jra55v14_ryf_warmtest_pert/

Just to check whether there is some rounding issue going on, I'm currently running an additional run with 1.5 changed to 1 to see if it makes a difference.

@rmholmes

rmholmes commented Aug 5, 2021

Yes this must be some rounding/type error. I get exactly the same simulation result whether I use "value": 1.5 or "value": 1 as my constant offset for air temperature.

@nichannah can you track this down?

@nichannah
Contributor

nichannah commented Aug 7, 2021

Thank you @mpudig and @rmholmes. Yes you're correct, it was rounding - also the tests were only doing scale/offset using integers. I'll push a fix and let you know when it is there.
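For the record, a minimal illustration of the kind of truncation suspected here (the actual fix was in YATM's Fortran, so this Python is only an analogy):

```python
# A 1.5 K offset stored into an integer variable silently becomes 1 K,
# which is why "value": 1.5 and "value": 1 gave identical runs, while
# integer-valued perturbations (e.g. 8 and 16) were unaffected.
requested_offset = 1.5
applied_offset = float(int(requested_offset))  # int() truncates toward zero
```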

@nichannah
Contributor

I haven't yet had time to update all of the experiments, however I've pushed the fixes to YATM and tested. You can find the new executable here: /g/data/ik11/inputs/access-om2/bin/yatm_a28ed21.exe

@mpudig

mpudig commented Aug 10, 2021

Great! Thanks @nichannah.

nichannah added a commit to nichannah/1deg_era5 that referenced this issue Sep 15, 2021
@mauricehuguenin

mauricehuguenin commented Oct 8, 2021

Hi all,
I would like to use the additive forcing function with the temporal and spatial options, i.e. using

    {
      "filename": "INPUT/RYF.uas.1990_1991.nc",
      "fieldname": "uas",
      "cname": "uwnd_ai",
      "perturbations": [
        {
          "type": "separable",
          "dimension": ["temporal", "spatial"],
          "value": ["INPUT/N34_time_series_yr1.nc", "INPUT/N34_regressed_anom_field_uas.nc"],
          "calendar": "forcing",
          "comment": "N34(t) x EOF1(x,y)"
        },
      ]
    },

The goal is to add a time-varying perturbation to the RYF fields that consists of the N34(t) index multiplied with a constant spatial pattern uas(x,y). To include the new input files I have added

      input:
            - /g/data/e14/mv7494/JRA55-do-1-5-0

to the config.yaml file (line 26) and I have also added to config.yaml the storage flags:

qsub_flags: -lother=hyperthread -W umask=027 -l storage=gdata/hh5+gdata/ik11+scratch/e14+gdata/ua8+gdata/e14

Unfortunately the run aborts with the error payu: error: Input directory /g/data/e14/mv7494/JRA55-do-1-5-0 not found; aborting, even though the path and files do exist. I am using the latest 025deg_jra55_ryf configs from the master branch. Ryan suggested that the model's failure to find the input directories might come from payu.

@aidanheerdegen
Contributor

Storage flags added to qsub_flags are ignored. payu automatically determines the required storage flags by examining paths in the manifests.

Just run

payu setup

to populate the manifests, then payu sweep to delete the work directory and run as normal.

@mauricehuguenin

Thank you Aidan. It looks like an attribute is missing in one of the input files because the model aborts with:

Error - from NetCDF library
ncvar inquire varid: time_bnds
NetCDF: Variable not found

My perturbation time series /g/data/e14/mv7494/JRA55-do-1-5-0/N34_time_series_uas_yr1.nc has a time_bnds attribute but the spatial pattern /g/data/e14/mv7494/JRA55-do-1-5-0/N34_regressed_anom_field_uas.nc is only 2D and does not have such an attribute.

@nichannah Do you still have the .nc files you used in #30 (comment)? That could help me find the specific structure the input files need to have.

@rmholmes

@mauricehuguenin I successfully got a temporally-constant, spatially-varying perturbation working at /home/561/rmh561/access-om2/1deg_jra55v14_ryf_rcpWCWCpactest. Maybe check out the script ACCESS-OM2_OceanHeat_Make_Scaling.ipynb in that folder, and the spatial perturbation file at /g/data/e14/rmh561/access-om2/input/RCP/scaling/RYF.rlds.1990_1991_scalingPAC.nc. Worth a look - but this was an older version of the code without the separable option, so I'm not sure if it will help you.

@aidanheerdegen
Contributor

@mauricehuguenin I believe the netCDF library is complaining because your variable has a bounds attribute which references a non-existent variable time_bnds.

If you include a bounds attribute you have to have the referenced variable present in the file.
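If the temporal file needs a proper time_bnds variable rather than just dropping the bounds attribute, the bounds values can be computed from the time midpoints. A minimal numpy sketch, assuming an evenly spaced time axis (hypothetical values, not taken from the actual file):

```python
import numpy as np

# Hypothetical daily time axis (days since a reference date),
# stored as cell midpoints: 0.5, 1.5, 2.5, ...
time = np.arange(0.5, 5.5)

# CF-style bounds for each cell: [t - dt/2, t + dt/2], shape (ntime, 2).
dt = np.diff(time).mean()
time_bnds = np.column_stack([time - dt / 2, time + dt / 2])
```

The resulting array would then be written into the file as a time_bnds variable (e.g. with netCDF4-python or NCO), with the time variable's bounds attribute set to "time_bnds".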

@mauricehuguenin

mauricehuguenin commented Oct 13, 2021

I found the issue that caused the model to crash: at some point during my troubleshooting I copied over an atmosphere/forcing.json file that Ryan used, and that file listed rhuss as an input field instead of huss, which the current atmosphere/forcing.json uses.

My multi-year simulation with #30 (comment) has now been successful (although I have not yet looked at the output).

@mauricehuguenin

Hi all, it has been a while since my last comment. I was focusing on a different project.

My goal is to run a multi-year simulation with a ramp-up (year 1) and ramp-down (year 2) of spatially constant, time-varying anomalies. In atmosphere/forcing.json I use

"perturbations": [
        {
          "type": "separable",
          "dimension": ["temporal", "spatial"],
          "value": ["INPUT/RYF.rsds.1990_1991_time_series_mean_anoms_EN.nc", "INPUT/RYF.rsds.1990_1991_pattern_mean_anoms_EN.nc"],
          "calendar": "experiment",
          "comment": "N34_time_series(t) x spatial_pattern(x,y)"
        },
      ]

When using "calendar": "experiment", the simulation does not include the additional anomalies and the output is identical to a control run. Do you know what might be the issue here? We know from @mpudig that the "calendar": "experiment" option works when only the "dimension": "temporal" option is used.

When using "calendar": "forcing", the simulation successfully includes the additional anomalies but, as expected, cycles through the first year instead of progressing through the anomalous forcing in year 1 followed by year 2.

I run my simulation out of /home/561/mv7494/access-om2/025deg_jra55_ryf_ENFull.
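The two-year ramp-up/ramp-down amplitude described above can be sketched with numpy before writing it to the temporal netCDF file (a hypothetical monthly series, not the actual contents of RYF.rsds.1990_1991_time_series_mean_anoms_EN.nc):

```python
import numpy as np

# Hypothetical monthly amplitude: linear ramp from 0 to 1 over
# year 1, then back down to 0 over year 2 (24 monthly values).
up = np.linspace(0.0, 1.0, 12)
down = np.linspace(1.0, 0.0, 12)
amplitude = np.concatenate([up, down])
```

With "calendar": "experiment" this series is meant to span the whole experiment rather than being cycled each forcing year.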

@rmholmes

@mauricehuguenin before continuing, and just to make sure: have you first checked that your files work with "calendar": "experiment" using just the temporal time series? E.g. does the following work?

"perturbations": [
        {
          "type": "offset",
          "dimension": "temporal",
          "value": "INPUT/RYF.rsds.1990_1991_time_series_mean_anoms_EN.nc",
          "calendar": "experiment",
          "comment": "N34_time_series(t)"
        },
      ]
