Skip to content

Commit

Permalink
TOPMODEL artifact (#74)
Browse files Browse the repository at this point in the history
  • Loading branch information
kmdeck authored Nov 20, 2024
1 parent bbef42b commit 13ffe2e
Show file tree
Hide file tree
Showing 6 changed files with 222 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ S. Gupta et al 2022, 2024](https://github.com/CliMA/ClimaArtifacts/tree/main/soi
- [Initial conditions for summer DYAMOND simulations](https:////github.com/CliMA/ClimaArtifacts/tree/main/atmos_dyamond_summer)
- [ERA5 Land Forcing Data 2008](https:////github.com/CliMA/ClimaArtifacts/tree/main/era5_land_forcing_data2008)
- [ERA5 Monthly Averages on Pressure Levels 1979-2024](https:////github.com/CliMA/ClimaArtifacts/tree/main/era5_monthly_averages_pressure_levels_1979_2024)
- [TOPMODEL topographic index statistics](https:////github.com/CliMA/ClimaArtifacts/tree/main/topmodel)

# The ultimate guide to ClimaArtifacts

Expand Down
6 changes: 6 additions & 0 deletions topmodel/OutputArtifacts.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
[topmodel]
git-tree-sha1 = "10855ff977cc948a92c7a83915895d27650649a1"

[[topmodel.download]]
sha256 = "03de8f42fedf3836c6792571f04b55fbaca321919ecd761b846286dc56bfa2e8"
url = "https://caltech.box.com/shared/static/bfjxp72vre6b5ncm7jqf9w8hyj4ab4yr.gz"
3 changes: 3 additions & 0 deletions topmodel/Project.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[deps]
ClimaArtifactsHelper = "6ffa2572-8378-4377-82eb-ea11db28b255"
NCDatasets = "85f8d34a-cbdd-5861-8df4-14fed0d494ab"
34 changes: 34 additions & 0 deletions topmodel/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# Topographic Index Statistics for TOPMODEL Runoff
This artifact includes topographic index statistics derived from a high-resolution map of topographic index, found here:
Marthews, T.R., Dadson, S.J., Lehner, B., Abele, S., Gedney, N. (2015). High-resolution global topographic index values. NERC Environmental Information Data Centre. (Dataset). https://doi.org/10.5285/6b0c4358-2bf3-4924-aa8f-793d468b92bev

The raw data (high-resolution map) can be downloaded with this link: https://catalogue.ceh.ac.uk/download/6b0c4358-2bf3-4924-aa8f-793d468b92be?url=https%3A%2F%2Fcatalogue.ceh.ac.uk%2Fdatastore%2Feidchub%2F6b0c4358-2bf3-4924-aa8f-793d468b92be%2F6b0c4358-2bf3-4924-aa8f-793d468b92be.zip

but please note that you will be prompted to make an account with the UK Center for Ecology and Hydrology. The raw data is 12GB in size.

Citation for data:
Marthews, T.R., Dadson, S.J., Lehner, B., Abele, S., Gedney, N. (2015). High-resolution global topographic index values. NERC Environmental Information Data Centre. (Dataset). https://doi.org/10.5285/6b0c4358-2bf3-4924-aa8f-793d468b92be

The procedure for creating the map is described in the following:
Marthews, T. R., Dadson, S. J., Lehner, B., Abele, S., and Gedney, N.: High-resolution global topographic index values for use in large-scale hydrological modelling, Hydrol. Earth Syst. Sci., 19, 91–104, https://doi.org/10.5194/hess-19-91-2015, 2015.

To create the data required by the TOPMODEL runoff scheme, we first download the data. Then, we run
`julia --project create_artifacts.jl nc_path`, where nc_path is the path to the downloaded data on your local machine.
This julia script partitions the Earth's surface into a grid of 1degree x 1degree areas. It computes
certain statistics of the topographic index using each point of the higher resolution map within each
1degree x 1 degree area. It therefore creates a
lower resolution (1degree x 1degree) map of each statistic. The final data product contains
data supplied by Natural Environment Research Council.

For more information on TOPMODEL, please see e.g.: Niu, Guo‐Yue, et al. "A simple TOPMODEL‐based runoff parameterization (SIMTOP) for use in global climate models." Journal of Geophysical Research: Atmospheres 110.D21 (2005).

License:
This resource is available under the Open Government Licence (OGL)

You must always use the following attribution statement to acknowledge the source of the information: "Contains data supplied by Natural Environment Research Council."

You must include any copyright notice identified in the metadata record for the Data on all copies of the Data, publications and reports, including but not limited to, use in presentations to any audience.

You will ensure that citation of any relevant key publications and Digital Object Identifiers identified in the metadata record for the Data are included in full in the reference list of any reports or publications that describe any research in which the Data have been used.

This product, High-resolution global topographic index values, has been created with use of data from the HydroSHEDS database which is © World Wildlife Fund, Inc. (2006-2013) and has been used herein under license. WWF has not evaluated the data as altered and incorporated within, High-resolution global topographic index values, and therefore gives no warranty regarding its accuracy, completeness, currency or suitability for any particular purpose. Portions of the HydroSHEDS database incorporate data which are the intellectual property rights of © USGS (2006-2008), NASA (2000-2005), ESRI (1992-1998), CIAT (2004-2006), UNEP-WCMC (1993), WWF (2004), Commonwealth of Australia (2007), and Her Royal Majesty and the British Crown and are used under license. The HydroSHEDS database and more information are available at http://www.hydrosheds.org.
80 changes: 80 additions & 0 deletions topmodel/create_artifacts.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
using NCDatasets
using Statistics
using ClimaArtifactsHelper

if length(ARGS) < 1
@error("Please provide the local path to the raw data nc file.")
else
nc_path = ARGS[1]
end

outputdir = "topmodel"
if isdir(outputdir)
@warn "$outputdir already exists. Content will end up in the artifact and may be overwritten."
@warn "Abort this calculation, unless you know what you are doing."
else
mkdir(outputdir)
end

include("utils.jl")

"""
create_artifact(nc_path, resolution, attribs, statistics, outfile_path)
This function reads in the raw data at `nc_path` and computes a set of
statistics at a resolution, in degrees, of `resolution`. These statistics are
supplied as a list of functions `statistics` of the form x::AbstractArray -> f(x)::Float.
The computed statistics are then saved as an NetCDF file to `outfile_path`, with the `attribs`
given. The attribs include global attribs for the file, as well as attribs (varname, varlongname, varunits)
for each statistic computed.
For example, if you would like to create a map of the mean topographic index at 1 degree resolution,
you could do:
nc_path = path_to_raw_data.nc
outfile_path = path_to_statistics.nc
resolution = 1.0
statistics = (x -> mean(x),)
var_attribs = ((; varlongname = "The mean topographic index",
varunits = "unitless",
varname = "ti_mean",
),)
attribs = (; global attrib, var_attribs)
create_artifact(nc_path, resolution, attribs, statistics, outfile_path)
"""
function create_artifact(nc_path, resolution, attribs, statistics, outfile_path)
data = NCDataset(nc_path)
lat = data["lat"][:];
lon = data["lon"][:];
ti = data["Band1"][:,:];
close(data)

outdata, outlat, outlon =
regrid_and_compute_statistics(ti, (lon, lat), resolution, statistics);
write_nc_out(outdata, outlat, outlon, attribs, outfile_path)
Base.mv(outfile_path, joinpath(outputdir, outfile_path))
end
# Simulation Resolution
resolution = 1.0 # degree
# What we want to use to aggregate the topographic index values in the resolution x resolution grid boxes
statistics = ((x) -> sum(x .> mean(x)) ./ length(x), (x) -> 1)
outfile_path = "topographic_index_statistics_$(resolution)x$(resolution).nc"
data = NCDataset(nc_path)
global_attrib = copy(data.attrib)
close(data)

var_attribs = ((;
varlongname = "The maximum saturated fraction from TOPMODEL",
varunits = "unitless",
varname = "fmax",
),
(;
varlongname = "Land sea mask estimated from topographic index map",
varunits = "unitless",
varname = "landsea_mask",
),
)

create_artifact(nc_path, resolution, (;global_attrib, var_attribs), statistics, outfile_path)

create_artifact_guided(outputdir; artifact_name = basename(@__DIR__))
98 changes: 98 additions & 0 deletions topmodel/utils.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
"""
regrid_and_compute_statistic(data, (lon, lat), resolution, statistics)
Regrids `data` to the resolution specified by computing statistics of interest
over all data points in the simulation grid boxes of size resolution.
This function takes `data` (size (nlon, nlat)) defined at the points specified by the arrays of `lon` (length nlon) and `lat` (length nlat),
as well as a desired `resolution` in degrees. The resolution should
correspond to grid cells larger than 1kmx1km (the resolution of `data`).
For each coarse grid cell, the statistics are computed over the data in that cell,
ignoring points in lakes or oceans.
They must be a function of the form x::AbstractArray - > f(x)::Float.
"""
function regrid_and_compute_statistics(data, (lon, lat), resolution, statistics)
(lat_min, lat_max) = extrema(lat)
(lon_min, lon_max) = extrema(lon)

lat_count = Int(ceil((lat_max - lat_min) / resolution)) + 1
lon_count = Int(ceil((lon_max - lon_min) / resolution)) + 1

lat_grid = range(stop = lat_max, step = resolution, length = lat_count)
lon_grid = range(stop = lon_max, step = resolution, length = lon_count)

n_stats = length(statistics)
outdata = zeros(Float32, lon_count, lat_count, n_stats)

# Preallocation speeds things up a bit
lat_mask = BitArray(zeros(size(lat)));
lon_mask = BitArray(zeros(size(lon)));

for (lat_id, lat_val) in enumerate(lat_grid)
for (lon_id, lon_val) in enumerate(lon_grid)
@show lat_id/lat_count
lat_mask .= (lat .>= lat_val) .& (lat .< lat_val + resolution)
lon_mask .= (lon .>= lon_val) .& (lon .< lon_val + resolution)

x = data[lon_mask, lat_mask];
# If `x` is Missing, we are over the ocean, and not over the land.
# Here we make a land mask by checking where x is *not* Missing.
x_land_mask = .!ismissing.(x)

if sum(x_land_mask) / prod(size(x)) > 0.5 # count as land
for (i, stat) in enumerate(statistics)
outdata[lon_id, lat_id, i] = stat(x[x_land_mask])
end
else
outdata[lon_id, lat_id, :] .= 0f0 # all set to zero
end

end
end
return outdata, lat_grid, lon_grid
end

"""
write_nc_out(outdata, outlat, outlon, attribs, outfile_path)
Takes the compute statistics array `outdata`, of size nlon, nlat, nstats,
with corresponding values of lon and lat of `outlat`, `outlon`, and the attributes
for the nc data, and saves the data in NetCDF format to outfile_path
"""
function write_nc_out(outdata, outlat, outlon, attribs, outfile_path)
global_attrib = attribs.global_attrib
curr_history = global_attrib["history"]
new_history =
curr_history *
"; Modified by CliMA for use in ClimaLand models (see topmodel folder in ClimaArtifacts for full changes). Contains
data supplied by Natural Environment Research Council."
global_attrib["history"] = new_history

ds = NCDataset(outfile_path, "c",attrib = global_attrib)
defDim(ds, "lon", size(outdata)[1])
defDim(ds, "lat", size(outdata)[2])


la = defVar(ds, "lat", Float32, ("lat",))
lo = defVar(ds, "lon", Float32, ("lon",))
la.attrib["units"] = "degrees_north"
la.attrib["standard_name"] = "latitude"
lo.attrib["standard_name"] = "longitude"
lo.attrib["units"] = "degrees_east"

la[:] = outlat
lo[:] = outlon

for (i,attrib) in enumerate(attribs.var_attribs)
(varlongname, varunits, varname) = attrib
var = defVar(ds, varname, Float32, ("lon", "lat"))
var.attrib["units"] = varunits
var.attrib["standard_name"] = varname
var.attrib["long_name"] = varlongname
var[:, :] = outdata[:,:,i]
end

close(ds)
end

0 comments on commit 13ffe2e

Please sign in to comment.