Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PRISM Data #33

Open
COBrogan opened this issue May 15, 2024 · 12 comments
Open

PRISM Data #33

COBrogan opened this issue May 15, 2024 · 12 comments
Assignees

Comments

@COBrogan
Copy link
Collaborator

COBrogan commented May 15, 2024

Maintained by Oregon State, the PRSIM dataset offers daily, monthly, and normals datasets for precipitation (ppt), mean temp, max temp, min temp, mean dew point, and min/max vapor pressure. From their homepage, "The PRISM Climate Group gathers climate observations from a wide range of monitoring networks, applies sophisticated quality control measures, and develops spatial climate datasets to reveal short- and long-term climate patterns. The resulting datasets incorporate a variety of modeling techniques and are available at multiple spatial/temporal resolutions, covering the period from 1895 to the present." See here for additional documentation and comparison to daymet

PRISM data is available at a 4km resolution. 800 m resolution raster files exist, but we need to purchase these rasters, which will have limits on distribution. For current pricing, see this page on the PRISM website.

PRISM data can be downloaded from the website, through an anonymous FTP site, or through their webservices. The web services is the most secure means to access this data programmatically, but their server traffic limits downloads of the same dataset to just twice so caution is advised when testing. Instructions for using this service can be found here. As noted on the Explorer page, PRISM considers the date downloaded to be the day that ends on the selected date:

"Note: Rather than calculating days based on the 24-hour period ending at midnight local time, PRISM defines a "day" as the 24 hours ending at 12:00 Greenwich Mean Time (GMT, or 7:00am Eastern Standard Time). This means that PRISM data for May 26, for example, actually refers to the 24 hours ending at 7:00am EST on May 26"

Per the PRISM website, data from the past 6 months is provisional. This means they are subject to change as reporting stations finalize their data. Per PRISM,

"The information [for provisional data] is based on whatever data is available from as many of the station networks and data sources as possible, so it changes as new data becomes available or as data quality is improved. The datasets are modeled using climatologically-aided interpolation (CAI), which uses the long-term average pattern (i.e., the 30-year normals) as first-guess of the spatial pattern of climatic conditions for a given month or day."

With PRISM, we need to:

  • Develop a script to download daily data for a given month or year
  • Project and clip rasters to a given extent/mask vector
  • Load into our database as a tiled raster

Code/Examples

Import PRISM using meta-model

  • One day:
# set needed environment vars
MODEL_ROOT=/backup/meteorology/
MODEL_BIN=$MODEL_ROOT
SCRIPT_DIR=/opt/model/model_meteorology/sh
export MODEL_ROOT MODEL_BIN SCRIPT_DIR
# do the import
/opt/model/meta_model/run_model raster_met "2021-01-03" PRISM auto met`
  • Force overwrite in database: DB_FORCE_OVERWRITE=1;export DB_FORCE_OVERWRITE; /opt/model/meta_model/run_model raster_met "2021-01-07" PRISM auto met
  • A whole year:
metsrc="PRISM"
yr=2023
doy=`date -d "${yr}-12-31" +%j`
i=0
while [ $i -lt $doy ]; do
  thisdate=`date -d "${yr}-01-01 +$i days" +%Y-%m-%d`
  echo "Running: sbatch /opt/model/meta_model/run_model raster_met \"$thisdate\" $metsrc auto met"
  sbatch /opt/model/meta_model/run_model raster_met "$thisdate" $metsrc auto met
  i=$((i + 1))
done

Inventory PRISM in postgres

  • Note: when using the raster summary st_summarystats function, you can access things like 'min', 'max', and 'mean' with "." (dot) notation. However, the dot notation will only work if you call it from parentheses around the st_summary statement itself, so:
    • (ST_SummaryStatsAgg(met.rast, 1, TRUE)).mean works
    • ST_SummaryStatsAgg(met.rast, 1, TRUE).mean fails with an error ERROR: syntax error at or near "."
select extract(year from to_timestamp(met.tstime)) as year,
  min(to_timestamp(met.tstime)) as start_date,
  max(to_timestamp(met.tstime)) as end_date,
  round(0.0393701 * sum(precip_in)::numeric,1) as precip_in
from (
  select met.tstime,
    (ST_SummaryStatsAgg(met.rast, 1, TRUE)).mean as precip_in
  from dh_feature as mcov
  left outer join dh_variabledefinition as v
  on (
    v.varkey = 'prism_mod_daily'
  )
  left outer join dh_timeseries_weather as met
  on (
    mcov.hydroid = met.featureid and met.varid = v.hydroid
    and met.entity_type = 'dh_feature'
  )
  where mcov.hydrocode = 'cbp6_met_coverage'
  group by met.tstime
) as met
group by extract(year from to_timestamp(met.tstime))
order by extract(year from to_timestamp(met.tstime));

Verify PRISM in postgres

select met.featureid, to_timestamp(met.tstime) as obs_date,
  extract(month from to_timestamp(met.tstime)) as mo,
  round((ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 1, TRUE)).mean::numeric,5) as precip_kgm3,
  round(0.0393701 * (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 1, TRUE)).mean::numeric,5) as precip_in,
  round(0.0393701 * (ST_summarystats(st_clip(met.rast, fgeo.dh_geofield_geom), 1, TRUE)).max::numeric,5) as precip_max_in
from dh_feature as f
left outer join field_data_dh_geofield as fgeo
on (
  fgeo.entity_id = f.hydroid
  and fgeo.entity_type = 'dh_feature'
)
left outer join dh_variabledefinition as v
on (
  v.varkey = 'prism_mod_daily'
)
left outer join dh_feature as mcov 
on (
  mcov.hydrocode = 'cbp6_met_coverage'
)
left outer join dh_timeseries_weather as met
on (
  mcov.hydroid = met.featureid and met.varid = v.hydroid 
  and extract(year from to_timestamp(met.tstime)) = 2021
)
where f.hydrocode = 'usgs_ws_01634000'
order by met.tstime;
  • After running tests on d.alpha which imported the 1st-5th (6th failed due to me downloading it a bunch of times and getting negged by PRISM).
  • Returns:
    • note: zero days in early part may be due to bad downloads/imports
 featureid |        obs_date        | mo |      precip_kgm3      |       precip_in
-----------+------------------------+----+-----------------------+------------------------
    616654 | 2021-01-01 19:00:00-05 |  1 |    21.897387758890787 |     0.8621023458063062
    616654 | 2021-01-02 19:00:00-05 |  1 |    1.5953182965517043 |    0.06280784086707025
    616654 | 2021-01-03 19:00:00-05 |  1 |    1.2646758019924165 |   0.049790412792021635
    616654 | 2021-01-04 19:00:00-05 |  1 |  0.014757499457725013 |  0.0005810042294005795
    616654 | 2021-01-06 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-07 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-08 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-09 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-10 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-11 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-12 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-13 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-14 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-15 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-16 19:00:00-05 |  1 |                     0 |                      0
    616654 | 2021-01-17 19:00:00-05 |  1 |    0.2218199946578049 |   0.008733075371677244
    616654 | 2021-01-18 19:00:00-05 |  1 |  0.006076666540563261 | 0.00023923896936862965
    616654 | 2021-01-19 19:00:00-05 |  1 |  0.006076666540563261 | 0.00023923896936862965
    616654 | 2021-01-20 19:00:00-05 |  1 |  0.006076666540563261 | 0.00023923896936862965
    616654 | 2021-01-21 19:00:00-05 |  1 |  0.006076666540563261 | 0.00023923896936862965
    616654 | 2021-01-22 19:00:00-05 |  1 |  0.006076666540563261 | 0.00023923896936862965
    616654 | 2021-01-23 19:00:00-05 |  1 |  0.006076666540563261 | 0.00023923896936862965
    616654 | 2021-01-24 19:00:00-05 |  1 |    11.850017201900481 |     0.4665363622405421
    616654 | 2021-01-25 19:00:00-05 |  1 |    11.850017201900481 |     0.4665363622405421

@COBrogan
Copy link
Collaborator Author

COBrogan commented May 15, 2024

Draft shell script to download prism data, project it to 4326, and clip it to a user-input mask vector. Inputs should be year and mask vector filepath. See prism branch.

@rburghol
Copy link
Contributor

@COBrogan thanks for pushing your script up there. I’m going to do some parallel work on NLDAS2 today. I will feedback if I have anything that will cross over into PRISM.

@COBrogan
Copy link
Collaborator Author

Thanks @rburghol . It was easy enough to get the raster into the database. Next steps are:

  • Bring other identifying info into database with the raster i.e. the other fields in dh_timeseries_weather. For now, I'm going over the scripts in dh_weather (like this one)
  • Get mean daily data for a month for each of the river segments to check for completeness
  • Run script to import a year's worth of data
  • QC the year's worth of data
  • Run script for all relevant years (1984 - 2023)

@rburghol
Copy link
Contributor

@COBrogan awesome, check out this issue I just added with respect to the ins and outs of consistent date and time handling #35

@rburghol
Copy link
Contributor

rburghol commented May 16, 2024

@COBrogan - wuestion, is $maskExtent still a file, or is that a WKT string?

@COBrogan
Copy link
Collaborator Author

I forsaw us passing in a file (maybe csv) containing the WKT string. I think -cutline is pretty flexible so we can pass a number of different file types here

@rburghol
Copy link
Contributor

Note - the final raster size after clipping with your file is 426k, which is down from 9 megs from the original tif (after reprojecting). This will make a big difference in data storage.

@COBrogan
Copy link
Collaborator Author

@rburghol that's good news! The clipping is fairly fast too. I've update the script to start getting this data into the database. If we have time tomorrow or Monday, I'd like to sync up on this download/insert script. I have the right idea, but my current code doesn't feel like the approach I should be taking. I've noticed that dh_update_timeseries_weather can probably do some of this work for me, but I thought we were trying to get this done only in shell. Or was it only the downloads we were concerned with?

Either way, I'm getting closer. I also want to learn how to call dh_handletimestamp. I'm guessing this is as easy as calling the module, but I may need directions on where to find that.

@rburghol
Copy link
Contributor

I think that we need to take a crack at developing timestamp handling via shell as it will reduce complexity and enhance modularity for batch automation. We can verify it against the PHP and in the end if the shell method can’t hang, we can isolate the function from PHP and call it from the shell. We can compare notes tomorrow if you have like 30 minutes.

@COBrogan
Copy link
Collaborator Author

I'm updating the scripts such that the download + processing are handled by a function and the import to the database is handled by a separate function. My intention is that the database import function, addRasterToDBase2.sh, is generic enough to be used by all of our data import processes.

@durelles
Copy link

durelles commented May 21, 2024 via email

@rburghol
Copy link
Contributor

@COBrogan This seems like a really great idea. Thanks for modularising and I can't wait to take a look at it!

@rburghol rburghol reopened this Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants