Precip Fidelity Project Overview #32

Open · 13 of 27 tasks

rburghol opened this issue Dec 13, 2023 · 4 comments
rburghol commented Dec 13, 2023

Overview

Project Brief Goals and Outline

DEQ needs a process for identifying the times and locations of precipitation input errors, using that information to rank competing precipitation inputs by accuracy, and creating an aggregate dataset from the best available data, both spatially and temporally. This analytical process should integrate into existing DEQ workflows, and Virginia Tech should work with DEQ to design updated workflows where necessary to support these new analytical processes.

Tasks

  • DEQ:
  • Both
    • geo work flows:
      • Simple lm (monthly best model)
      • Weekly min(error) lm
      • Storm flow separation weekly lm
      • Re-sample to 1km x 1km ONLY
    • amalgamate workflows
      • SQL to find null values and develop a disaggregation scheme from nearest neighbors (a raster workflow: nullify suspect cells, then fill from nearest neighbors iteratively until every cell has a value)
      • Data model for efficient raster local storage
      • Efficient knitting together of rasters for a full, optimized timeseries.
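As a minimal sketch of the "find null values" step: assuming the precip rasters live in a PostGIS table (the table `met_rasters(obs_date, rast)` and the database name below are placeholders, not the project's actual schema), a query comparing pixel counts with and without NODATA excluded flags the timesteps that need the fill scheme:

```shell
#!/bin/bash
# Sketch only: met_rasters(obs_date, rast) and the database name are ASSUMED
# placeholders, not the project's actual schema.
DB="${MET_DB:-drupal.dh03}"

NODATA_SQL=$(cat <<'EOF'
-- For each timestep, compare pixel counts with and without NODATA excluded;
-- any difference means the raster has null cells needing a fill scheme.
SELECT obs_date,
       SUM(ST_Count(rast, 1, FALSE)) - SUM(ST_Count(rast, 1, TRUE)) AS nodata_cells
FROM met_rasters
GROUP BY obs_date
HAVING SUM(ST_Count(rast, 1, FALSE)) > SUM(ST_Count(rast, 1, TRUE))
ORDER BY obs_date;
EOF
)

# Display the query; to run it against a reachable database:
#   echo "$NODATA_SQL" | psql -d "$DB" -At
echo "$NODATA_SQL"
```

The nearest-neighbor fill would then loop over the flagged timesteps, replacing NODATA cells from neighboring cells until this query returns no rows.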

Diagram

```mermaid
graph TD
    Rdata(Raster Data Acquisition) --> Rstore(Raster Data Storage in Database)
    USGS(USGS Flow Timeseries By Basin) --> Prec(Precipitation Event Recognition)
    Rts(Raster Data Timeseries By Basin) --> Prec
    Rstore --> Rts
    Prec --> Pmiss(Identify Missing Precip Events)
    Prec --> Pover(Identify Extra/Overestimated Precip Events)
    Pmiss --> Rmash(Agglomerate Raster Dataset for Better Fit)
    Pover --> Rmash
    Rstore --> Rmash
    Rmash --> Rstore
    Rstore --> Mmet(Model Meteorology Dataset Creation)
    Mmet --> Mrun(Model Runs by Basin)
    Mrun --> Meval(Comparison of Modeled to Observed)
    USGS --> Meval
    Meval --> Rmash
    Mrun --> Rmash
```


Introduction

Precipitation timing and magnitude are the most important factors in water availability, and are therefore a crucial component of hydrologic modeling. Specifically, precipitation timing, magnitude, and intensity determine base-flow recharge and the stability and resiliency of riverine systems during drought periods.

Our current generation of models can provide 6-18 month minimum flow projections in a large number of Virginia watersheds with base flow cycles of the same duration; however, the uncertainty of those predictions, even in the most well-calibrated watersheds, is unknown. The crucial piece of information needed here is an understanding of how well the models represent baseflow dynamics over the multi-year timescale, and that understanding depends heavily on accurate rainfall inputs.

Precipitation spatial variability is high, and while radar-based observations offer high spatial resolution, they depend on correlation with ground-based observations from a very sparse, point-based monitoring network, resulting in substantial precipitation interpolation. While the likelihood of precipitation errors is well understood, no practical method of quantifying them currently exists for geographically large model domains. Given this inability to quantify the errors directly, methods for detecting the signature of precipitation errors in hydrologic model output are needed; however, such methods have not been well established, and therefore our understanding of the extent to which precipitation errors create hydrologic model errors is poor.

As a result, while we possess models that can provide a quantitative estimate of baseflow resiliency to future multi-year droughts, our ability to define the error bounds of those estimates is hampered by our inability to determine whether model errors result from poor model capability (conceptual limitations or faulty calibration) or simply from erroneous precipitation estimates.


Pre-Development Steps

  • Construct SOW
  • Dissect and understand FEWS capabilities:
    • Schedule FEWS desktop demo from ICPRB, with questions:
      • What data sources are available?
      • Process of scripting/developing new algorithms if suitable mashups not available?
    • Schedule DELFT to give us a demo of web app and explanation of why we might use it
    • Where does the FEWS database live?

Project Objectives

Data Model

Proposed model

  • Use USGS full drainage shapes (in vahydro, bundle='watershed', ftype='usgs_full_drainage')
  • Create dh_timeseries_weather records for snippet rasters, attached to the usgs_full_drainage feature, for best fit data sources.
    • use varkey met_hourly_best_fit
  • Assemble a full timeseries raster of the baseline data source (whichever data source proves least erroneous), then patch the rasters with best-fit rasters for each usgs_full_drainage record added to the database.
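One possible shape for the patching step, as a hedged sketch: assuming both the baseline and the best-fit snippets are PostGIS rasters, `ST_Union` with uniontype `'LAST'` overlays later rasters onto earlier ones. The table `baseline_met` and the column names are illustrative assumptions; `dh_timeseries_weather` and the `met_hourly_best_fit` varkey come from the notes above.

```shell
#!/bin/bash
# Sketch of patching the baseline raster with best-fit snippets for one
# timestep. baseline_met and all column names are ASSUMED placeholders;
# dh_timeseries_weather and varkey 'met_hourly_best_fit' are from the
# proposed data model above.
TSTIME="${1:-1609459200}"   # example epoch timestep

PATCH_SQL=$(cat <<EOF
-- ST_Union with uniontype 'LAST' keeps the last raster's pixels wherever
-- inputs overlap, so ordering baseline (priority 0) before best-fit
-- snippets (priority 1) overlays the patches onto the baseline.
SELECT ST_Union(rast, 'LAST' ORDER BY priority) AS patched
FROM (
  SELECT rast, 0 AS priority FROM baseline_met WHERE tstime = ${TSTIME}
  UNION ALL
  SELECT rast, 1 AS priority FROM dh_timeseries_weather
   WHERE varkey = 'met_hourly_best_fit' AND tstime = ${TSTIME}
) q;
EOF
)

# Display the query; to run it against a reachable database:
#   echo "$PATCH_SQL" | psql -At
echo "$PATCH_SQL"
```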

Examples

Merging data from two different precip rasters to get the best dataset.

  • See "Combine Daily and WYTD Time Series" in HARPgroup/vahydro#666
  • See also, NLDAS2 raster to dh: HARPgroup/vahydro#586

Upper James River near Bedford

  • Under-simulation (e <= -25%) of L90 in 1991, 1998, 2001, 2013, 2014, 2018, 2019, 2020, 2022.
  • Over-simulation (e >= +25%) of L90 in 1989, 1996, 2000, 2009.
  • Error in 2019 (a drought year) was -38%.
  • Note: The error in 2005 is reported at +62%; however, the gage did not go into service until October 1st of that year, so the comparison is invalid.
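The pct_error convention in the table below appears to be 100 × (Model − USGS) / USGS, rounded to the nearest whole percent. A quick check against the 2019 row (`pct_error` is a hypothetical helper, not part of the project scripts):

```shell
# pct_error as used in the annual low-flow table:
# 100 * (Model - USGS) / USGS, rounded to the nearest whole percent.
# pct_error() is a hypothetical helper, not part of the project scripts.
pct_error() {
  awk -v usgs="$1" -v model="$2" 'BEGIN { printf "%.0f\n", 100 * (model - usgs) / usgs }'
}

pct_error 1042 650   # 2019 row: prints -38, matching the -38% noted above
```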

Figure 1: Modeled versus observed 90-day low flow at James river USGS 02024752 from 2005-2023.

| Year | USGS | Model | pct_error |
|-----:|-----:|------:|----------:|
| 1984 | 2477 | 2387 | -4 |
| 1985 | 1579 | 1614 | 2 |
| 1986 | 911 | 840 | -8 |
| 1987 | 1198 | 950 | -21 |
| 1988 | 936 | 909 | -3 |
| 1989 | 2569 | 3211 | 25 |
| 1990 | 1178 | 1384 | 17 |
| 1991 | 757 | 495 | -35 |
| 1992 | 947 | 968 | 2 |
| 1993 | 832 | 728 | -12 |
| 1994 | 899 | 944 | 5 |
| 1995 | 1105 | 1148 | 4 |
| 1996 | 2539 | 3200 | 26 |
| 1997 | 857 | 732 | -15 |
| 1998 | 788 | 454 | -42 |
| 1999 | 753 | 659 | -12 |
| 2000 | 1213 | 1569 | 29 |
| 2001 | 654 | 473 | -28 |
| 2002 | 667 | 686 | 3 |
| 2003 | 3571 | 4397 | 23 |
| 2004 | 1807 | 1694 | -6 |
| 2005 | 1207 | 1272 | 5 |
| 2006 | 1488 | 1806 | 21 |
| 2007 | 729 | 595 | -18 |
| 2008 | 654 | 658 | 1 |
| 2009 | 961 | 1482 | 54 |
| 2010 | 903 | 826 | -9 |
| 2011 | 1064 | 983 | -8 |
| 2012 | 863 | 795 | -8 |
| 2013 | 1092 | 822 | -25 |
| 2014 | 1017 | 755 | -26 |
| 2015 | 1477 | 1330 | -10 |
| 2016 | 1441 | 1329 | -8 |
| 2017 | 906 | 793 | -12 |
| 2018 | 1947 | 1347 | -31 |
| 2019 | 1042 | 650 | -38 |
| 2020 | 2461 | 1665 | -32 |
| 2021 | 1175 | 1016 | -14 |
| 2022 | 1142 | 857 | -25 |
| 2023 | 1293 | 1504 | 16 |

Gage Stability

  • A measure of the magnitude of stage-discharge table adjustments made during regular site visits, at varying flow levels and months.
  • See: HARPgroup/vahydro#1004
  • Example: 02025500 James River at Holcomb Rock, VA; gage error at low flows is < 5% (a very good gage)

| Flow Percentile | Jan Error | Jan Flow | Feb Error | Feb Flow | Mar Error | Mar Flow | Apr Error | Apr Flow | May Error | May Flow | Jun Error | Jun Flow | Jul Error | Jul Flow | Aug Error | Aug Flow | Sep Error | Sep Flow | Oct Error | Oct Flow | Nov Error | Nov Flow | Dec Error | Dec Flow |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Less than or equal to 5th percentile | 0.00 | 739 | 0.00 | 932 | 0.00 | 1,115 | 0.03 | 1,591 | 0.00 | 1,207 | 0.00 | 753 | -0.02 | 695 | -0.04 | 540 | 0.00 | 487 | -0.01 | 563 | 0.00 | 591 | -0.04 | 716 |
| 5th-10th percentile | 0.01 | 974 | 0.00 | 1,215 | 0.02 | 1,570 | 0.01 | 1,788 | 0.00 | 1,457 | 0.00 | 927 | -0.01 | 794 | -0.05 | 654 | -0.01 | 640 | -0.01 | 617 | 0.00 | 701 | -0.02 | 817 |
| 10th-25th percentile | 0.01 | 1,292 | 0.00 | 1,663 | 0.01 | 2,148 | 0.01 | 2,119 | 0.00 | 1,802 | 0.00 | 1,139 | -0.01 | 904 | -0.04 | 763 | -0.03 | 699 | -0.02 | 700 | -0.03 | 811 | -0.01 | 1,035 |
| 25th-50th percentile | 0.00 | 2,082 | 0.01 | 2,624 | 0.01 | 3,751 | 0.00 | 3,037 | 0.00 | 2,657 | 0.01 | 1,433 | -0.01 | 1,081 | -0.01 | 926 | -0.02 | 853 | 0.00 | 834 | -0.01 | 1,025 | 0.00 | 1,648 |
| 50th-75th percentile | 0.00 | 3,400 | 0.01 | 3,985 | 0.00 | 5,822 | 0.00 | 5,275 | 0.00 | 4,110 | 0.01 | 2,144 | -0.01 | 1,415 | -0.02 | 1,174 | -0.03 | 1,076 | 0.00 | 1,085 | -0.02 | 1,786 | 0.00 | 2,990 |
| 75th-90th percentile | -0.01 | 6,783 | 0.00 | 6,673 | 0.00 | 10,184 | 0.00 | 9,352 | 0.00 | 6,631 | 0.00 | 4,008 | 0.01 | 2,084 | -0.02 | 1,595 | -0.01 | 1,825 | 0.00 | 2,027 | 0.00 | 3,759 | 0.00 | 5,650 |
| 90th-95th percentile | 0.00 | 11,068 | 0.00 | 11,420 | 0.00 | 17,268 | 0.00 | 15,343 | 0.00 | 10,132 | 0.00 | 8,296 | 0.01 | 3,204 | 0.04 | 2,249 | 0.00 | 4,157 | 0.00 | 3,278 | 0.00 | 6,661 | 0.00 | 9,145 |
| Greater than 95th percentile | -0.01 | 25,636 | 0.00 | 28,250 | 0.00 | 26,608 | 0.00 | 25,518 | 0.00 | 19,196 | 0.00 | 20,075 | 0.00 | 7,067 | 0.02 | 3,980 | 0.00 | 13,023 | 0.00 | 7,593 | 0.00 | 16,608 | 0.00 | 17,200 |
@rburghol rburghol changed the title 2023/12/13 Modeling CBP/Gopal - Precip Fidelity Precip Fidelity Feb 23, 2024
rburghol commented Apr 9, 2024

[two images attached]

COBrogan commented Jun 7, 2024

@rburghol A simple batch script for running multiple download and import scripts. Note that I've hard-coded the start/end times of the datasets; that's probably not necessary given our set-up of the download script.

```shell
# set needed environment vars
MODEL_ROOT=/backup/meteorology/
MODEL_BIN=$MODEL_ROOT
SCRIPT_DIR=/opt/model/model_meteorology/sh
MET_SCRIPT_PATH=$SCRIPT_DIR
export MODEL_ROOT MODEL_BIN SCRIPT_DIR MET_SCRIPT_PATH

# Download and import PRISM and daymet rasters between dates as available:
startYear=1983
endYear=2024
# Set availability dates for ease:
daymetStartAvailable=1980
daymetEndAvailable=2023
PRISMStartAvailable=1895
PRISMEndAvailable=2024

for (( YYYY=$startYear ; YYYY<=$endYear ; YYYY++ )); do
  echo "Running download and import sbatch for $YYYY"

  # daymet download script: only download daymet data if available
  if [ "$YYYY" -ge "$daymetStartAvailable" ] && [ "$YYYY" -le "$daymetEndAvailable" ]; then
    metsrc="daymet"
    doy=$(date -d "${YYYY}-12-31" +%j)   # 365, or 366 in leap years
    # Run a slurm job for the download and import script for each day of the year
    i=0
    while [ "$i" -lt "$doy" ]; do
      thisdate=$(date -d "${YYYY}-01-01 +$i days" +%Y-%m-%d)
      sbatch /opt/model/meta_model/run_model raster_met "$thisdate" $metsrc auto met
      i=$((i + 1))
    done
  fi

  # PRISM download script: only download PRISM data if available
  if [ "$YYYY" -ge "$PRISMStartAvailable" ] && [ "$YYYY" -le "$PRISMEndAvailable" ]; then
    metsrc="PRISM"
    doy=$(date -d "${YYYY}-12-31" +%j)
    # Run a slurm job for the download and import script for each day of the year
    i=0
    while [ "$i" -lt "$doy" ]; do
      thisdate=$(date -d "${YYYY}-01-01 +$i days" +%Y-%m-%d)
      sbatch /opt/model/meta_model/run_model raster_met "$thisdate" $metsrc auto met
      i=$((i + 1))
    done
  fi
done
```

@rburghol rburghol changed the title Precip Fidelity Precip Fidelity Project Overview Jun 12, 2024
rburghol commented Sep 6, 2024

Prism (points) versus NLDAS2 (blocks) Rapidan River:

[image attached]

rburghol commented Sep 10, 2024

Current needs:

  • Debug memory error again.
  • Begin coding an amalgamate workflow
  • REST and detailed analytics
  • Pilot storm flow volume
  • Overlaps version to match CbP
