<img src="inst/stationaRy_2x.png", width = 100%>
Get the hourly met data you need from a met station located somewhere on Earth.
Get data from a station in Norway (with a USAF value of 13860, and a WBAN value of 99999). Specify the station_id
as a string in the format [USAF]-[WBAN]
, and, provide beginning and ending years for data collection to startyear
and endyear
, respectively.
library(stationaRy)
met_data <- get_isd_station_data(station_id = "13860-99999",
startyear = 2009,
endyear = 2010)
That's great if you know the USAF
and WBAN
numbers for a particular met station. Most of the time, however, you won't have this info. You can search for station metadata using the get_isd_stations
function. Without providing any arguments, it gives you a data frame containing the entire dataset of stations. Currently, there are 27,446 rows in the dataset. Here are rows 250-255 from the dataset:
library(stationaRy)
get_isd_stations()[250:255,]
Source: local data frame [6 x 16]
usaf wban name country state lat lon elev begin end
1 13220 99999 FORDE-TEFRE NO 61.467 5.917 64 1973 2015
2 13230 99999 BRINGELAND NO 61.393 5.764 327 1984 2015
3 13250 99999 MODALEN II NO 60.833 5.950 114 1973 2008
4 13260 99999 MODALEN III NO 60.850 5.983 125 2008 2015
5 13270 99999 KVAMSKOGEN-JONSHOGDI NO 60.383 5.967 455 2006 2015
6 13280 99999 KVAMSKOGEN NO 60.400 5.917 408 1973 2006
Variables not shown: gmt_offset (dbl), time_zone_id (chr), country_name (chr),
country_code (chr), iso3166_2_subd (chr), fips10_4_subd (chr)
This list can be greatly reduced to isolate the stations of interest. One way to do this is to specify a geographic bounding box using lat/lon values to specify the bounds. Let's try a bounding box located in the west coast of Canada.
library(stationaRy)
get_isd_stations(lower_lat = 49.000,
upper_lat = 49.500,
lower_lon = -123.500,
upper_lon = -123.000)
Source: local data frame [20 x 16]
usaf wban name country state lat lon elev begin end
1 710040 99999 CYPRESS BOWL FREESTYLE CA 49.400 -123.200 969.0 2007 2010
2 710370 99999 POINT ATKINSON CA 49.330 -123.265 35.0 2003 2015
3 710420 99999 DELTA BURNS BOG CA 49.133 -123.000 3.0 2001 2015
4 711120 99999 RICHMOND OPERATION CENTRE CA 49.167 -123.067 16.0 1980 2015
5 712010 99999 VANCOUVER HARBOUR CS BC CA 49.283 -123.117 3.0 1980 2015
6 712013 99999 VANCOUVER INTL CA 49.183 -123.167 3.0 1988 1988
7 712025 99999 WEST VAN CYPRESS CA 49.350 -123.183 161.0 1993 1995
8 712045 99999 SAND HEADS (LS) CA 49.100 -123.300 1.0 1992 1995
9 712090 99999 SANDHEADS CS BC CA 49.100 -123.300 NA 2001 2015
10 712110 99999 HOWE SOUND - PAM ROC CA 49.483 -123.300 5.0 1996 2015
11 715620 99999 CYPRESS BOWL SNOWBOARD CA 49.383 -123.200 1180.0 2010 2010
12 716080 99999 VANCOUVER SEA ISLAND CCG CA 49.183 -123.183 2.1 1982 2015
13 716930 99999 CYPRESS BOWL SOUTH CA 49.383 -123.200 886.0 2007 2014
14 717840 99999 WEST VANCOUVER (AUT) CA 49.350 -123.200 168.0 1995 2015
15 718903 99999 PAM ROCKS CA 49.483 -123.283 0.0 1987 1995
16 718920 99999 VANCOUVER INTL CA 49.194 -123.184 4.3 1955 2015
17 718925 99999 VANCOUVER HARBOUR CA 49.300 -123.117 5.0 1977 1980
18 718926 99999 VANCOUVER HARBOUR & CA 49.300 -123.117 5.0 1979 1979
19 728920 99999 VANCOUVER INTL CA 49.183 -123.167 3.0 1973 1977
20 728925 99999 VANCOUVER HARBOUR & CA 49.300 -123.117 5.0 1976 1977
Variables not shown: gmt_offset (dbl), time_zone_id (chr), country_name (chr),
country_code (chr), iso3166_2_subd (chr), fips10_4_subd (chr)
To put these stations on a viewable map, use a magrittr
or pipeR
pipe, to send the output data frame as input to the map_isd_stations
function:
library(stationaRy)
library(magrittr)
get_isd_stations(lower_lat = 49.000,
upper_lat = 49.500,
lower_lon = -123.500,
upper_lon = -123.000) %>%
map_isd_stations()
<img src="inst/stations_map.png", width = 100%>
Upon inspecting the data frame, you can reduce it to a single station by specifying it's name (or part of its name). In this example, we wish to get data from the CYPRESS BOWL SNOWBOARD
station. This can be done by extending with select_isd_station
and using the name
argument to supply part of the station name.
library(stationaRy)
library(magrittr)
get_isd_stations(lower_lat = 49.000,
upper_lat = 49.500,
lower_lon = -123.500,
upper_lon = -123.000) %>%
select_isd_station(name = "cypress bowl")
Several stations matched. Provide a more specific search term.
Source: local data frame [3 x 16]
usaf wban name country state lat lon elev begin end
1 710040 99999 CYPRESS BOWL FREESTYLE CA 49.400 -123.2 969 2007 2010
2 715620 99999 CYPRESS BOWL SNOWBOARD CA 49.383 -123.2 1180 2010 2010
3 716930 99999 CYPRESS BOWL SOUTH CA 49.383 -123.2 886 2007 2014
Variables not shown: gmt_offset (dbl), time_zone_id (chr), country_name (chr),
country_code (chr), iso3166_2_subd (chr), fips10_4_subd (chr)
[1] NA
As this function yielded a data frame with 3 stations (3 stations leading with CYPRESS BOWL
in their station names), a set of strategies will be used to obtain single station. There are two ways to get a year of CYPRESS BOWL SNOWBOARD
data for 2010
: (1) provide the full name of the station, or (2) use the data frame with multiple stations and specify the row of the target station.
library(stationaRy)
library(magrittr)
cypress_bowl_snowboard_1 <-
get_isd_stations(lower_lat = 49.000,
upper_lat = 49.500,
lower_lon = -123.500,
upper_lon = -123.000) %>%
select_isd_station(name = "cypress bowl snowboard") %>%
get_isd_station_data(startyear = 2010, endyear = 2010)
cypress_bowl_snowboard_2 <-
get_isd_stations(lower_lat = 49.000,
upper_lat = 49.500,
lower_lon = -123.500,
upper_lon = -123.000) %>%
select_isd_station(name = "cypress bowl", number = 2) %>%
get_isd_station_data(startyear = 2010, endyear = 2010)
Both statements get the same met data (the select_isd_station
function simply passes a USAF
/WBAN
string to get_isd_station_data
via the %>%
operator). Here's a bit of that met data:
Source: local data frame [711 x 18]
usaf wban year month day hour minute lat lon elev wd ws ceil_hgt temp
1 715620 99999 2010 1 28 16 0 50.633 -128.117 568 NA NA NA 1.3
2 715620 99999 2010 1 28 22 0 50.633 -128.117 568 NA NA NA 2.5
3 715620 99999 2010 1 29 4 0 50.633 -128.117 568 NA NA NA 3.1
4 715620 99999 2010 1 29 10 0 50.633 -128.117 568 NA NA NA 5.5
5 715620 99999 2010 1 29 16 0 50.633 -128.117 568 NA NA NA 3.0
6 715620 99999 2010 1 29 22 0 50.633 -128.117 568 NA NA NA 1.5
7 715620 99999 2010 1 30 4 0 50.633 -128.117 568 NA NA NA 0.3
8 715620 99999 2010 1 30 10 0 50.633 -128.117 568 NA NA NA 0.9
9 715620 99999 2010 1 30 16 0 50.633 -128.117 568 NA NA NA 0.7
10 715620 99999 2010 1 30 22 0 50.633 -128.117 568 NA NA NA -0.1
.. ... ... ... ... ... ... ... ... ... ... .. .. ... ...
Variables not shown: dew_point (dbl), atmos_pres (dbl), rh (dbl), time (time)
If you'd like to get weather data from a weather station at Tofino, BC, CA, it's possible to search using tofino
as the value for the name
argument in the select_isd_stations
function.
library(stationaRy)
library(magrittr)
get_isd_stations() %>%
select_isd_station(name = "tofino")
Several stations matched. Provide a more specific search term.
Source: local data frame [4 x 16]
usaf wban name country state lat lon elev begin end
1 711060 94234 TOFINO AIRPORT CA BC 49.083 -125.767 24.0 1958 2015
2 711060 99999 TOFINO CA 49.082 -125.773 24.4 2000 2004
3 741060 94234 TOFINO CA 49.083 -125.767 20.0 1973 1977
4 999999 94234 TOFINO CA 49.083 -125.767 24.1 1964 1972
Variables not shown: gmt_offset (dbl), time_zone_id (chr), country_name (chr),
country_code (chr), iso3166_2_subd (chr), fips10_4_subd (chr)
[1] NA
A number of stations with tofino
in its name were returned. If the first station in the returned data frame is desired, it can be selected with a slight modification to the select_isd_station
call (using the number
argument). Then, pipe the output to the get_isd_station_data
function and supply the desired period of retrieval.
library(stationaRy)
library(magrittr)
tofino_airport_2005_2010 <-
get_isd_stations() %>%
select_isd_station(name = "tofino", number = 1) %>%
get_isd_station_data(startyear = 2005, endyear = 2010)
That's gives you 34,877 rows of meteorological data from the Tofino Airport station:
Source: local data frame [34,877 x 18]
usaf wban year month day hour minute lat lon elev wd ws ceil_hgt
1 711060 94234 2005 1 1 7 0 49.083 -125.767 24 NA 0.0 1500
2 711060 94234 2005 1 1 8 0 49.083 -125.767 24 NA 0.0 1500
3 711060 94234 2005 1 1 9 0 49.083 -125.767 24 NA 0.0 1500
4 711060 94234 2005 1 1 10 0 49.083 -125.767 24 NA 0.0 1200
5 711060 94234 2005 1 1 11 0 49.083 -125.767 24 80 1.5 1200
6 711060 94234 2005 1 1 12 0 49.083 -125.767 24 NA 0.0 1650
7 711060 94234 2005 1 1 13 0 49.083 -125.767 24 40 2.6 1650
8 711060 94234 2005 1 1 14 0 49.083 -125.767 24 50 3.6 1650
9 711060 94234 2005 1 1 15 0 49.083 -125.767 24 70 1.0 900
10 711060 94234 2005 1 1 16 0 49.083 -125.767 24 60 1.5 1650
.. ... ... ... ... ... ... ... ... ... ... .. ... ...
Variables not shown: temp (dbl), dew_point (dbl), atmos_pres (dbl), rh (dbl), time
(time)
Of course, dplyr works really well to work toward the data you need. Suppose you'd like to collect several years of met data from a particular station and get only a listing of parameters that meet some criterion. Here's an example of obtaining temperatures above 37 degrees Celsius from a particular station:
library(stationaRy)
library(magrittr)
library(dplyr)
high_temps_at_bergen_point_stn <-
get_isd_stations() %>%
select_isd_station(name = "bergen point") %>%
get_isd_station_data(startyear = 2006, endyear = 2015) %>%
select(time, wd, ws, temp) %>%
filter(temp > 37) %>%
mutate(temp_f = (temp * (9/5)) + 32)
#> Source: local data frame [3 x 5]
#>
#> time wd ws temp temp_f
#> 1 2012-07-18 12:00:00 230 1.5 37.2 98.96
#> 2 2012-07-18 13:00:00 220 2.6 37.8 100.04
#> 3 2012-07-18 14:00:00 230 4.1 37.9 100.22
There's actually a lot of extra met data, and it varies from station to station. These additional categories are denoted 'two-letter + digit' identifiers (e.g., AA1
, GA1
, etc.). More information about these observations can be found in this PDF document.
To find out which categories are available for a station, set the add_data_report
argument of the get_isd_station_data
function to TRUE
. This will provide a data frame with the available additional categories with their counts in the dataset.
library(stationaRy)
library(magrittr)
library(dplyr)
get_isd_stations(startyear = 1970, endyear = 2015,
lower_lat = 49, upper_lat = 58,
lower_lon = -125, upper_lon = -120) %>%
select_isd_station(name = "abbotsford") %>%
get_isd_station_data(startyear = 2015,
endyear = 2015,
add_data_report = TRUE)
#> category total_count
#> 1 AA1 744
#> 2 AC1 817
#> 3 AJ1 5
#> 4 AL1 4
#> 5 AY1 248
#> 6 CB1 27
#> 7 CF1 125
#> 8 CI1 560
#> 9 CT1 406
#> 10 CU1 478
#> 11 ED1 27
#> 12 GA1 778
#> 13 GD1 4514
#> 14 GF1 5664
#> 15 IA1 5
#> 16 KA1 744
#> 17 MA1 5748
#> 18 MD1 736
#> 19 MW1 1609
#> 20 OC1 324
#> 21 ST1 38
Want the rainfall in mm units for a particular month? Here's an example where rainfall amounts (over 6 hour periods) are summed for the month of June in 2015 for Abbotsford, BC, Canada. The AA1
data category has to do with rainfall, and you can be include that data (where available) in the output data frame by using the select_additional_data
argument and specifying which data categories you'd like. The AA1_1
column is the duration in hours when the liquid precipitation was observed, and, the AA1_2
column is quantity of rain in mm. The deft use of functions from the dplyr package makes this whole process less painful.
library(stationaRy)
library(magrittr)
library(dplyr)
rainfall_6h_june2015 <-
get_isd_stations(startyear = 1970, endyear = 2015,
lower_lat = 49, upper_lat = 58,
lower_lon = -125, upper_lon = -120) %>%
select_isd_station(name = "abbotsford") %>%
get_isd_station_data(startyear = 2015,
endyear = 2015,
select_additional_data = "AA1") %>%
filter(month == 6, aa1_1 == 6) %>%
select(aa1_2) %>% sum()
[1] 12.5
Want to try this? Make sure you have R, then, use this:
devtools::install_github('rich-iannone/stationaRy')
or this:
install.packages("stationaRy")
Thanks for installing. :)