-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathlab-03.rmarkdown
160 lines (98 loc) · 3.87 KB
/
lab-03.rmarkdown
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
---
title: "lab-03"
author: "Ellya Gholmieh"
format: html
editor: visual
embed-resources: true
theme: hpstr
---
## 1.Read the Data
```{r}
file_path <- "C:/Users/ellya/OneDrive/Desktop/PM566labs/met_all.gz"
met <- data.table::fread(file_path)
```
## 2. Check the dimensions, headers, and footers.
```{r}
dim(met)
```
There are 2,377,343 rows and 30 columns in this data set.
## 3. Take a look at the variables.
```{r}
str(met)
```
As the objectives of this lab are to find the weather station with the highest elevation and to look at patterns in the time series of its wind speed and temperature, the key variables are USAFID, elev (elevation), wind.sp (wind speed), temp (temperature), year, month, day, and hour.
## 4. Take a closer look at the key variables.
```{r}
table(met$USAFID)
table(met$year)
table(met$month)
table(met$day)
table(met$hour)
summary(met$temp)
summary(met$elev)
summary(met$wind.sp)
met[met$elev==9999.0, ] <- NA
summary(met$elev)
```
### At what elevation is the highest weather station?
```{r}
tail(met[order(met$elev), ])
```
The highest weather station is at an elevation of 4113 feet.
### How many missing values are there in the wind.sp variable?
```{r}
summary(met$wind.sp)
```
There are 31582 missing values in the wind.sp variable.
## 5. Check the data against an external data source.
### Where was the location for the coldest temperature readings (-17.2C)? Do these seem reasonable in context?
```{r}
met <- met[met$temp > -40, ]
mettemp <- met[order(met$temp),]
head(mettemp)[,c(1,8:10,24)]
```
The location for the coldest temperature readings corresponds to Yoder, Colorado. These readings do not seem reasonable for August considering for the last 14 years, the minimum temperature in August has not gone below 10 degrees Celsius (<https://www.worldweatheronline.com/v2/weather-averages.aspx?q=80864>)
### Does the range of values for elevation make sense? Why or why not?
```{r}
met <- met[order(met$elev),]
head(met)[,c(1,8:10,24)]
tail(met)[,c(1,8:10,24)]
```
The lowest elevation (-13) corresponds to the Naval Air Facility in Imperial, California. According to AirNav.com, the Naval Air Facility is approximately -13m below sea level. The highest elevation (4113) corresponds to Colorado Mines Peak. However, according to Google maps, the elevation at these coordinates is only 3572m. Thus, the reported maximum elevation is too high and does not make sense.
## 6. Calculate summary statistics
```{r}
elev1 <- met[which(met$elev == max(met$elev, na.rm = TRUE)), ]
summary(elev1)
cor(elev1$temp, elev1$wind.sp, use="complete")
cor(elev1$temp, elev1$hour, use="complete")
cor(elev1$wind.sp, elev1$day, use="complete")
cor(elev1$wind.sp, elev1$hour, use="complete")
cor(elev1$temp, elev1$day, use="complete")
```
## 7. Exploratory graphs
```{r}
hist(met$elev)
hist(met$temp)
hist(met$wind.sp)
library(leaflet)
leaflet(elev1) %>%
addProviderTiles('OpenStreetMap') %>%
addCircles(lat=~lat,lng=~lon, opacity=1, fillOpacity=1, radius=100)
library(lubridate)
elev1$date <- with(elev1, ymd_h(paste(year, month, day, hour, sep= ' ')))
summary(elev1$date)
elev1 <- elev1[order(elev1$date), ]
head(elev1)
```
### Use the plot function to make line graphs of temperature vs. date and wind speed vs. date
```{r}
plot(elev1$date, elev1$temp)
plot(elev1$date, elev1$wind.sp)
```
From the temperature vs. date plot, we can see that temperatures fluctuated throughout the month with a few days of high temperatures from August 19-21. We see that there were also fluctuations in wind speed throughout August, with a highs between August 12 and 19, lows from August 19-24, and highs again between August 25-27.
## 8. Ask Questions
Why is some data inputted incorrectly? Do higher latitude locations have lower temperatures?
```{r}
plot(met$lat, met$temp)
```
Yes, higher latitudes have more variation in their temperatures and have a lower average temperature.