# DATA-3100
# Adefoluke Shemsu
setwd("~/Documents/Education/Penn/Classes/DATA 310/Week 7")
library(tidyverse)
library(sandwich)
# This question will have you answer the following questions about the R code “Robust-StdErrors.R”. This code runs a
# simulation designed to test the performance of robust standard errors in correcting for heteroskedasticity
# relative to the OLS standard errors, which assume no heteroskedasticity or autocorrelation.
# *Pasting in my HW doc for easier reference*
# Sets the seed
set.seed(12221979)
#
# Parameters for simulation
#
numsims <- 1000
numobs <- 500
alpha <- 1
beta <- 2
lb <- .025
ub <- .975
# Initializes a matrix with numobs rows and three columns (modeled variable, error term, outcome variable)
data <- matrix(-9, nrow = numobs, ncol = 3)
# Sets the value of the modeled variable
data[,1] <- runif(numobs, min = 0, max = 1)
numwithinCIols <- 0
numwithinCIrobust <- 0
for (s in 1:numsims) {
  # Construct the value of the error term (its standard deviation equals the modeled variable)
  data[,2] <- rnorm(numobs, mean = 0, sd = data[,1])
  # Construct the value of the outcome variable
  data[,3] <- alpha + beta*data[,1] + data[,2]
  data.df <- as.data.frame(data)
  # Scatterplot of the outcome against the modeled variable (redrawn on every iteration)
  plot(data.df$V1, data.df$V3)
  temp <- lm(V3 ~ V1, data = data.df)
  # Extracts the slope coefficient on the modeled variable
  betahat <- temp$coefficients[2]
  # Extracts the OLS standard error on the slope coefficient on the modeled variable
  tempvar <- vcov(temp)
  beta_se_ols <- tempvar[2, 2]^(1/2)
  # Constructs the lower bound on the CI on beta using the OLS standard error
  lower_bound_ols <- betahat + beta_se_ols*qt(lb, numobs - 2)
  # Constructs the upper bound on the CI on beta using the OLS standard error
  upper_bound_ols <- betahat + beta_se_ols*qt(ub, numobs - 2)
  # Is the true beta within the confidence interval constructed using OLS standard errors?
  withinCI <- ((lower_bound_ols < beta) & (upper_bound_ols > beta))
  # Keeps running total of whether true beta is within the OLS confidence interval
  numwithinCIols <- numwithinCIols + withinCI
  # Extracts the White (HC1) standard error on the slope coefficient on the modeled variable
  tempvar <- vcovHC(temp, type = "HC1")
  beta_se_robust <- tempvar[2, 2]^(1/2)
  # Constructs the lower bound on the CI on beta using the robust standard error
  lower_bound_robust <- betahat + beta_se_robust*qt(lb, numobs - 2)
  # Constructs the upper bound on the CI on beta using the robust standard error
  upper_bound_robust <- betahat + beta_se_robust*qt(ub, numobs - 2)
  # Is the true beta within the confidence interval constructed using robust standard errors?
  withinCI <- ((lower_bound_robust < beta) & (upper_bound_robust > beta))
  # Keeps running total of whether true beta is within the robust confidence interval
  numwithinCIrobust <- numwithinCIrobust + withinCI
}
print(numwithinCIols / numsims)
print(numwithinCIrobust / numsims)
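#
# A minimal sketch for reading the two coverage rates printed above, assuming the
# simulation draws are independent: a proportion estimated from numsims draws has a
# Monte Carlo standard error of sqrt(p * (1 - p) / numsims). The names nominal_cov,
# mc_sims, and mc_se are illustrative and are not part of the original script.
#
nominal_cov <- 0.95   # nominal coverage implied by lb = .025 and ub = .975
mc_sims <- 1000       # same value as numsims above, repeated so this snippet runs on its own
mc_se <- sqrt(nominal_cov * (1 - nominal_cov) / mc_sims)
# Rough band (plus or minus two Monte Carlo standard errors) within which an
# estimated coverage rate is still consistent with the nominal 95%
print(c(nominal_cov - 2 * mc_se, nominal_cov + 2 * mc_se))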
#• Lines 8-13 of the code define a number of parameters that are used within the simulation. Trace through the code and
# describe what role each of these parameters plays in structuring the simulation.
#• Explain what part of the code creates heteroskedasticity in the unobserved variables when running the regression
# on line 33.
#• Line 49 defines a variable called “withinCI”. Highlight what values this variable can take on and under
# what circumstances it takes each of the potential values.
#• Explain what the values of “numwithinCIols” and “numwithinCIrobust” represent on lines 52 and 67 of the code,
# respectively.
#• Run the simulation. What do you conclude about the appropriateness of using OLS and robust
# standard errors when analyzing these data based on the values of “numwithinCIols” and
# “numwithinCIrobust” that the simulation generates?
#• Edit the code in a way such that the unobservable variables are homoskedastic and rerun the simulation.
# Based on these results, what do you conclude about the best course of action when you are uncertain whether
# the unobserved variables are homoskedastic or heteroskedastic and you have a sizable amount of data?
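#
# A minimal sketch of one way to set up the homoskedastic rerun described in the last
# question, not a definitive answer. Assumptions: coverage_sketch() is a hypothetical
# helper that is not part of the original script, the constant error sd of 0.5 is an
# arbitrary illustrative choice, and numsims defaults to 200 only to keep the run short.
#
library(sandwich)

coverage_sketch <- function(hetero, numsims = 200, numobs = 500, alpha = 1, beta = 2) {
  # Modeled variable, drawn once and held fixed across simulations as in the script above
  x <- runif(numobs, min = 0, max = 1)
  # Critical value for a 95% CI; equals qt(.975, numobs - 2) = -qt(.025, numobs - 2)
  crit <- qt(0.975, numobs - 2)
  hits <- c(ols = 0, robust = 0)
  for (s in seq_len(numsims)) {
    # Heteroskedastic case: error sd equals x. Homoskedastic case: constant sd (assumed 0.5).
    err_sd <- if (hetero) x else rep(0.5, numobs)
    y <- alpha + beta * x + rnorm(numobs, mean = 0, sd = err_sd)
    fit <- lm(y ~ x)
    betahat <- unname(coef(fit)[2])
    se_ols <- sqrt(vcov(fit)[2, 2])
    se_robust <- sqrt(vcovHC(fit, type = "HC1")[2, 2])
    # The true beta lies inside the CI exactly when |betahat - beta| < crit * se
    hits["ols"] <- hits["ols"] + (abs(betahat - beta) < crit * se_ols)
    hits["robust"] <- hits["robust"] + (abs(betahat - beta) < crit * se_robust)
  }
  # Share of simulations in which each CI covered the true beta
  hits / numsims
}

set.seed(12221979)
print(coverage_sketch(hetero = TRUE))
print(coverage_sketch(hetero = FALSE))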