#'
#' #### Example Exercise: Grunfeld Investment data
#' This data consists of 10 large US manufacturing firms from 1935 to 1954.
#'
#' **Your Task:** Analyze the many types of panel models.
#'
#' This code is based on the paper: Croissant, Y., Millo, G. (2008). [_Panel Data Econometrics in R: The plm Package_](https://www.jstatsoft.org/index.php/jss/article/view/v027i02/v27i02.pdf), Journal of Statistical Software, 27(2).
#'
#' ## Data
#' #### Variables:
#'
#' * `invest`: Gross investment, defined as additions to plant and equipment plus maintenance and repairs in millions of dollars deflated by the implicit price deflator of producers' durable equipment (base 1947);
#' * `value`: Market value of the firm, defined as the price of common shares at December 31 (base 1947);
#' * `capital`: Stock of plant and equipment, defined as the accumulated sum of net additions to plant and equipment deflated by the implicit price deflator for producers' durable equipment (base 1947);
#' * `firm`: American manufacturing firms;
#' * `year`: Year of data;
#' * `firmcod`: Numeric code that identifies each firm.
#'
#' ## Startup
#' #### Import libraries
library(readxl) #read excel files
library(skimr) #summary statistics
library(foreign) # import data files from other statistical software (e.g. Stata, SPSS)
library(plm) # Lagrange multiplier test and panel models
#'
#' ##### Import dataset
data <- read_excel("Data/Grunfeld_data.xlsx")
df <- data.frame(data)
#'
#' ##### Take a first look at your data
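#' The script shows no code for this step. A minimal sketch of a first look, assuming only base R and the `skimr` package loaded above:
str(df)  # column types and the first few values of each variable
head(df) # first six rows of the dataset
skim(df) # per-variable summary statistics from skimr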
#' #### Prepare your data
#' ##### Take out the "firmcod" variable from the dataset
df$firmcod = NULL # another way of removing a variable
#'
#' ##### Factor categorical variables
#' `firm` is a nominal categorical variable and should be treated as such in the modeling process. For this example, `year` should also be treated as an ordinal categorical variable rather than a continuous one.
df$firm = factor(df$firm)
df$year = factor(df$year, ordered = TRUE)
#' Take a look at the summary of your data. See the differences regarding the categorical ones?
summary(df)
#'
#' ## Ordinary least squares model
#' First, run an ordinary least squares (OLS) model without the `firm` variable. Its results serve as a baseline for comparison with the panel data models.
mlr = lm(invest ~ value + capital, data = df)
summary(mlr)
#'
#' ## Panel Data Models
#'
#' Panel data models use __one-way__ and __two-way__ error component models to deal with unobserved heterogeneity, correlation in the disturbance terms, and heteroscedasticity.
#'
#' * **One-way error component model:** variable-intercept models across individuals **or** time;
#' * **Two-way error component model:** variable-intercept models across individuals **and** time.
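#'
#' As a sketch in standard textbook notation (the symbols are introduced here for illustration and do not appear in the original script), the regression for firm $i$ in year $t$ can be written as
#'
#' $$y_{it} = \alpha + x_{it}'\beta + u_{it},$$
#'
#' where the one-way model across individuals decomposes the disturbance as $u_{it} = \mu_i + \varepsilon_{it}$, and the two-way model as $u_{it} = \mu_i + \lambda_t + \varepsilon_{it}$. Here $\mu_i$ is the firm-specific effect, $\lambda_t$ is the year-specific effect, and $\varepsilon_{it}$ is the remaining idiosyncratic error.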
#'
#' Modelling Specifications:
#'
#' * **With fixed effects:** the individual effects are treated as fixed parameters specific to the units in the sample. Fixed effects models explore the causes of change within an entity (in this example, the entities are the _firms_);
#'
#' * **With random effects:** the individual effects are treated as random draws from a population. The random effects model is an appropriate specification if we are drawing _n_ individuals randomly from a large population.
#'
#'
#' You can also try other types of model estimation. See `?plm` for more options, such as other values of the `model` and `effect` arguments and the instrumental variable transformation methods.
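#'
#' As a sketch of such alternatives, the calls below fit three other estimators that `plm` accepts through its `model` argument. The object names (`pooled`, `between`, `first_diff`) are chosen here for illustration only.
# Pooled OLS: ignores the panel structure, a single intercept for all firms
pooled = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"),
  model = "pooling"
)
# Between estimator: regression on the firm-level means
between = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"),
  model = "between"
)
# First-difference estimator: regression on year-to-year changes within each firm
first_diff = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"),
  model = "fd"
)
summary(pooled)
summary(between)
summary(first_diff)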
#'
#'
#' ### One-way
#' ##### One-way fixed effects model
fixed = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"), # panel settings
  model = "within" # fixed effects option
)
summary(fixed)
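#'
#' To see what the within estimator has absorbed, one option is to extract the estimated firm-specific intercepts with `fixef()`. The F test from `pFtest()` then compares the fixed effects model against a pooled model with a single intercept. A minimal sketch, where `pooled_ols` is a name introduced here for illustration:
fixef(fixed) # one estimated intercept per firm
pooled_ols = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"),
  model = "pooling"
)
pFtest(fixed, pooled_ols) # null hypothesis: no significant firm effects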
#'
#' ##### One-way random effects model
random = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"),
  model = "random" # random effects option
)
summary(random)
#'
#' #### Hausman test
#' Use the Hausman test to decide whether to use fixed or random effects.
phtest(random, fixed)
#'
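#' > **Note:** The null hypothesis is that the individual effects are uncorrelated with the regressors, in which case the random effects model is more appropriate than the fixed effects model. Rejecting the null hypothesis favors the fixed effects model.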
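#'
#' A sketch of how the test result could be turned into a decision. The 5% significance level is an assumption made here for illustration. `phtest()` returns a standard `htest` object, so the p-value is available as `$p.value`.
hausman = phtest(fixed, random)
if (hausman$p.value < 0.05) {
  message("Reject H0: prefer the fixed effects model")
} else {
  message("Do not reject H0: the random effects model is acceptable")
}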
#'
#' ### Two-way
#' ##### Two-way fixed effects model
fixed_tw <-
  plm(
    invest ~ value + capital,
    data = df,
    effect = "twoways", # effects option
    model = "within", # fixed
    index = c("firm", "year") # panel settings
  )
summary(fixed_tw)
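#'
#' In the two-way model, the estimated firm and year effects can be extracted separately. A minimal sketch using the `effect` argument of `fixef()`:
fixef(fixed_tw, effect = "individual") # firm-specific effects
fixef(fixed_tw, effect = "time") # year-specific effects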
#'
#' ##### Two-way random effects model
random_tw <-
  plm(
    invest ~ value + capital,
    data = df,
    effect = "twoways",
    model = "random",
    index = c("firm", "year"),
    random.method = "amemiya"
  )
summary(random_tw)
#'
#' #### Lagrange Multiplier Test
#' The Lagrange multiplier statistic is used to test the null hypothesis that there are no group effects in the random effects model.
#' Large values of the Lagrange multiplier statistic indicate that an effects model is more suitable than the classical model with no common effects.
plmtest(random_tw)
#' > **Note:** Large values of the Hausman statistic (H) indicate that the fixed effects model is preferred over the random effects model. A large value of the LM statistic together with a small H statistic indicates that the random effects model is more suitable.
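#'
#' By default, `plmtest()` tests for individual effects only (`effect = "individual"`) and uses the Honda statistic (`type = "honda"`). A sketch that states both choices explicitly and applies the Breusch-Pagan version to a pooled model. The name `pooled_lm` is introduced here for illustration.
pooled_lm = plm(
  invest ~ value + capital,
  data = df,
  index = c("firm", "year"),
  model = "pooling"
)
plmtest(pooled_lm, effect = "individual", type = "bp") # firm effects only
plmtest(pooled_lm, effect = "twoways", type = "bp") # firm and year effects jointly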