Time Series I: Trends
We'll first discuss the nature of different data structures (cross-sectional, longitudinal, panel, time series).
With time series data, we need to become familiar with some basic terms and concepts before we can discuss how to analyze them:
* stochastic process
* "random walk"
* moving average
* stationarity
* autocorrelation
* differencing
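To make these terms concrete, here is a minimal sketch in base R using simulated data (not one of the TSA datasets). The seed, the series length, and the variable names are arbitrary choices made for illustration. It builds white noise, a random walk, and a simple two-term moving average, compares their lag-1 autocorrelations, and checks that first differencing the random walk gives back the noise terms.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility
n  <- 200                         # arbitrary series length
e  <- rnorm(n)                    # white noise: independent draws (a stationary process)
rw <- ts(cumsum(e))               # random walk: running sum of the noise (not stationary)
ma <- ts((e[-1] + e[-n]) / 2)     # moving average of each noise term and the one before it

# Lag-1 sample autocorrelation; element 1 of $acf is lag 0, element 2 is lag 1
r1 <- function(x) acf(x, plot = FALSE)$acf[2]
r1_noise <- r1(e)
r1_rw    <- r1(rw)
r1_ma    <- r1(ma)
print(round(c(noise = r1_noise, random_walk = r1_rw, moving_average = r1_ma), 2))

# Differencing: rw[t] - rw[t-1] equals e[t], so diff() recovers the noise
d <- diff(rw)
recovered <- isTRUE(all.equal(as.numeric(d), e[-1]))
print(recovered)
```

The noise should show little autocorrelation, the moving average a moderate amount, and the random walk a great deal, which is why differencing is a common first step for series that wander like a random walk.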
The process of modeling time series data has three parts:
a) specification
b) fitting
c) diagnostics
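The three parts can be sketched on a simulated series. This is only an illustration under stated assumptions: the series, its true slope of 0.5, and the noise level are invented here, and the diagnostics shown (a residual plot and the lag-1 autocorrelation of the residuals) are just two of many possible checks.

```r
set.seed(1)                                    # arbitrary seed for reproducibility
n <- 120
y <- ts(2 + 0.5 * (1:n) + rnorm(n, sd = 3))    # simulated: linear trend plus noise

# a) Specification: propose a model form, here a straight-line time trend
# b) Fitting: estimate its parameters by least squares
fit   <- lm(y ~ time(y))
slope <- coef(fit)[[2]]
print(summary(fit)$coefficients)

# c) Diagnostics: residuals should look like noise around zero
res <- ts(as.numeric(residuals(fit)))
plot(res, ylab = "Residuals", type = "o")
abline(h = 0, lty = 2)
res_r1 <- acf(res, plot = FALSE)$acf[2]        # lag-1 autocorrelation of the residuals
print(round(res_r1, 2))
```

If the residuals still show a pattern or strong autocorrelation, the usual move is to return to specification and try a different model.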
R script:
# http://www.courseserve.info/files/SOCY7113trends.r
# SOCY7113trends.r
# Load libraries. Install TSA only if it is not already installed.
if (!requireNamespace("TSA", quietly = TRUE)) install.packages("TSA")
library(TSA)
# Let's look at a plot of a series.
data(wages, package="TSA")
plot(wages)
# We can get a sense of whether or not the observations are
# independent by looking at the relationship between each
# observation and the one previous.
plot(zlag(wages),wages,xlab="Wages (t-1)",ylab="Wages (t)")
# Some series have an element called seasonality.
data(tempdub, package="TSA")
plot(tempdub,ylab='Temperature',type='o')
# We can compare our series to a random process.
# The rwalk dataset contains a simulated random walk.
data(rwalk, package="TSA")
plot(rwalk,type='o',ylab='Random Walk')
model1=lm(rwalk~time(rwalk))
summary(model1)
# add the fitted least squares line
abline(model1)
# Now let's look at a model for a seasonal trend.
# season(tempdub) creates a vector of the month index of the data as a factor
data(tempdub, package="TSA")
month=season(tempdub)
model2=lm(tempdub~month-1) # -1 removes the intercept term
summary(model2)
model3=lm(tempdub~month) # intercept is automatically included so one month (Jan) is dropped
summary(model3)
# We can try to fit a harmonic (cosine) trend to the data.
har=harmonic(tempdub,1)
model4=lm(tempdub~har)
summary(model4)
plot(ts(fitted(model4),frequency=12,start=c(1964,1)),ylab='Temperature',type='l',
     ylim=range(c(fitted(model4),tempdub)))
points(tempdub)
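To show what a harmonic trend model estimates, here is a rough base-R sketch that builds the cosine and sine regressors by hand. It assumes that harmonic(tempdub,1) constructs columns of this kind, which is my reading of the TSA package rather than something stated in the script. The data are simulated as a stand-in for tempdub, and the mean of 46, amplitude of 27, and phase shift are invented values.

```r
set.seed(7)                                    # arbitrary seed for reproducibility
n_months <- 144
t_yr <- seq(1964, by = 1/12, length.out = n_months)
# Simulated monthly "temperature" (hypothetical stand-in for tempdub)
temp <- ts(46 + 27 * cos(2 * pi * t_yr - 2.6) + rnorm(n_months, sd = 3.5),
           frequency = 12, start = c(1964, 1))

# One cosine/sine pair with a period of one year
tt   <- as.numeric(time(temp))
cos1 <- cos(2 * pi * tt)
sin1 <- sin(2 * pi * tt)
fit_h <- lm(temp ~ cos1 + sin1)

# The two slope coefficients together give the amplitude of the seasonal wave
b <- coef(fit_h)
amplitude <- sqrt(b[["cos1"]]^2 + b[["sin1"]]^2)
print(round(c(mean_level = b[[1]], amplitude = amplitude), 1))
```

The harmonic model uses three parameters where the seasonal means model uses twelve, at the cost of assuming the seasonal pattern is a smooth wave.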
Group exercise
Using one of the datasets in the TSA package, produce a plot of the series and then fit a model to it (using the lm() command, as above). Report the results.
You can see a list of the datasets in the package with the following command:
try(data(package="TSA"))
and you can read the documentation for a particular dataset with:
?dataset, where you type the name of the dataset in place of "dataset", e.g., ?wages