Highly Persistent Time Series

Zhentao Shi

Sep 20, 2021

Efficient market hypothesis

Unit root AR(1)

\[ y_t = y_{t-1} + \epsilon_t \] where \(\epsilon_t \sim \mathrm{iid} (0, \sigma^2)\)

Simulated Example

set.seed(2021-9-20) # note: R evaluates the arithmetic, so this is set.seed(1992)
n <- 100
x <- rnorm(n)
y <- cumsum(x)
plot(y, type = "l")

Properties

Compare with stationary AR(1)

\[ \begin{align} y_{t+h} & = \beta ( \beta y_{t+h-2} + \epsilon_{t+h-1}) + \epsilon_{t+h} \\ & = \cdots \\ & = \beta^h y_{t} + \sum_{q=0}^{h-1} \beta^{q} \epsilon_{t+h-q} \end{align} \]
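The contrast is easy to see in a short simulation (a sketch with an arbitrary choice of \(\beta = 0.5\)): in the stationary case the weight \(\beta^h\) on past shocks dies out geometrically, while the random walk carries every shock forever with weight 1.

```r
set.seed(1)
n <- 200
e <- rnorm(n)

# stationary AR(1) with beta = 0.5: past shocks are discounted geometrically
y_stat <- stats::filter(e, filter = 0.5, method = "recursive")

# unit root (beta = 1): every shock enters with weight 1 permanently
y_rw <- cumsum(e)

plot(y_rw, type = "l", col = "red", ylab = "y")
lines(y_stat, col = "blue")
```

The same innovations drive both paths; only the persistence differs.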

Consequences of unit root

Integrated time series

ARIMA in R

n = 100000
y <- arima.sim( n = n, list(order = c(2,1,2), ar = c(0.1, 0.1), ma = c(0.3, 0.1) ) )
plot(y)

arima(y, order = c(2,1,2))
## 
## Call:
## arima(x = y, order = c(2, 1, 2))
## 
## Coefficients:
##          ar1     ar2     ma1     ma2
##       0.1173  0.0793  0.2817  0.1159
## s.e.  0.0429  0.0246  0.0428  0.0099
## 
## sigma^2 estimated as 0.9966:  log likelihood = -141722.9,  aic = 283455.7

I(2) simulation and estimation

n = 100000
y <- arima.sim( n = n, list(order = c(2,2,2), ar = c(0.1, 0.1), ma = c(0.3, 0.1) ) )
plot(y)

arima(y, order = c(2,2,2))
## 
## Call:
## arima(x = y, order = c(2, 2, 2))
## 
## Coefficients:
##          ar1     ar2     ma1     ma2
##       0.1165  0.0913  0.2908  0.1029
## s.e.  0.0415  0.0235  0.0415  0.0093
## 
## sigma^2 estimated as 0.9988:  log likelihood = -141832.4,  aic = 283674.8

Real data example

SPX <- quantmod::getSymbols("^GSPC",auto.assign = FALSE, from = "2000-01-01")$GSPC.Close
plot(SPX)

lSPX <- log(SPX)
plot(lSPX)

save(SPX, lSPX, file = "lSPX.Rdata")
dSPX <- diff( log(SPX) ) 
plot(dSPX)

Test unit root

\[ y_t = \beta y_{t-1} + \epsilon_t \] where \(\epsilon_t \sim \mathrm{iid} (0, \sigma^2)\).

\[ H_0: \beta =1, \] which means that the time series \(y_t\) is a unit root process.

\[ H_1: |\beta| < 1, \] which means \(y_t\) is stationary. (Economists don’t really care about \(\beta < -1\).)

\(t\)-statistic

From OLS, we have the \(t\)-statistic

\[ t_{\beta} = (\hat{\beta} - 1) / \mathrm{se}(\hat{\beta}) \]

Alternative representation

\[ \Delta y_t = \gamma y_{t-1} + \epsilon_t, \] where \(\gamma = \beta - 1\).

\[ H_0: \gamma =0, \] versus the alternative hypothesis

\[ H_1: \gamma < 0 \]

\[ t_{\gamma} = \hat{\gamma} / \mathrm{se}(\hat{\gamma}) \]
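The two formulations are numerically identical: \(\hat{\gamma} = \hat{\beta} - 1\) and the two regressions share the same standard error, so \(t_{\gamma} = t_{\beta}\). A quick check on simulated data (a sketch, not part of the lecture code):

```r
set.seed(2)
n <- 100
y <- cumsum(rnorm(n))        # a unit root process
y_lag <- y[1:(n - 1)]
y_now <- y[2:n]

# levels regression: y_t = beta * y_{t-1} + epsilon_t
b <- coef(summary(lm(y_now ~ y_lag - 1)))
t_beta <- (b[1, "Estimate"] - 1) / b[1, "Std. Error"]

# differenced form: Delta y_t = gamma * y_{t-1} + epsilon_t
g <- coef(summary(lm(I(y_now - y_lag) ~ y_lag - 1)))
t_gamma <- g[1, "t value"]

all.equal(t_beta, t_gamma)   # TRUE
```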

Dickey-Fuller test

Dickey-Fuller test (continue)

Distribution of DF test

set.seed(2021-9-22)
library(dynlm, quietly = TRUE)
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
DF.sim = function(ar){
  Rep = 2000
  n = 100
  
  t.stat = rep(0, Rep)
  
  for (r in 1:Rep){
    if (ar < 1) {
      # stationary AR(1): center the t-statistic at the true coefficient
      y = arima.sim( model = list(ar = ar), n = n)
      reg.dyn = dynlm( y  ~  L(y,1)-1 )
      t.stat[r] = (summary(reg.dyn)[[4]][1] - ar) / summary(reg.dyn)[[4]][2]
    } else if (ar == 1){
      # unit root: DF regression of the difference on the lagged level
      y = ts( cumsum( rnorm(n) ) )
      reg.dyn = dynlm( diff(y) ~ L(y,1)-1 )
      t.stat[r] = summary(reg.dyn)[[4]][3]      
    }
  }
  cat("simulation is done with ar =", ar, "\n")
  return(t.stat)
}


B = DF.sim(1)
plot(density(B), col = "black", xlim = c(-4, 4))

B = DF.sim(0.5)
lines(density(B), col = "blue")

B = DF.sim(0.9)
lines( density(B) , col = "purple" )

xgrid = seq(-4, 4, by = 0.01)
lines( x = xgrid, dnorm(xgrid), col = "black", lty = 2 )
abline( v=0, lty = 3)

Interpret the screen print

library(urca, quietly = TRUE)
n <- 100
y <- arima.sim( n = n, list(order = c(0,1,0) ) )
DFtest <- ur.df( y, type = "none", lags = 0 )
summary(DFtest)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.22297 -0.62445  0.02881  0.81155  2.30497 
## 
## Coefficients:
##         Estimate Std. Error t value Pr(>|t|)
## z.lag.1  0.01430    0.01219   1.173    0.244
## 
## Residual standard error: 1.025 on 99 degrees of freedom
## Multiple R-squared:  0.01371,    Adjusted R-squared:  0.00375 
## F-statistic: 1.376 on 1 and 99 DF,  p-value: 0.2435
## 
## 
## Value of test-statistic is: 1.1732 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

Interpret the screen print (continue)

library(urca, quietly = TRUE)
n <- 100
y <- arima.sim( n = n, list(ar = 0.5 ) )
DFtest <- ur.df( y, type = "none", lags = 0 )
summary(DFtest)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.64746 -0.66698 -0.06205  0.41218  2.20997 
## 
## Coefficients:
##         Estimate Std. Error t value Pr(>|t|)    
## z.lag.1 -0.56473    0.09021   -6.26 1.02e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9927 on 98 degrees of freedom
## Multiple R-squared:  0.2856, Adjusted R-squared:  0.2784 
## F-statistic: 39.19 on 1 and 98 DF,  p-value: 1.02e-08
## 
## 
## Value of test-statistic is: -6.26 
## 
## Critical values for test statistics: 
##      1pct  5pct 10pct
## tau1 -2.6 -1.95 -1.61

Real data example

library(urca); 
DFtest <- ur.df( lSPX, type = "none", lags = 0 )
summary(DFtest)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.127880 -0.004906  0.000415  0.005602  0.109376 
## 
## Coefficients:
##          Estimate Std. Error t value Pr(>|t|)
## z.lag.1 2.878e-05  2.248e-05    1.28    0.201
## 
## Residual standard error: 0.01238 on 5544 degrees of freedom
## Multiple R-squared:  0.0002955,  Adjusted R-squared:  0.0001152 
## F-statistic: 1.639 on 1 and 5544 DF,  p-value: 0.2005
## 
## 
## Value of test-statistic is: 1.2802 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

Estimation with differenced data

\[ (y_t - y_{t-1}) = \beta (y_{t-1} - y_{t-2}) + (\epsilon_t - \epsilon_{t-1}). \] The error term and the regressor are correlated. OLS \(\hat{\beta}\) is inconsistent for the original equation.

Estimation with differenced data (continue)

\[ \hat{\beta} = \frac{ \sum \Delta y_{t-1} \Delta y_t }{ \sum (\Delta y_{t-1})^2 } = \frac{ T^{-1} \sum \epsilon_{t-1} \epsilon_t }{ T^{-1} \sum \epsilon^2_{t-1}} \stackrel{p}{\to} 0, \]

instead of the true value \(1\).
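A simulation sketch confirms the limit: for a pure random walk, \(\Delta y_t = \epsilon_t\) is iid, so the OLS slope of \(\Delta y_t\) on \(\Delta y_{t-1}\) converges to 0 rather than to the true \(\beta = 1\).

```r
set.seed(3)
n <- 100000
y <- cumsum(rnorm(n))   # random walk: true beta = 1
dy <- diff(y)           # Delta y_t = epsilon_t, iid

# OLS without intercept: regress Delta y_t on Delta y_{t-1}
beta_hat <- sum(dy[-1] * dy[-length(dy)]) / sum(dy[-length(dy)]^2)
beta_hat  # close to 0, nowhere near 1
```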

Random walk with drift

Again, consider \((y_1,y_2,\ldots, y_T)\) with the initial value \(y_0 = 0\).

Random walk with drift (continue)

The data generating process is \[ y_t = \mu + y_{t-1} + \epsilon_t, \] simulated below with \(\mu = 1\):

n <- 100
x <- 1 + rnorm(n) # mu = 1, sigma = 1
y <- cumsum(x)
plot(y, type = "l")

DF test with drift

\[ \Delta y_t = \mu + \gamma y_{t-1} + \epsilon_t \]

print( summary(ur.df(y, type = "drift", lags = 0) ) ) 
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression drift 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.37681 -0.81493  0.00268  0.79760  2.80549 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.9748559  0.2104664   4.632 1.13e-05 ***
## z.lag.1     0.0006186  0.0037239   0.166    0.868    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.067 on 97 degrees of freedom
## Multiple R-squared:  0.0002844,  Adjusted R-squared:  -0.01002 
## F-statistic: 0.0276 on 1 and 97 DF,  p-value: 0.8684
## 
## 
## Value of test-statistic is: 0.1661 43.9129 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau2 -3.51 -2.89 -2.58
## phi1  6.70  4.71  3.86

Random walk with drift and trend

\[ y_t = \mu + \delta t + \beta y_{t-1} + \epsilon_t \]

\[ \begin{align} y_t & = \mu t + \delta (1+2+\cdots+t) + \epsilon_1 + \cdots + \epsilon_t \\ & = \mu t + \frac{\delta }{2} t(t+1) + \epsilon_1 + \cdots + \epsilon_t \end{align} \]

Numerical example

n <- 100
x <- 0.2 + 0.05*(1:n) + rnorm(n) # mu = 0.2, delta = 0.05, sigma = 1
y <- cumsum(x)
plot(y, type = "l")

DF test with drift and trend

print( summary(ur.df(y, type = "trend", lags = 0) ) ) 
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.43668 -0.78537  0.02051  0.69073  2.61615 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  0.061664   0.340928   0.181  0.85685   
## z.lag.1     -0.000671   0.005963  -0.113  0.91064   
## tt           0.054053   0.016364   3.303  0.00134 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.073 on 96 degrees of freedom
## Multiple R-squared:  0.6663, Adjusted R-squared:  0.6593 
## F-statistic: 95.83 on 2 and 96 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -0.1125 273.3224 95.8253 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

Specifications of DF tests

Augmented DF test

Example

y <- arima.sim( model = list(order = c(3,1,1), ar = c(0.4, 0.2, 0.2), ma = 0.5), n = 1000 )
df <- ur.df(y, type = "trend", lags = 10, selectlags="AIC" )
print(summary( df ) )
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.94030 -0.65582  0.00723  0.66567  2.87812 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0313938  0.0661852  -0.474   0.6354    
## z.lag.1     -0.0019963  0.0007750  -2.576   0.0101 *  
## tt          -0.0006811  0.0003068  -2.220   0.0266 *  
## z.diff.lag1  0.9134427  0.0315704  28.933  < 2e-16 ***
## z.diff.lag2 -0.2603535  0.0415695  -6.263 5.63e-10 ***
## z.diff.lag3  0.3436490  0.0414980   8.281 3.96e-16 ***
## z.diff.lag4 -0.1292930  0.0316362  -4.087 4.73e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9887 on 983 degrees of freedom
## Multiple R-squared:  0.7445, Adjusted R-squared:  0.7429 
## F-statistic: 477.3 on 6 and 983 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -2.5758 2.894 3.4493 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2  6.09  4.68  4.03
## phi3  8.27  6.25  5.34

Phillips-Perron test

Long-run variance

\[ \begin{align} var[\frac{1}{\sqrt{3}} (X_1 + X_2 + X_3)] & = \frac{1}{3} E[ (X_1 + X_2 + X_3)^2] \\ & = \frac{1}{3} E[ X_1^2 + X_2^2 + X_3^2 + 2 X_1 X_2 + 2 X_2 X_3 + 2 X_1 X_3] \\ & = \gamma_0 + 2( \frac{2}{3} \gamma_1 + \frac{1}{3} \gamma_2) \end{align} \]

\[ \begin{align} var[\frac{1}{\sqrt{T}} \sum_{t=1}^T X_t ] & = \frac{1}{T} E[ (\sum_{t=1}^T X_t) ^2] \\ & = \frac{1}{T} E[ \sum_{t=1}^T X_t^2 + 2 \sum_{t=1}^{T-1} \sum_{j = 1}^{T-t} X_t X_{t+j} ] \\ & = \gamma_0 + 2 \sum_{j=1}^{T-1} \left(1 - \frac{j}{T} \right) \gamma_j \end{align} \]

\[ \lim_{T\to \infty } var\left[ \frac{1}{\sqrt{T}} \sum_{t=1}^T X_t \right] = \gamma_0 + 2 \sum_{j=1}^{\infty} \gamma_j \]

Long-run variance (continue)

What happens to OLS

\[ \sqrt{T}(\hat{\beta} - \beta_0) = \sqrt{T} \times \frac{ \sum (x_t - \bar{x}) \epsilon_t}{ \sum (x_t - \bar{x})^2} = \frac{ T^{-1/2} \sum (x_t - \bar{x}) \epsilon_t}{ T^{-1} \sum (x_t - \bar{x})^2} \]

The asymptotic variance of \(\sqrt{T}(\hat{\beta} - \beta_0)\) is then

\[ \frac{lrvar[x_t \epsilon_t]} {(var[x_t])^2}, \]

instead of the familiar form under the Gauss-Markov theorem:

\[ var[x_t \epsilon_t]/ (var[x_t])^2 = var[x_t ] var[\epsilon_t]/ (var[x_t])^2=var[\epsilon_t]/var[x_t ] \]

Estimation of lrvar

\[ \hat{f}_0 = \hat{\gamma}_0 + 2 \sum_{j=1}^p w_{pj} \hat{\gamma}_j \]

where \(p\) is the number of lags. Consistency requires \(p \to \infty\) with \(p/T \to 0\) asymptotically.
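A minimal implementation with Bartlett weights \(w_{pj} = 1 - j/(p+1)\) (the Newey-West choice; other kernel weights are possible) can be sketched as:

```r
# truncated long-run variance estimator with Bartlett weights
lrvar_bartlett <- function(x, p) {
  x <- x - mean(x)
  T <- length(x)
  gamma0 <- sum(x^2) / T
  f0 <- gamma0
  for (j in 1:p) {
    gamma_j <- sum(x[(j + 1):T] * x[1:(T - j)]) / T   # j-th sample autocovariance
    f0 <- f0 + 2 * (1 - j / (p + 1)) * gamma_j
  }
  f0
}

# check on an MA(1) with theta = 0.5: true lrvar = (1 + 0.5)^2 * sigma^2 = 2.25
set.seed(4)
e <- arima.sim(model = list(ma = 0.5), n = 50000)
lrvar_bartlett(e, p = 20)
```

With a large sample the estimate lands near the theoretical value 2.25, while the naive variance \(\hat{\gamma}_0\) would miss the autocovariance contribution.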

Phillips-Perron test (continue)

The Phillips-Perron test statistic is a modified version of the DF \(t\)-statistic. (The formula in the textbook, Eq. (5.26), contains typos.)

\[ t_{pp} = t_{\gamma} \sqrt{ \frac{\hat{\gamma}_0}{\hat{f}_0} } - \frac{ T\cdot (\hat{f}_0 - \hat{\gamma}_0) \cdot SE(\hat{\gamma}) }{2 \hat{f}_0 \hat{\gamma}_0 } \]

where \(t_{\gamma}\) is the OLS \(t\)-statistic for the \(\gamma\) coefficient, \(SE(\hat{\gamma})\) is the OLS standard error of \(\hat{\gamma}\), \(\hat{\gamma}_0\) is a consistent estimator of the variance of the OLS regression residual, and \[ \hat{f}_0 = \hat{\gamma}_0 + 2 \sum_{j=1}^p (1-j/p) \hat{\gamma}_j \] is a consistent estimator of the lrvar of the OLS regression residual.

Example of PP-test

pp <- ur.pp(y, type="Z-tau", model="trend", lags="short")
summary(pp)
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept and trend 
## 
## 
## Call:
## lm(formula = y ~ y.l1 + trend)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.6491 -1.0720  0.1126  1.2590  5.4567 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.2882585  0.2563321  -1.125    0.261    
## y.l1         0.9999885  0.0015016 665.935   <2e-16 ***
## trend        0.0002709  0.0005921   0.458    0.647    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.946 on 997 degrees of freedom
## Multiple R-squared:  0.9997, Adjusted R-squared:  0.9997 
## F-statistic: 1.708e+06 on 2 and 997 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-tau  is: -1.4629 
## 
##            aux. Z statistics
## Z-tau-mu              0.9429
## Z-tau-beta           -1.1722
## 
## Critical values for Z statistics: 
##                    1pct      5pct     10pct
## critical values -3.9722 -3.416657 -3.130326

Stationarity as the null hypothesis

\[ y_t = \mu + \delta t + w_t + \epsilon_t, \]

in which \(w_t = w_{t-1} + v_t\) for some \(v_t \sim \mathrm{iid} (0, \sigma^2_v)\).
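Here \(\sigma^2_v = 0\) shuts down the random-walk component \(w_t\), so \(y_t\) is trend stationary (the null of the KPSS test); \(\sigma^2_v > 0\) makes \(y_t\) an I(1) process. A simulation sketch of the two cases (with arbitrary hypothetical parameter values):

```r
set.seed(5)
n <- 200
tt <- 1:n

# null: sigma_v = 0, so w_t = 0 and y_t is trend stationary
y_null <- 1 + 0.05 * tt + rnorm(n)

# alternative: sigma_v > 0 injects a random-walk component
w <- cumsum(rnorm(n, sd = 0.5))
y_alt <- 1 + 0.05 * tt + w + rnorm(n)

plot(y_alt, type = "l", col = "red", ylab = "y")
lines(y_null, col = "blue")
```

Detrending removes the deterministic part of both series, but only the alternative retains a stochastic trend.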

KPSS

\[ KPSS = \frac{1}{T^2 \hat{f}_0} \sum_{t=1}^T ( \sum_{j=1}^t \hat{\epsilon}_j)^2 \]

library(magrittr) # provides the pipe %>%
ur.kpss(y, type = "tau",  lags = "short") %>% summary()
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: tau with 7 lags. 
## 
## Value of test-statistic is: 0.4653 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.119 0.146  0.176 0.216

Asset price bubbles (promotional)

Implementation

library(psymonitor)

SPX.2020 <- quantmod::getSymbols("^GSPC",auto.assign = FALSE, from = "2020-01-01")$GSPC.Close
lSPX.2020 <- log(SPX.2020)

psy.stat <- PSY(lSPX.2020)
plot(psy.stat, type = "l")

Summary