Univariate forecasting

Introduction

We can model the process as the sum of a stochastic component \(\mu_t\), a deterministic function of time \(f(t)\), and a noise term \(\epsilon_t\):

\(y_t=\mu_t +f(t)+\epsilon_t\)

Cyclical fluctuations

Shocks can have effects that persist over time.

This cyclical behaviour is separate from trends.

Introduction to forecasting

We observe a series of observations:

\(x_1, x_2, \ldots, x_t\)

What can we say about \(x_{t+1}\)?

If the data were drawn IID, we would just want to identify the moments of the distribution from the past observations.

However, if the data are not IID, for example because the series increases over time, this is not the best approach.

Regression formulation

We can model the series as a constant plus noise:

\(x_t=\alpha + \epsilon_t\)

Autoregressive model

Autoregressive models (AR)

AR(\(1\))

Our basic model was:

\(x_t=\alpha + \epsilon_t\)

We add an autoregressive component by adding a lagged observation.

\(x_t=\alpha + \beta x_{t-1}+\epsilon_t\)
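A minimal sketch, assuming the statsmodels package: simulate an AR(\(1\)) with illustrative parameters \(\alpha=0.5\), \(\beta=0.8\) and recover them by fitting.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(0)
alpha, beta, n = 0.5, 0.8, 500

# Simulate x_t = alpha + beta * x_{t-1} + eps_t
x = np.zeros(n)
for t in range(1, n):
    x[t] = alpha + beta * x[t - 1] + rng.normal()

result = AutoReg(x, lags=1).fit()
print(result.params)  # estimates of the intercept and the lag coefficient
```

The same AutoReg class fits higher-order models by passing lags=p.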

AR(\(p\))

AR(\(p\)) includes \(p\) lags of the dependent variable.

\(x_t=\alpha + \sum_{i=1}^p\beta_ix_{t-i}+\epsilon_t\)

Propagation of shocks

A shock raises the current value of the series, which feeds into all future values through the lag, so its effect persists indefinitely but at a geometrically decreasing rate.
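As a short worked illustration, assuming a stationary AR(\(1\)) with \(|\beta|<1\): the effect of a unit shock \(\epsilon_t\) on the value \(h\) periods later is

\(\dfrac{\partial x_{t+h}}{\partial \epsilon_t}=\beta^h\)

which decays towards zero as \(h\) grows.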

Testing for stationarity with Dickey-Fuller (DF) and Augmented Dickey-Fuller (ADF)

Stationarity

A process is stationary if its statistical properties (mean, variance, autocovariance) do not change over time.

Unit roots

An AR(\(1\)) process has a unit root if \(\beta=1\), in which case shocks accumulate rather than decay.

Integration order

A series is integrated of order \(d\), written I(\(d\)), if it must be differenced \(d\) times to become stationary.

Dickey-Fuller

The Dickey-Fuller test checks whether there is a unit root.

The AR(\(1\)) model is:

\(y_t=\alpha + \beta y_{t-1}+\epsilon_t\)

We can rewrite this as:

\(\Delta y_t=\alpha + (\beta -1)y_{t-1}+\epsilon_t\)

We test whether \(\beta -1=0\).

If \(\beta=1\) we have a random walk, and the process is non-stationary.

If \(|\beta|<1\) then we have a stationary process.

Variation: Removing the drift

If our model has no intercept it is:

\(y_t=\beta y_{t-1}+\epsilon_t\)

\(\Delta y_t=(\beta -1)y_{t-1}+\epsilon_t\)

Variation: Adding a deterministic trend

If our model has a time trend it is:

\(y_t=\alpha + \beta y_{t-1}+\gamma t + \epsilon_t\)

\(\Delta y_t=\alpha + (\beta -1)y_{t-1}+\gamma t+\epsilon_t\)

Augmented Dickey-Fuller

We include more lagged variables.

\(y_t=\alpha + \beta t + \sum_{i=1}^p \theta_i y_{t-i}+\epsilon_t\)

If there is no unit root, the series is stationary and standard OLS inference applies.
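A minimal sketch of the ADF test, assuming statsmodels; the simulated random walk is an illustrative input, and regression="ct" selects the variant with drift and deterministic trend.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500))  # a random walk, which has a unit root

# regression="ct" includes both the intercept (drift) and the time trend
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression="ct")
print(f"ADF statistic: {stat:.3f}, p-value: {pvalue:.3f}")
# A large p-value means we cannot reject the null of a unit root.
```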

Autoregressive Conditional Heteroskedasticity (ARCH)

Variance of the AR(\(1\)) model

The standard AR(\(1\)) model is:

\(y_t=\alpha + \beta y_{t-1}+\epsilon_t\)

The unconditional variance is:

\(Var(y_t)=Var(\alpha + \beta y_{t-1}+\epsilon_t)=\beta^2Var(y_{t-1})+Var(\epsilon_t)\)

If the process is stationary then \(Var(y_t)=Var(y_{t-1})\), so:

\(Var(y_t)(1-\beta^2)=Var(\epsilon_t)\)

Assuming the errors are IID with variance \(\sigma^2\) we have:

\(Var(y_t)=\dfrac{\sigma^2 }{1-\beta^2 }\)

This is independent of historic observations, which may not be desirable.

Conditional variance

Consider the alternative formulation:

\(y_t=\epsilon_t f(y_{t-1})\)

This allows for conditional heteroskedasticity: the variance of \(y_t\) depends on the previous observation. For example, ARCH(\(1\)) takes \(f(y_{t-1})=\sqrt{\alpha_0 + \alpha_1 y_{t-1}^2}\).
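A minimal sketch simulating this process, assuming the ARCH(\(1\)) choice of \(f\) above with illustrative parameters \(\alpha_0=1\), \(\alpha_1=0.6\):

```python
import numpy as np

rng = np.random.default_rng(0)
a0, a1, n = 1.0, 0.6, 1000

# y_t = eps_t * f(y_{t-1}) with f(y) = sqrt(a0 + a1 * y**2)
y = np.zeros(n)
for t in range(1, n):
    y[t] = rng.normal() * np.sqrt(a0 + a1 * y[t - 1] ** 2)

# The conditional variance depends on the last observation, so large
# movements tend to cluster together (volatility clustering).
print(np.std(y))
```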

Moving average models

Moving Average models (MA)

We add previous error terms as input variables.

MA(\(q\)) has \(q\) previous error terms in the model:

\(x_t=\alpha + \epsilon_t + \sum_{i=1}^q\theta_i\epsilon_{t-i}\)

Unlike AR models, the effects of any shocks wear off after \(q\) terms.

This is harder to fit with OLS because the error terms themselves are not observed; MA models are instead typically estimated by maximum likelihood.
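A minimal sketch, assuming statsmodels: simulate an MA(\(2\)) process with illustrative coefficients and fit it by maximum likelihood using the ARIMA class with order \((0, 0, 2)\).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
eps = rng.normal(size=502)
# Simulate MA(2): x_t = eps_t + 0.6 * eps_{t-1} + 0.3 * eps_{t-2}
x = eps[2:] + 0.6 * eps[1:-1] + 0.3 * eps[:-2]

result = ARIMA(x, order=(0, 0, 2)).fit()  # p=0 AR terms, d=0 differences, q=2 MA terms
print(result.params)  # constant, the two MA coefficients, and the error variance
```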

Autoregressive Moving Average models

Autoregressive Moving Average models (ARMA)

We include both AR and MA components:

\(x_t=\alpha + \sum_{i=1}^p\beta_ix_{t-i}+\epsilon_t+\sum_{i=1}^q\theta_i\epsilon_{t-i}\)

Estimated using the Box-Jenkins method: identify the orders \(p\) and \(q\) from the autocorrelations, estimate, then check the residuals.

Autoregressive Integrated Moving Average models (ARIMA)

Uses differencing to remove non-stationarity: ARIMA(\(p,d,q\)) fits an ARMA(\(p,q\)) model to the series after differencing it \(d\) times.

Also estimated with the Box-Jenkins method.
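A minimal sketch, assuming statsmodels: a random walk is non-stationary in levels but stationary in differences, so an ARIMA(\(1,1,1\)) fit differences it once before fitting the ARMA part.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500))  # non-stationary in levels, stationary in differences

result = ARIMA(y, order=(1, 1, 1)).fit()  # d=1: difference once before fitting ARMA(1,1)
print(result.params)
```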

Seasonal ARIMA

Seasonal ARIMA (SARIMA) adds seasonal AR, MA, and differencing terms that act at the seasonal period (for example, \(12\) for monthly data).

Forecasting

Monte Carlo simulations

Simulate many future paths by drawing shocks from the fitted model; the spread of the simulated paths gives a forecast distribution.

N-step ahead

Forecast \(n\) steps ahead by iterating the one-step forecast, feeding each prediction back in as an input; see the sketch below.
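A minimal sketch, assuming statsmodels and an illustrative AR(\(1\)) series; the forecast method iterates the one-step forecast internally.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(0)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.5 + 0.8 * x[t - 1] + rng.normal()  # AR(1) data

result = AutoReg(x, lags=1).fit()
print(result.forecast(steps=10))  # iterated forecasts for t+1, ..., t+10
```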

Consensus forecasting

Combine the forecasts of several models or forecasters, for example by averaging them.

Other

Identifying the order of integration using Augmented Dickey-Fuller

The Dickey-Fuller test with deterministic time trend was:

\(\Delta y_t=\alpha + \beta t + \gamma y_{t-1}+\epsilon_t\)

The Augmented Dickey-Fuller model adds lagged differences:

\(\Delta y_t=\alpha + \beta t + \gamma y_{t-1}+\sum_{i=1}^p \delta_i \Delta y_{t-i} + \epsilon_t\)

If the test fails to reject a unit root, difference the series and test again; the order of integration is the number of differences required to reach stationarity.
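A minimal sketch of this procedure, assuming statsmodels; integration_order is a hypothetical helper, and the 5% level and the I(\(2\)) example series are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def integration_order(y, max_d=3, alpha=0.05):
    """Difference until the ADF test rejects a unit root at level alpha."""
    for d in range(max_d + 1):
        if adfuller(y)[1] < alpha:  # index 1 is the p-value
            return d
        y = np.diff(y)
    return None  # still non-stationary after max_d differences

rng = np.random.default_rng(0)
y = np.cumsum(np.cumsum(rng.normal(size=500)))  # an I(2) series by construction
print(integration_order(y))  # expected: 2
```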

Identifying serial correlation using the Durbin-Watson statistic

The Durbin-Watson statistic tests regression residuals for first-order serial correlation: values near \(2\) indicate no serial correlation, values towards \(0\) positive correlation, and values towards \(4\) negative correlation.
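A minimal sketch, assuming statsmodels; the linear-trend regression is an illustrative example.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
t = np.arange(200, dtype=float)
y = 1.0 + 0.5 * t + rng.normal(size=200)  # linear trend plus IID noise

resid = sm.OLS(y, sm.add_constant(t)).fit().resid
print(durbin_watson(resid))  # near 2: no first-order serial correlation
```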