ARIMA#

ARIMA is an industry-standard name for a family of linear models that explain a time series using only its previous values. The name is an acronym for:

  • AutoRegressive.

  • Integrated.

  • Moving Average.

ARMA#

This model interprets the observed time series as a sum of two components:

  • \(x_t\): the deterministic part of the \(t\)-th element of the time series.

  • \(\varepsilon_t\): the random noise of the \(t\)-th element of the time series.

Thus, the observed value in the sample is \(y_{t} = x_t + \varepsilon_t\).

The ARMA model assumes that the \(t\)-th value of the time series (\(x_t\)) depends linearly on the \(p\) previous values of the time series (\(x_{t-1}, x_{t-2}, \ldots, x_{t-p}\)) and the \(q\) previous values of the random noise (\(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}\)).

It is typically written as the equation:

\[X_t - \alpha_1 X_{t-1} - \alpha_2 X_{t-2} - \ldots - \alpha_p X_{t-p} = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \ldots - \theta_q \varepsilon_{t-q}.\]

Where

  • \(\alpha_i,\ i = 1, \ldots, p\): the coefficient that describes how the \(t\)-th value of the time series depends on the \((t-i)\)-th value of the time series.

  • \(\theta_i,\ i = 1, \ldots, q\): the coefficient that describes how the \(t\)-th value of the time series depends on the random noise of the \((t-i)\)-th observation.

The official definition can be a bit confusing because it does not express the particular value of the time series explicitly. It can be rewritten using basic mathematical transformations as follows:

\[X_t - \varepsilon_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \ldots + \alpha_p X_{t-p} - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \ldots - \theta_q \varepsilon_{t-q}\]

Since the \(\varepsilon\) terms are just random noise, the signs in front of them are not important. We can rewrite the entire identity as follows:

\[X_t + \varepsilon_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \ldots + \alpha_p X_{t-p} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \ldots + \theta_q \varepsilon_{t-q}\]

The ARMA model can be applied only under the assumption that the time series is stationary, i.e., it has no trend and its variance is constant.
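As an illustration, the recursion above can be reproduced directly in NumPy. The following is a minimal sketch, not taken from the original text: the ARMA(2, 1) coefficients are hand-picked purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hand-picked ARMA(2, 1) coefficients (assumed only for this sketch).
alpha = [0.5, -0.3]   # autoregressive coefficients alpha_1, alpha_2
theta = [0.4]         # moving-average coefficient theta_1

n = 500
eps = rng.normal(size=n)   # random noise epsilon_t
x = np.zeros(n)

# x_t = sum_i alpha_i * x_{t-i} + eps_t + sum_j theta_j * eps_{t-j}
for t in range(n):
    ar_part = sum(a * x[t - i - 1] for i, a in enumerate(alpha) if t - i - 1 >= 0)
    ma_part = sum(th * eps[t - j - 1] for j, th in enumerate(theta) if t - j - 1 >= 0)
    x[t] = ar_part + eps[t] + ma_part

print(x[:5])
```

Terms that would refer to values before the start of the sample are simply skipped here, which is exactly the initialization issue discussed in the next section.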

Computation#

Because the ARMA model uses the previous \(\max(p, q)\) values to estimate the \(t\)-th element of the sequence, the procedure is recursive.

This creates the issue that we need initial values to start the process. Typically, the missing pre-sample values \(x_{-p}, \ldots, x_{-1}\) and \(\varepsilon_{-q}, \ldots, \varepsilon_{-1}\) required for the first \(\max(p, q)\) elements are either set to constants or estimated using backcasting.

Backcasting is a method of estimating pre-sample values of a time series by running the model equations backward in time to generate plausible initial conditions.
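The sketch below, which is not part of the original text, illustrates the initialization problem on an assumed AR(1)-style recursion: the pre-sample value is an arbitrary choice, and only the earliest predictions are affected by it.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(size=20).cumsum()  # a toy observed series

alpha1 = 0.8  # assumed AR(1) coefficient, chosen only for this sketch

def one_step_predictions(y, alpha1, presample):
    """Predict y_t from y_{t-1}; the very first step needs a pre-sample value."""
    prev = np.concatenate([[presample], y[:-1]])
    return alpha1 * prev

# Two different choices of the missing pre-sample value.
pred_zero = one_step_predictions(y, alpha1, presample=0.0)
pred_mean = one_step_predictions(y, alpha1, presample=y.mean())

# Only the first prediction differs -- that is exactly the initialization issue.
print(pred_zero[:3])
print(pred_mean[:3])
```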

Integration#

The ARMA model requires the series being explained to be stationary. To achieve stationarity, the integration procedure (more commonly called differencing) is typically applied. In this context, it is simply a transformation that subtracts the previous value of the time series from the current one:

\[\nabla x_t = x_t - x_{t-1}\]

Generally, the transformation can be applied several times:

\[\nabla^d x_t = \nabla^{d-1}x_t - \nabla^{d-1}x_{t-1}\]
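A quick way to try this transformation is NumPy's `np.diff`, which computes \(x_t - x_{t-1}\) and can be applied repeatedly through its `n` argument; the series below is a toy example.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])  # toy series with a trend

d1 = np.diff(x)        # first difference: x_t - x_{t-1}
d2 = np.diff(x, n=2)   # applying the transformation twice

print(d1)  # [1. 2. 3. 4.]
print(d2)  # [1. 1. 1.]
```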

\(ARIMA(p, d, q)\) stands for applying the ARMA model to the \(d\)-times differenced series \(\nabla^d x_t\).
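As a sketch of how this is typically done in practice, the `statsmodels` ARIMA implementation takes the `(p, d, q)` order directly and handles the differencing internally; the toy random-walk data below is assumed only for this example.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
y = rng.normal(size=300).cumsum()   # toy non-stationary series (a random walk)

# ARIMA(1, 1, 1): an ARMA(1, 1) model applied to the once-differenced series.
model = ARIMA(y, order=(1, 1, 1))
result = model.fit()

print(result.summary())
print(result.forecast(steps=5))   # forecast the next 5 values
```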

To determine how many times a time series must be differenced, the series is typically differenced repeatedly until a selected stationarity test indicates that it is stationary. The following tests can be applied in this case (a usage sketch follows the list):

  • ADF: Augmented Dickey-Fuller test.

  • KPSS: Kwiatkowski-Phillips-Schmidt-Shin test.

  • PP: Phillips-Perron test.
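The ADF and KPSS tests are available in `statsmodels`; the Phillips-Perron test is not, but an implementation exists in the third-party `arch` package. A minimal sketch, assuming `statsmodels` is installed and using a simulated random walk as the test series:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

rng = np.random.default_rng(2)
x = rng.normal(size=300).cumsum()   # a random walk, i.e. a non-stationary series

# ADF: the null hypothesis is a unit root (non-stationary series).
adf_stat, adf_pvalue = adfuller(x)[:2]

# KPSS: the null hypothesis is stationarity (note the reversed null).
kpss_stat, kpss_pvalue = kpss(x, nlags="auto")[:2]

print(f"ADF p-value:  {adf_pvalue:.3f}")   # high -> cannot reject non-stationarity
print(f"KPSS p-value: {kpss_pvalue:.3f}")  # low  -> reject stationarity
```

Because the null hypotheses of ADF and KPSS point in opposite directions, the two tests are often run together as a cross-check.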

Choosing the order#

There are a number of approaches that help determine which autoregressive and moving-average components should be included in the model; a short code sketch follows the list below.

  • If the first \(p\) partial autocorrelation coefficients are high and then there is a rapid cutoff, it may indicate that the model should include \(p\) autoregressive terms.

  • If the first \(q\) autocorrelation coefficients are high and then there is a rapid cutoff, it may indicate that the model should include \(q\) moving-average terms.
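A minimal sketch of how these coefficients can be inspected with `statsmodels` (`plot_acf` and `plot_pacf` from `statsmodels.graphics.tsaplots` draw the corresponding correlograms); the AR(2)-like toy series below is assumed only for this example, so its partial autocorrelations should cut off after lag 2.

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(3)

# A toy AR(2)-like series, so the PACF has a visible cutoff after lag 2.
n = 500
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + eps[t]

print(acf(x, nlags=5))    # autocorrelation coefficients -> hints at q
print(pacf(x, nlags=5))   # partial autocorrelation coefficients -> hints at p
```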