Create Autoregressive Models

These examples show how to create various autoregressive (AR) models by using the `arima` function.

Default AR Model

This example shows how to use the shorthand `arima(p,D,q)` syntax to specify the default AR($p$) model,

`${y}_{t}=c+{\varphi }_{1}{y}_{t-1}+\dots +{\varphi }_{p}{y}_{t-p}+{\epsilon }_{t}.$`

By default, all parameters in the created model object have unknown values, and the innovation distribution is Gaussian with constant variance.

Specify the default AR(2) model:

`Mdl = arima(2,0,0)`
```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: NaN
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The output shows that the created model object, `Mdl`, has `NaN` values for all model parameters: the constant term, the AR coefficients, and the variance. You can modify the created model object using dot notation, or input it (along with data) to `estimate`.
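As a minimal sketch of the dot-notation route, the following fixes the constant of the default AR(2) model at zero and leaves the other parameters unknown. The value `0` is only an illustrative choice.

```matlab
% Create the default AR(2) model; every parameter starts as NaN.
Mdl = arima(2,0,0);

% Fix the constant at zero (illustrative value). The AR coefficients
% and the variance stay NaN, so they remain estimable.
Mdl.Constant = 0;

% Confirm which parameters are still unknown.
arStillUnknown = all(isnan(cell2mat(Mdl.AR)));
varStillUnknown = isnan(Mdl.Variance);
```

Properties that hold `NaN` after such an edit are the ones left for `estimate` to fit.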

AR Model with No Constant Term

This example shows how to specify an AR(p) model with constant term equal to zero. Use name-value syntax to specify a model that differs from the default model.

Specify an AR(2) model with no constant term,

`${y}_{t}={\varphi }_{1}{y}_{t-1}+{\varphi }_{2}{y}_{t-2}+{\epsilon }_{t},$`

where the innovation distribution is Gaussian with constant variance.

`Mdl = arima('ARLags',1:2,'Constant',0)`
```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The `ARLags` name-value argument specifies the lags corresponding to nonzero AR coefficients. The property `Constant` in the created model object is equal to `0`, as specified. The model object has default values for all other properties, including `NaN` values as placeholders for the unknown parameters: the AR coefficients and scalar variance.

You can modify the created model object using dot notation, or input it (along with data) to `estimate`.
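The following sketch illustrates the `estimate` route on artificial data. The data-generating model `TrueMdl`, its coefficients (0.5 and -0.3), the sample size, and the random seed are all assumptions made for illustration. They are not part of the example above.

```matlab
rng(0)  % fix the seed so the simulated data is reproducible

% Hypothetical data-generating AR(2) process with no constant term.
TrueMdl = arima('Constant',0,'AR',{0.5 -0.3},'Variance',1);
y = simulate(TrueMdl,500);

% Template with unknown AR coefficients and variance, and a zero constant.
Mdl = arima('ARLags',1:2,'Constant',0);

% Fit the template to the data, suppressing the estimation display.
EstMdl = estimate(Mdl,y,'Display','off');
```

Because `Constant` is set to `0` in the template, `estimate` holds it fixed and fits only the `NaN`-valued parameters.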

AR Model with Nonconsecutive Lags

This example shows how to specify an AR(p) model with nonzero coefficients at nonconsecutive lags.

Specify an AR(4) model with nonzero AR coefficients at lags 1 and 4 (and no constant term),

`${y}_{t}={\varphi }_{1}{y}_{t-1}+{\varphi }_{4}{y}_{t-4}+{\epsilon }_{t},$`

where the innovation distribution is Gaussian with constant variance.

`Mdl = arima('ARLags',[1,4],'Constant',0)`
```
Mdl = 
  arima with properties:

     Description: "ARIMA(4,0,0) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 4
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 4]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The output shows the nonzero AR coefficients at lags 1 and 4, as specified. The property `P` is equal to `4`, the number of presample observations needed to initialize the AR model. The unconstrained parameters are equal to `NaN`.

Display the value of `AR`:

`Mdl.AR`
```
ans=1×4 cell array
    {[NaN]}    {[0]}    {[0]}    {[NaN]}
```

The `AR` cell array returns four elements. The first and last elements (corresponding to lags 1 and 4) have value `NaN`, indicating these coefficients are nonzero and need to be estimated or otherwise specified by the user. `arima` sets the coefficients at interim lags equal to zero to maintain consistency with MATLAB® cell array indexing.
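One way to recover the lag structure from the `AR` property is to convert the cell array to a numeric vector and test its elements. This is a sketch using base MATLAB functions only. The variable names are illustrative.

```matlab
Mdl = arima('ARLags',[1,4],'Constant',0);

% Convert the 1-by-4 cell array to a numeric row vector.
arCoeffs = cell2mat(Mdl.AR);

% NaN entries mark lags whose coefficients still need values.
estimableLags = find(isnan(arCoeffs));

% Zero entries mark lags that are excluded from the model.
excludedLags = find(arCoeffs == 0);
```

Here `estimableLags` contains lags 1 and 4, and `excludedLags` contains lags 2 and 3.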

ARMA Model with Known Parameter Values

This example shows how to specify an ARMA(p, q) model with known parameter values. You can use such a fully specified model as an input to `simulate` or `forecast`.

Specify the ARMA(1,1) model

`${y}_{t}=0.3+0.7{y}_{t-1}+{\epsilon }_{t}+0.4{\epsilon }_{t-1},$`

where the innovation distribution is Student's t with 8 degrees of freedom, and constant variance 0.15.

```
tdist = struct('Name','t','DoF',8);
Mdl = arima('Constant',0.3,'AR',0.7,'MA',0.4,...
    'Distribution',tdist,'Variance',0.15)
```
```
Mdl = 
  arima with properties:

     Description: "ARIMA(1,0,1) Model (t Distribution)"
    Distribution: Name = "t", DoF = 8
               P: 1
               D: 0
               Q: 1
        Constant: 0.3
              AR: {0.7} at lag [1]
             SAR: {}
              MA: {0.4} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.15
```

All parameter values are specified, that is, no object property is `NaN`-valued.
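Because the model is fully specified, you can pass it to `simulate` and `forecast` without estimating anything first. The following sketch assumes a recent release in which `forecast` accepts the presample responses as its third positional input. The path length, forecast horizon, and random seed are arbitrary illustrative choices.

```matlab
tdist = struct('Name','t','DoF',8);
Mdl = arima('Constant',0.3,'AR',0.7,'MA',0.4,...
    'Distribution',tdist,'Variance',0.15);

rng(1)                       % fix the seed for reproducibility

% Simulate one response path of 200 observations.
y = simulate(Mdl,200);

% Forecast 10 periods ahead, using the simulated path as presample data.
yF = forecast(Mdl,10,y);
```

Both outputs are column vectors with one row per period.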

AR Model with t Innovation Distribution

This example shows how to specify an AR($p$) model with a Student's t innovation distribution.

Specify an AR(2) model with no constant term,

`${y}_{t}={\varphi }_{1}{y}_{t-1}+{\varphi }_{2}{y}_{t-2}+{\epsilon }_{t},$`

where the innovations follow a Student's t distribution with unknown degrees of freedom.

`Mdl = arima('Constant',0,'ARLags',1:2,'Distribution','t')`
```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (t Distribution)"
    Distribution: Name = "t", DoF = NaN
               P: 2
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The value of `Distribution` is a `struct` array with field `Name` equal to `'t'` and field `DoF` equal to `NaN`. The `NaN` value indicates the degrees of freedom are unknown, and need to be estimated using `estimate` or otherwise specified by the user.
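If you prefer to supply the degrees of freedom rather than estimate them, one option is to overwrite `Distribution` with a structure that includes the `DoF` field. The value 8 in this sketch is an arbitrary illustrative choice.

```matlab
Mdl = arima('Constant',0,'ARLags',1:2,'Distribution','t');

% Replace the unknown (NaN) degrees of freedom with a fixed value.
% The value 8 is illustrative only.
Mdl.Distribution = struct('Name','t','DoF',8);

dofValue = Mdl.Distribution.DoF;
```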

Specify AR Model Using Econometric Modeler App

In the Econometric Modeler app, you can specify the lag structure, presence of a constant, and innovation distribution of an AR(p) model by following these steps. All specified coefficients are unknown, estimable parameters.

1. At the command line, open the Econometric Modeler app.

`econometricModeler`

Alternatively, open the app from the apps gallery (see Econometric Modeler).

2. In the Time Series pane, select the response time series to which the model will be fit.

3. On the Econometric Modeler tab, in the Models section, click AR.

The AR Model Parameters dialog box appears.

4. Specify the lag structure. To specify an AR(p) model that includes all AR lags from 1 through p, use the Lag Order tab. For the flexibility to specify the inclusion of particular lags, use the Lag Vector tab. For more details, see Specifying Univariate Lag Operator Polynomials Interactively. Regardless of the tab you use, you can verify the model form by inspecting the equation in the Model Equation section.

For example:

• To specify an AR(2) model that includes a constant, includes lags 1 and 2, and has a Gaussian innovation distribution, set Autoregressive Order to `2`.

• To specify an AR(2) model that includes lags 1 and 2, has a Gaussian innovation distribution, but does not include a constant:

1. Set Autoregressive Order to `2`.

2. Clear the Include Constant Term check box.

• To specify an AR(4) model containing nonconsecutive lags

`${y}_{t}={\varphi }_{1}{y}_{t-1}+{\varphi }_{4}{y}_{t-4}+{\epsilon }_{t},$`

where ${\epsilon }_{t}$ is a series of IID Gaussian innovations:

1. Click the Lag Vector tab.

2. Set Autoregressive Lags to `1 4`.

3. Clear the Include Constant Term check box.

• To specify an AR(2) model that includes lags 1 and 2, includes a constant term, and has t-distributed innovations:

1. Set Autoregressive Order to `2`.

2. Click the Innovation Distribution button, then select `t`.

The degrees of freedom parameter of the t distribution is an unknown but estimable parameter.

After you specify a model, click Estimate to estimate all unknown parameters in the model.
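For comparison, the four app specifications in the examples above can be written as command-line model templates. This is a sketch of the correspondence, not output exported from the app. The variable names are illustrative.

```matlab
% AR(2) with a constant and Gaussian innovations.
MdlGauss = arima(2,0,0);

% AR(2) with Gaussian innovations and no constant.
MdlNoConst = arima('ARLags',1:2,'Constant',0);

% AR(4) with lags 1 and 4 only, Gaussian innovations, and no constant.
MdlSparse = arima('ARLags',[1 4],'Constant',0);

% AR(2) with a constant and t-distributed innovations.
MdlT = arima('ARLags',1:2,'Distribution','t');
```

In each template, the `NaN`-valued properties play the role of the unknown, estimable parameters in the app.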

What Are Autoregressive Models?

AR(p) Model

Many observed time series exhibit serial autocorrelation; that is, linear association between lagged observations. This suggests past observations might predict current observations. The autoregressive (AR) process models the conditional mean of ${y}_{t}$ as a function of past observations, ${y}_{t-1},{y}_{t-2},\dots ,{y}_{t-p}$. An AR process that depends on p past observations is called an AR model of degree p, denoted by AR(p).

The form of the AR(p) model in Econometrics Toolbox™ is

 ${y}_{t}=c+{\varphi }_{1}{y}_{t-1}+\dots +{\varphi }_{p}{y}_{t-p}+{\epsilon }_{t},$ (1)
where ${\epsilon }_{t}$ is an uncorrelated innovation process with mean zero.

In lag operator polynomial notation, ${L}^{i}{y}_{t}={y}_{t-i}$. Define the degree p AR lag operator polynomial $\varphi \left(L\right)=\left(1-{\varphi }_{1}L-\dots -{\varphi }_{p}{L}^{p}\right)$ . You can write the AR(p) model as

 $\varphi \left(L\right){y}_{t}=c+{\epsilon }_{t}.$ (2)
The signs of the coefficients in the AR lag operator polynomial, $\varphi \left(L\right)$, are opposite to the right side of Equation 1. When specifying and interpreting AR coefficients in Econometrics Toolbox, use the form in Equation 1.
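The sign flip is easy to get wrong, so here is a small numeric sketch in base MATLAB. The coefficient values 0.5 and -0.2 are hypothetical.

```matlab
% Coefficients as they appear on the right side of Equation 1
% (hypothetical values): y_t = 0.5*y_{t-1} - 0.2*y_{t-2} + e_t.
phi = [0.5 -0.2];

% Coefficients of the lag operator polynomial in Equation 2, in
% ascending powers of L: 1 - 0.5*L + 0.2*L^2.
lagPolyCoeffs = [1, -phi];
```

The values you pass to `arima` are the elements of `phi`, not the negated values that appear in `lagPolyCoeffs`.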

Stationarity of the AR Model

Consider the AR(p) model in lag operator notation,

`$\varphi \left(L\right){y}_{t}=c+{\epsilon }_{t}.$`

From this expression, you can see that

 ${y}_{t}=\mu +{\varphi }^{-1}\left(L\right){\epsilon }_{t}=\mu +\psi \left(L\right){\epsilon }_{t},$ (3)
where

`$\mu =\frac{c}{\left(1-{\varphi }_{1}-\dots -{\varphi }_{p}\right)}$`

is the unconditional mean of the process, and $\psi \left(L\right)$ is an infinite-degree lag operator polynomial, $\left(1+{\psi }_{1}L+{\psi }_{2}{L}^{2}+\dots \right)$.

Note

The `Constant` property of an `arima` model object corresponds to c, and not the unconditional mean μ.
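To make the distinction concrete, the following sketch computes the unconditional mean from the constant and the AR coefficients. It reuses the values `c = 0.3` and a single AR coefficient of `0.7` from the ARMA(1,1) example. The moving average term does not affect the mean.

```matlab
c = 0.3;     % value of the Constant property
phi = 0.7;   % AR coefficient(s); use a vector for p > 1

% Unconditional mean: mu = c / (1 - phi_1 - ... - phi_p).
mu = c/(1 - sum(phi));
```

Here `mu` is approximately 1, which is more than three times the value stored in `Constant`.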

By Wold’s decomposition [2], Equation 3 corresponds to a stationary stochastic process provided the coefficients ${\psi }_{i}$ are absolutely summable. This is the case when the AR polynomial, $\varphi \left(L\right)$, is stable, meaning all its roots lie outside the unit circle.
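As an illustration, you can compute the leading ${\psi }_{i}$ coefficients as the impulse response of the AR recursion. The sketch below uses the base MATLAB `filter` function with hypothetical coefficients 0.5 and -0.2. It is not a description of how the toolbox computes them.

```matlab
phi = [0.5 -0.2];   % hypothetical AR(2) coefficients in Equation 1 form
nLags = 20;         % number of psi coefficients to compute beyond psi_0

% Feed a unit impulse through 1/phi(L). The output holds
% psi_0, psi_1, ..., psi_nLags, so psi(1) corresponds to psi_0 = 1.
impulse = [1 zeros(1,nLags)];
psi = filter(1,[1 -phi],impulse);

% Partial sum of absolute values; it levels off when the polynomial is stable.
partialAbsSum = cumsum(abs(psi));
```

For these coefficients, ${\psi }_{1}=0.5$ and ${\psi }_{2}=0.5\cdot 0.5-0.2=0.05$, and the later coefficients decay toward zero.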

Econometrics Toolbox enforces stability of the AR polynomial. When you specify an AR model using `arima`, you get an error if you enter coefficients that do not correspond to a stable polynomial. Similarly, `estimate` imposes stationarity constraints during estimation.
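To check a candidate set of coefficients before passing them to `arima`, you can test the root condition directly. This sketch uses the base MATLAB `roots` function and hypothetical coefficient values. It is not the toolbox's internal check.

```matlab
phi = [0.5 -0.2];   % hypothetical AR(2) coefficients in Equation 1 form

% roots expects coefficients in descending powers, so build
% -phi_p*z^p - ... - phi_1*z + 1 from the Equation 1 coefficients.
polyDesc = [-fliplr(phi) 1];
z = roots(polyDesc);

% The polynomial is stable when every root lies outside the unit circle.
isStablePoly = all(abs(z) > 1);

% A single coefficient of 1.1 gives the root 1/1.1, which is inside
% the unit circle, so that AR(1) polynomial is not stable.
isStableExplosive = all(abs(roots([-1.1 1])) > 1);
```

In this sketch `isStablePoly` is `true` and `isStableExplosive` is `false`.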

References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Wold, Herman. "A Study in the Analysis of Stationary Time Series." Journal of the Institute of Actuaries 70 (March 1939): 113–115. https://doi.org/10.1017/S0020268100011574.