Create Autoregressive Models

These examples show how to create various autoregressive (AR) models by using the arima function.

Default AR Model

This example shows how to use the shorthand arima(p,D,q) syntax to specify the default AR( $p$ ) model,

$y_{t} = c + ϕ_{1} y_{t - 1} + \dots + ϕ_{p} y_{t - p} + ε_{t} .$

By default, all parameters in the created model object have unknown values, and the innovation distribution is Gaussian with constant variance.

Specify the default AR(2) model:

```
Mdl = arima(2,0,0)
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: NaN
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The output shows that the created model object, Mdl, has NaN values for all model parameters: the constant term, the AR coefficients, and the variance. You can modify the created model object using dot notation, or input it (along with data) to estimate.
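As a minimal sketch of the dot-notation route, the following lines place constraints on the default AR(2) template after it is created. The specific values (a zero constant and a variance of 0.5) are assumptions chosen only for illustration.

```matlab
Mdl = arima(2,0,0);      % default AR(2) template: every parameter is NaN

% Hypothetical constraints, chosen only for illustration
Mdl.Constant = 0;        % hold the constant term at zero
Mdl.Variance = 0.5;      % hold the innovation variance at 0.5

% The AR coefficients are still unknown (NaN) placeholders
stillUnknown = cellfun(@isnan,Mdl.AR);
```

Properties set this way are treated as known values, while the remaining NaN entries stay available for estimation.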

AR Model with No Constant Term

This example shows how to specify an AR(p) model with constant term equal to zero. Use name-value syntax to specify a model that differs from the default model.

Specify an AR(2) model with no constant term,

$y_{t} = ϕ_{1} y_{t - 1} + ϕ_{2} y_{t - 2} + ε_{t},$

where the innovation distribution is Gaussian with constant variance.

```
Mdl = arima('ARLags',1:2,'Constant',0)
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The ARLags name-value argument specifies the lags corresponding to nonzero AR coefficients. The property Constant in the created model object is equal to 0, as specified. The model object has default values for all other properties, including NaN values as placeholders for the unknown parameters: the AR coefficients and scalar variance.

You can modify the created model object using dot notation, or input it (along with data) to estimate.
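The estimation route might look like the following sketch. The data here are simulated from a hypothetical AR(2) process whose coefficients (0.6 and -0.2) and unit variance are assumptions made for this illustration; in practice you would pass your own response series to estimate.

```matlab
rng(1);   % fix the random seed so the sketch is repeatable

% Hypothetical data-generating process (values are assumptions)
TrueMdl = arima('Constant',0,'AR',{0.6 -0.2},'Variance',1);
y = simulate(TrueMdl,500);            % one simulated path of 500 observations

% Template with the constant held at zero; AR terms and variance are NaN
Mdl = arima('ARLags',1:2,'Constant',0);

% estimate fits the NaN-valued parameters and keeps specified values fixed
EstMdl = estimate(Mdl,y,'Display','off');

EstMdl.Constant   % remains 0 because it was specified in the template
EstMdl.AR         % estimated coefficients at lags 1 and 2
```

Because Constant is set to 0 in the template, estimate treats it as an equality constraint rather than a free parameter.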

AR Model with Nonconsecutive Lags

This example shows how to specify an AR(p) model with nonzero coefficients at nonconsecutive lags.

Specify an AR(4) model with nonzero AR coefficients at lags 1 and 4 (and no constant term),

$y_{t} = ϕ_{1} y_{t - 1} + ϕ_{4} y_{t - 4} + ε_{t},$

where the innovation distribution is Gaussian with constant variance.

```
Mdl = arima('ARLags',[1,4],'Constant',0)
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(4,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 4
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 4]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The output shows the nonzero AR coefficients at lags 1 and 4, as specified. The property P is equal to 4, the number of presample observations needed to initialize the AR model. The unconstrained parameters are equal to NaN.

Display the value of AR:

```
Mdl.AR
```

```
ans=1×4 cell array
    {[NaN]}    {[0]}    {[0]}    {[NaN]}
```

The AR cell array returns four elements. The first and last elements (corresponding to lags 1 and 4) have value NaN, indicating these coefficients are nonzero and need to be estimated or otherwise specified by the user. arima sets the coefficients at interim lags equal to zero to maintain consistency with MATLAB® cell array indexing.
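The same zero-filling applies when the coefficients at the nonconsecutive lags are known. The sketch below pairs the AR and ARLags name-value arguments; the coefficient values 0.8 and -0.1 are hypothetical.

```matlab
% Hypothetical coefficient values at lags 1 and 4 (assumptions)
Mdl = arima('AR',{0.8 -0.1},'ARLags',[1 4],'Constant',0);

Mdl.P                       % 4: the highest lag sets the presample requirement
coeffs = cell2mat(Mdl.AR)   % [0.8 0 0 -0.1]: lags 2 and 3 are filled with zeros
```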

ARMA Model with Known Parameter Values

This example shows how to specify an ARMA(p, q) model with known parameter values. You can use such a fully specified model as an input to simulate or forecast.

Specify the ARMA(1,1) model

$y_{t} = 0.3 + 0.7 y_{t - 1} + ε_{t} + 0.4 ε_{t - 1},$

where the innovation distribution is Student's t with 8 degrees of freedom, and constant variance 0.15.

```
tdist = struct('Name','t','DoF',8);
Mdl = arima('Constant',0.3,'AR',0.7,'MA',0.4,...
    'Distribution',tdist,'Variance',0.15)
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(1,0,1) Model (t Distribution)"
      SeriesName: "Y"
    Distribution: Name = "t", DoF = 8
               P: 1
               D: 0
               Q: 1
        Constant: 0.3
              AR: {0.7} at lag [1]
             SAR: {}
              MA: {0.4} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.15
```

All parameter values are specified, that is, no object property is NaN-valued.
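Because the model is fully specified, it can be passed directly to simulate or forecast. The following is a minimal sketch; the path length (200) and forecast horizon (10) are arbitrary choices, and the positional presample argument of forecast assumes a reasonably recent release (older releases pass the presample through the 'Y0' name-value argument).

```matlab
rng(1);   % repeatable random draws

tdist = struct('Name','t','DoF',8);
Mdl = arima('Constant',0.3,'AR',0.7,'MA',0.4,...
    'Distribution',tdist,'Variance',0.15);

y = simulate(Mdl,200);            % simulate one path of length 200 (arbitrary)
[yF,yMSE] = forecast(Mdl,10,y);   % 10-step-ahead forecasts, using y as presample

% Unconditional mean implied by the model: c/(1 - phi_1) = 0.3/(1 - 0.7) = 1
mu = Mdl.Constant/(1 - Mdl.AR{1});
```

As the horizon grows, the forecasts of a stationary model like this one revert toward the unconditional mean mu.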

AR Model with t Innovation Distribution

This example shows how to specify an AR( $p$ ) model with a Student's t innovation distribution.

Specify an AR(2) model with no constant term,

$y_{t} = ϕ_{1} y_{t - 1} + ϕ_{2} y_{t - 2} + ε_{t},$

where the innovations follow a Student's t distribution with unknown degrees of freedom.

```
Mdl = arima('Constant',0,'ARLags',1:2,'Distribution','t')
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (t Distribution)"
      SeriesName: "Y"
    Distribution: Name = "t", DoF = NaN
               P: 2
               D: 0
               Q: 0
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

The value of Distribution is a struct array with field Name equal to 't' and field DoF equal to NaN. The NaN value indicates the degrees of freedom are unknown, and need to be estimated using estimate or otherwise specified by the user.
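If the degrees of freedom are known rather than estimated, one option is to assign a structure to the Distribution property. This is a short sketch in which the value 10 is a hypothetical choice.

```matlab
Mdl = arima('Constant',0,'ARLags',1:2,'Distribution','t');
dofUnknown = isnan(Mdl.Distribution.DoF)   % true: degrees of freedom are unknown

% Hypothetical choice: fix the degrees of freedom at 10
Mdl.Distribution = struct('Name','t','DoF',10);
Mdl.Distribution.DoF                       % 10
```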

Specify AR Model Using Econometric Modeler App

In the Econometric Modeler app, you can specify the lag structure, presence of a constant, and innovation distribution of an AR(p) model by following these steps. All specified coefficients are unknown, estimable parameters.

1. At the command line, open the Econometric Modeler app.
```
econometricModeler
```
Alternatively, open the app from the apps gallery (see Econometric Modeler).
2. In the Time Series pane, select the response time series to which the model will be fit.
3. On the Econometric Modeler tab, in the Models section, click AR. The AR Model Parameters dialog box appears.
4. Specify the lag structure. To specify an AR(p) model that includes all AR lags from 1 through p, use the Lag Order tab. For the flexibility to specify the inclusion of particular lags, use the Lag Vector tab. For more details, see Specifying Univariate Lag Operator Polynomials Interactively. Regardless of the tab you use, you can verify the model form by inspecting the equation in the Model Equation section.

For example:

To specify an AR(2) model that includes a constant, includes lags 1 and 2, and has a Gaussian innovation distribution, set Autoregressive Order to 2.
To specify an AR(2) model that includes lags 1 and 2, has a Gaussian innovation distribution, but does not include a constant:
1. Set Autoregressive Order to 2.
2. Clear the Include Constant Term check box.
To specify an AR(4) model containing nonconsecutive lags,
$y_{t} = ϕ_{1} y_{t - 1} + ϕ_{4} y_{t - 4} + ε_{t},$
where $ε_{t}$ is a series of IID Gaussian innovations:
1. Click the Lag Vector tab.
2. Set Autoregressive Lags to 1 4.
3. Clear the Include Constant Term check box.
To specify an AR(2) model that includes lags 1 and 2, includes a constant term, and has t-distributed innovations:
1. Set Autoregressive Order to 2.
2. Click the Innovation Distribution button, then select t.
The degrees of freedom parameter of the t distribution is an unknown but estimable parameter.

After you specify a model, click Estimate to estimate all unknown parameters in the model.

AR(p) Model

Many observed time series exhibit serial autocorrelation; that is, linear association between lagged observations. This suggests past observations might predict current observations. The autoregressive (AR) process models the conditional mean of $y_{t}$ as a function of past observations, $y_{t - 1}, y_{t - 2}, \dots, y_{t - p}$. An AR process that depends on p past observations is called an AR model of degree p, denoted by AR(p).

The form of the AR(p) model in Econometrics Toolbox™ is

$y_{t} = c + ϕ_{1} y_{t - 1} + \dots + ϕ_{p} y_{t - p} + ε_{t},$ (1)

where $ε_{t}$ is an uncorrelated innovation process with mean zero.

In lag operator polynomial notation, $L^{i} y_{t} = y_{t - i}$. Define the degree p AR lag operator polynomial $ϕ (L) = (1 - ϕ_{1} L - \dots - ϕ_{p} L^{p})$. You can write the AR(p) model as

$ϕ (L) y_{t} = c + ε_{t}.$ (2)

The signs of the coefficients in the AR lag operator polynomial, $ϕ (L)$, are opposite to the right side of Equation 1. When specifying and interpreting AR coefficients in Econometrics Toolbox, use the form in Equation 1.
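To make the sign convention concrete, the sketch below builds the lag operator polynomial that corresponds to a model specified in the Equation 1 form. It uses the toolbox LagOp class; the coefficient values 0.6 and -0.2 are hypothetical.

```matlab
% Hypothetical AR(2) coefficients in the Equation 1 convention
Mdl = arima('AR',{0.6 -0.2},'Constant',0.5,'Variance',1);

% Build phi(L) = 1 - 0.6L + 0.2L^2: lag 0 is 1, the other lags flip sign
phi = LagOp([{1}, cellfun(@(a) -a,Mdl.AR,'UniformOutput',false)]);

polyCoeffs = cell2mat(toCellArray(phi))   % [1 -0.6 0.2]
```

The AR property stores 0.6 and -0.2, while the polynomial coefficients at lags 1 and 2 are -0.6 and 0.2.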

Stationarity of the AR Model

Consider the AR(p) model in lag operator notation,

$ϕ (L) y_{t} = c + ε_{t} .$

From this expression, you can see that

$y_{t} = μ + ϕ^{- 1} (L) ε_{t} = μ + ψ (L) ε_{t},$ (3)

where

$μ = \frac{c}{(1 - ϕ_{1} - \dots - ϕ_{p})}$

is the unconditional mean of the process, and $ψ (L)$ is an infinite-degree lag operator polynomial, $(1 + ψ_{1} L + ψ_{2} L^{2} + \dots)$.

Note

The Constant property of an arima model object corresponds to c, and not the unconditional mean μ.
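The distinction between c and μ, and the first few ψ weights, can be checked numerically with base MATLAB functions. This sketch assumes a hypothetical AR(2) model with c = 0.5, ϕ_1 = 0.6, and ϕ_2 = -0.2.

```matlab
% Hypothetical AR(2) parameters in the Equation 1 convention
c   = 0.5;
phi = [0.6 -0.2];

mu = c/(1 - sum(phi));                % unconditional mean: 0.5/0.6, not 0.5

% First 6 psi weights: the impulse response of 1/phi(L)
impulseIn = [1 zeros(1,5)];
psi = filter(1,[1 -phi],impulseIn);   % psi_0 = 1, psi_1 = 0.6, psi_2 = 0.16, ...
```

The weights follow the recursion ψ_j = ϕ_1 ψ_{j-1} + ϕ_2 ψ_{j-2}, so ψ_2 = 0.6(0.6) - 0.2(1) = 0.16.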

By Wold’s decomposition [2], Equation 3 corresponds to a stationary stochastic process provided the coefficients $ψ_{i}$ are absolutely summable. This is the case when the AR polynomial, $ϕ (L)$ , is stable, meaning all its roots lie outside the unit circle.

Econometrics Toolbox enforces stability of the AR polynomial. When you specify an AR model using arima, you get an error if you enter coefficients that do not correspond to a stable polynomial. Similarly, estimate imposes stationarity constraints during estimation.
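A quick way to anticipate whether arima will accept a set of coefficients is to inspect the roots of the AR polynomial yourself. The sketch below does this with the base MATLAB roots function for hypothetical coefficients, then shows the error path for an unstable AR(1) coefficient.

```matlab
% Stable case: roots of phi(z) = 1 - 0.6z + 0.2z^2 (hypothetical values)
phi = [0.6 -0.2];
r = roots([-fliplr(phi) 1]);     % roots expects descending powers of z
isStablePoly = all(abs(r) > 1)   % true: both roots have modulus sqrt(5)

% Unstable case: an AR(1) coefficient of 1.2 puts the root at 1/1.2 < 1
try
    arima('AR',1.2);
catch err
    disp(err.message)            % arima rejects the unstable polynomial
end
```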

References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Wold, Herman. "A Study in the Analysis of Stationary Time Series." Journal of the Institute of Actuaries 70 (March 1939): 113–115. https://doi.org/10.1017/S0020268100011574.