## Specify ARIMA Error Model Innovation Distribution

### About the Innovation Process

A regression model with ARIMA errors has the following general form:

$$\begin{array}{c}{y}_{t}=c+{X}_{t}\beta +{u}_{t}\\ a\left(L\right)A\left(L\right){\left(1-L\right)}^{D}\left(1-{L}^{s}\right){u}_{t}=b\left(L\right)B\left(L\right){\epsilon}_{t},\end{array}\tag{1}$$

- *t* = 1,...,*T*.
- $${y}_{t}$$ is the response series.
- $${X}_{t}$$ is row *t* of *X*, which is the matrix of concatenated predictor data vectors. That is, $${X}_{t}$$ is observation *t* of each predictor series.
- *c* is the regression model intercept.
- *β* is the regression coefficient.
- $${u}_{t}$$ is the disturbance series.
- $${\epsilon}_{t}$$ is the innovations series.
- $${L}^{j}{y}_{t}={y}_{t-j}.$$
- $$a\left(L\right)=\left(1-{a}_{1}L-\mathrm{...}-{a}_{p}{L}^{p}\right),$$ which is the degree *p*, nonseasonal autoregressive polynomial.
- $$A\left(L\right)=\left(1-{A}_{1}L-\mathrm{...}-{A}_{{p}_{s}}{L}^{{p}_{s}}\right),$$ which is the degree $${p}_{s}$$, seasonal autoregressive polynomial.
- $${\left(1-L\right)}^{D},$$ which is the degree *D*, nonseasonal integration polynomial.
- $$\left(1-{L}^{s}\right),$$ which is the degree *s*, seasonal integration polynomial.
- $$b\left(L\right)=\left(1+{b}_{1}L+\mathrm{...}+{b}_{q}{L}^{q}\right),$$ which is the degree *q*, nonseasonal moving average polynomial.
- $$B\left(L\right)=\left(1+{B}_{1}L+\mathrm{...}+{B}_{{q}_{s}}{L}^{{q}_{s}}\right),$$ which is the degree $${q}_{s}$$, seasonal moving average polynomial.

Suppose that the unconditional disturbance series ($${u}_{t}$$) is a stationary stochastic process. Then, you can express the second equation in Equation 1 as

$${u}_{t}={a}^{-1}(L){A}^{-1}(L){(1-L)}^{-D}{(1-{L}^{s})}^{-1}b(L)B(L){\epsilon}_{t}=\Psi (L){\epsilon}_{t},$$

where *Ψ*(*L*) is an infinite degree lag operator polynomial [2].
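
As a small worked case (an illustrative assumption, not a model taken from this page), consider AR(1) errors with no integration, no seasonal terms, and no moving average terms, so that $$a(L)=1-{a}_{1}L$$ with $$|{a}_{1}|<1$$. Expanding the inverse polynomial as a geometric series shows what the infinite-degree polynomial looks like in this special case:

$$\begin{array}{rcl}\Psi (L)& =& {\left(1-{a}_{1}L\right)}^{-1}=\sum _{j=0}^{\infty}{a}_{1}^{j}{L}^{j},\\ {u}_{t}& =& \Psi (L){\epsilon}_{t}=\sum _{j=0}^{\infty}{a}_{1}^{j}{\epsilon}_{t-j}.\end{array}$$

Each disturbance is therefore a weighted sum of the current and all past innovations, with weights that decay geometrically when $$|{a}_{1}|<1$$.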

The innovation process ($${\epsilon}_{t}$$) is an independent and identically distributed (iid), mean 0 process with a known distribution. Econometrics Toolbox™ generalizes the innovation process to $${\epsilon}_{t}=\sigma {z}_{t}$$, where $${z}_{t}$$ is a series of iid random variables with mean 0 and variance 1, and $${\sigma}^{2}$$ is the constant variance of $${\epsilon}_{t}$$.

`regARIMA` models contain two properties that describe the distribution of $${\epsilon}_{t}$$:

- `Variance` stores $${\sigma}^{2}$$.
- `Distribution` stores the parametric form of $${z}_{t}$$.
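
The decomposition of the innovations into a scale and a standardized series can be sketched numerically. The following is a minimal illustration, assuming Gaussian $${z}_{t}$$ and a hypothetical variance of 0.5; the variable names `sigma`, `z`, and `epsilon` are chosen here for illustration and are not part of the `regARIMA` interface.

```matlab
% Sketch: build innovations as epsilon_t = sigma*z_t (Gaussian z_t assumed).
rng(1)                                        % fix the seed for reproducibility
Mdl = regARIMA('ARLags',1:2,'Variance',0.5);  % 0.5 is a hypothetical sigma^2
sigma = sqrt(Mdl.Variance);                   % innovation standard deviation
z = randn(1e5,1);                             % iid draws with mean 0, variance 1
epsilon = sigma*z;                            % innovations with variance sigma^2
sampleMean = mean(epsilon)                    % should be close to 0
sampleVar = var(epsilon)                      % should be close to Mdl.Variance
```

With a large sample, the sample mean and variance of `epsilon` should land near 0 and `Mdl.Variance`, respectively.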

### Innovation Distribution Options

- The default value of `Variance` is `NaN`, meaning that the innovation variance is unknown. You can assign a positive scalar to `Variance` when you specify the model using the name-value pair argument `'Variance',sigma2` (where `sigma2` = $${\sigma}^{2}$$), or by modifying an existing model using dot notation. Alternatively, you can estimate `Variance` using `estimate`.
- You can specify the following distributions for $${z}_{t}$$ (using name-value pair arguments or dot notation):
  - Standard Gaussian
  - Standardized Student’s *t* with degrees of freedom *ν* > 2. Specifically,

    $${z}_{t}={T}_{\nu}\sqrt{\frac{\nu -2}{\nu}},$$

    where $${T}_{\nu}$$ is a Student’s *t* distribution with degrees of freedom *ν* > 2.
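
To see why this scaling yields a unit-variance series, recall that a Student’s *t* random variable with *ν* > 2 degrees of freedom has mean 0 and variance *ν*/(*ν* − 2). A short check under that standard result:

$$\mathrm{Var}\left({z}_{t}\right)=\frac{\nu -2}{\nu}\mathrm{Var}\left({T}_{\nu}\right)=\frac{\nu -2}{\nu}\cdot \frac{\nu}{\nu -2}=1,\qquad E\left({z}_{t}\right)=\sqrt{\frac{\nu -2}{\nu}}E\left({T}_{\nu}\right)=0.$$

The requirement *ν* > 2 is what guarantees that the variance of $${T}_{\nu}$$ is finite, so the standardization is well defined.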

The *t* distribution is useful for modeling innovations that are more extreme than expected under a Gaussian distribution. Such innovation processes have *excess kurtosis*, that is, a heavier-tailed (and more peaked) distribution than a Gaussian. For *ν* > 4, the kurtosis (the standardized fourth central moment) of $${T}_{\nu}$$ is the same as the kurtosis of the standardized Student’s *t* variable ($${z}_{t}$$), because kurtosis is scale invariant.

**Tip**

It is good practice to assess the distributional properties of the residuals to determine whether a Gaussian innovation distribution (the default distribution) is appropriate for your model.
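
One way to act on this advice is to fit a Gaussian model and inspect the kurtosis of its residuals. The sketch below is illustrative only: the data are simulated from a hypothetical model with $${t}_{5}$$ innovations (theoretical kurtosis 3 + 6/(*ν* − 4) = 9), all coefficient values are made up, and `kurtosis` requires Statistics and Machine Learning Toolbox.

```matlab
% Sketch: check residual kurtosis after fitting a Gaussian model.
rng(2)                                   % fix the seed for reproducibility
T = 500;                                 % hypothetical sample size
X = randn(T,1);                          % hypothetical predictor series
% Hypothetical data-generating model with t(5) innovations
TrueMdl = regARIMA('AR',{0.5,-0.3},'Intercept',1,'Beta',2, ...
    'Variance',1,'Distribution',struct('Name','t','DoF',5));
y = simulate(TrueMdl,T,'X',X);

Mdl = regARIMA(2,0,0);                   % Gaussian innovations by default
EstMdl = estimate(Mdl,y,'X',X,'Display','off');
res = infer(EstMdl,y,'X',X);             % inferred innovation residuals
stdRes = res/sqrt(EstMdl.Variance);      % standardize by the estimated scale
k = kurtosis(stdRes)                     % Gaussian reference value is 3
```

A sample kurtosis well above 3 suggests that a *t* innovation distribution might describe the data better than the Gaussian default. Sample kurtosis is noisy for heavy-tailed data, so treat it as a rough diagnostic rather than a formal test.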

### Specify Innovation Distribution

`regARIMA` stores the distribution (and degrees of freedom for the *t* distribution) in the `Distribution` property. The data type of `Distribution` is a `struct` array with potentially two fields: `Name` and `DoF`.

- If the innovations are Gaussian, then the `Name` field is `Gaussian`, and there is no `DoF` field. `regARIMA` sets `Distribution` to `Gaussian` by default.
- If the innovations are *t*-distributed, then the `Name` field is `t` and the `DoF` field is `NaN` by default, or you can specify a scalar that is greater than 2.

To illustrate specifying the distribution, consider this regression model with AR(2) errors:

$$\begin{array}{rcl}{y}_{t}& =& c+{X}_{t}\beta +{u}_{t}\\ {u}_{t}& =& {\alpha}_{1}{u}_{t-1}+{\alpha}_{2}{u}_{t-2}+{\epsilon}_{t}\end{array}$$

```
Mdl = regARIMA(2,0,0);
Mdl.Distribution
```

```
ans = struct with fields:
    Name: "Gaussian"
```

By default, the `Distribution` property of `Mdl` is a `struct` array with the field `Name` having the value `Gaussian`.

If you want to specify a *t* innovation distribution, then you can either specify the model using the name-value pair argument `'Distribution','t'`, or use dot notation to modify an existing model.

Specify the model using the name-value pair argument.

```
Mdl = regARIMA('ARLags',1:2,'Distribution','t');
Mdl.Distribution
```

```
ans = struct with fields:
    Name: "t"
     DoF: NaN
```

If you use the name-value pair argument to specify the *t* innovation distribution, then the default degrees of freedom is `NaN`.

You can use dot notation to yield the same result.

```
Mdl = regARIMA(2,0,0);
Mdl.Distribution = 't'
```

```
Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,0) Error Model (t Distribution)"
    Distribution: Name = "t", DoF = NaN
       Intercept: NaN
            Beta: [1×0]
               P: 2
               Q: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
        Variance: NaN
```

If the innovation distribution is $${t}_{10}$$, then you can use dot notation to modify the `Distribution` property of the existing model `Mdl`. You cannot modify the individual fields of `Distribution` using dot notation; for example, `Mdl.Distribution.DoF = 10` is not a valid assignment. Instead, assign a complete structure to `Distribution`. You can, however, display the values of the fields using dot notation.

```
Mdl.Distribution = struct('Name','t','DoF',10)
```

```
Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,0) Error Model (t Distribution)"
    Distribution: Name = "t", DoF = 10
       Intercept: NaN
            Beta: [1×0]
               P: 2
               Q: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
        Variance: NaN
```

```
tDistributionDoF = Mdl.Distribution.DoF
```

```
tDistributionDoF = 10
```

Because the `DoF` field is not `NaN`, `estimate` treats it as an equality constraint, that is, it holds the degrees of freedom fixed at 10 when you estimate `Mdl`.
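
The difference between a `NaN` and a fixed `DoF` during estimation can be sketched as follows. This is an illustrative example under stated assumptions: the data are simulated from a hypothetical model with $${t}_{10}$$ innovations, and the coefficient values, sample size, and variable names (`TrueMdl`, `FreeMdl`, `FixedMdl`) are made up for the illustration.

```matlab
% Sketch: estimate DoF freely versus holding it fixed at 10.
rng(1)                                   % fix the seed for reproducibility
T = 500;                                 % hypothetical sample size
X = randn(T,1);                          % hypothetical predictor series
% Hypothetical data-generating model with t(10) innovations
TrueMdl = regARIMA('AR',{0.5,-0.3},'Intercept',1,'Beta',2, ...
    'Variance',1,'Distribution',struct('Name','t','DoF',10));
y = simulate(TrueMdl,T,'X',X);

% DoF = NaN: estimate treats the degrees of freedom as a free parameter
FreeMdl = regARIMA('ARLags',1:2,'Distribution','t');
% DoF = 10: estimate holds the degrees of freedom fixed
FixedMdl = regARIMA('ARLags',1:2, ...
    'Distribution',struct('Name','t','DoF',10));

EstFree = estimate(FreeMdl,y,'X',X,'Display','off');
EstFixed = estimate(FixedMdl,y,'X',X,'Display','off');

estimatedDoF = EstFree.Distribution.DoF  % estimated from the data
fixedDoF = EstFixed.Distribution.DoF     % remains at the specified value
```

Leaving `DoF` as `NaN` lets the data determine the tail thickness, while fixing it is useful when you have a prior value in mind or want to reduce the number of estimated parameters.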

Alternatively, you can specify the $${t}_{10}$$ innovation distribution using the name-value pair argument.

```
Mdl = regARIMA('ARLags',1:2,'Intercept',0,...
    'Distribution',struct('Name','t','DoF',10))
```

```
Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,0) Error Model (t Distribution)"
    Distribution: Name = "t", DoF = 10
       Intercept: 0
            Beta: [1×0]
               P: 2
               Q: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
        Variance: NaN
```

## References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. *Time Series Analysis: Forecasting and Control*. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Wold, H. *A Study in the Analysis of Stationary Time Series*. Uppsala, Sweden: Almqvist & Wiksell, 1938.

## Related Examples

- Analyze Time Series Data Using Econometric Modeler
- Create Regression Models with ARIMA Errors
- Specify the Default Regression Model with ARIMA Errors
- Create Regression Models with AR Errors
- Create Regression Models with MA Errors
- Create Regression Models with ARMA Errors
- Create Regression Models with SARIMA Errors