Vector Autoregression (VAR) Models

A vector autoregression (VAR) model is a multivariate time series model consisting of a system of n equations that express n distinct, stationary response variables as linear functions of lagged responses and other terms. A VAR model is also characterized by its degree p; each equation in a VAR(p) model contains p lags of every variable in the system.

VAR models belong to a class of multivariate linear time series models called vector autoregression moving average (VARMA) models. Although Econometrics Toolbox™ provides functionality to conduct a comprehensive analysis of a VAR(p) model (from model estimation to forecasting and simulation), the toolbox provides limited support for other models in the VARMA class.

In general, multivariate linear time series models are well suited for:

- Modeling the movements of several stationary time series simultaneously.
- Measuring the delayed effects among the response variables in the system.
- Measuring the effects of exogenous series on variables in the system. For example, you can determine whether a recently imposed tariff significantly affects several econometric series.
- Generating simultaneous forecasts of the response variables.

Types of Stationary Multivariate Time Series Models

This table contains forms of multivariate linear time series models and describes their supported functionality in Econometrics Toolbox.

Model	Abbreviation	Equation	Supported Functionality
Vector autoregression	VAR(p)	$y_{t} = c + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + ε_{t}$	Represent the model by using a `varm` object: Create a template for estimation or a fully specified model by using `varm`. Estimate any unknown parameters by using `estimate`. Work with a fully specified model by applying object functions. Obtain the coefficient matrices of a VAR model from the coefficient matrices of its VARMA(p,q) equivalent by using `arma2ar`. Given coefficient matrices, perform dynamic multiplier analysis by using `armairf` and `armafevd`.
Vector autoregression with a linear time trend	VAR(p)	$y_{t} = c + δ t + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + ε_{t}$	Represent the model by using a `varm` object. `estimate` and all other object functions support this model.
Vector autoregression with exogenous series	VARX(p)	$y_{t} = c + δ t + β x_{t} + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + ε_{t}$	Represent the model by using a `varm` object. `estimate` and all other object functions support this model.
Vector moving average	VMA(q)	$y_{t} = c + \sum_{k = 1}^{q} Θ_{k} ε_{t - k} + ε_{t}$	Obtain coefficient matrices of a VMA model from the coefficient matrices of its VARMA(p,q) equivalent by using `arma2ma`. Given coefficient matrices, perform dynamic multiplier analysis by using `armairf` and `armafevd`.
Vector autoregression moving average	VARMA(p, q)	$y_{t} = c + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + \sum_{k = 1}^{q} Θ_{k} ε_{t - k} + ε_{t}$	Obtain coefficient matrices of a VAR or VMA model from the coefficient matrices of its VARMA(p,q) equivalent by using `arma2ar` or `arma2ma`, respectively. Given coefficient matrices, perform dynamic multiplier analysis by using `armairf` and `armafevd`.
Structural vector autoregression moving average	SVARMA(p, q)	$Φ_{0} y_{t} = c + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + \sum_{k = 1}^{q} Θ_{k} ε_{t - k} + Θ_{0} ε_{t}$	Same support as for VARMA models

The following variables appear in the equations:

- y_t is the n-by-1 vector of distinct response time series variables at time t.
- c is an n-by-1 vector of constant offsets, one for each equation.
- Φ_j is an n-by-n matrix of AR coefficients, where j = 1,...,p and Φ_p is not a matrix containing only zeros.
- x_t is an m-by-1 vector of values corresponding to m exogenous variables or predictors. Exogenous variables are unmodeled inputs to the system that enter the equations alongside the lagged responses. Each exogenous variable appears in all response equations by default.
- β is an n-by-m matrix of regression coefficients. Row j contains the coefficients in the equation of response variable j, and column k contains the coefficients of exogenous variable k among all equations.
- δ is an n-by-1 vector of linear time-trend coefficients.
- ε_t is an n-by-1 vector of random Gaussian innovations. Each innovation has a mean of 0, and together they have the n-by-n covariance matrix Σ. For t ≠ s, ε_t and ε_s are independent.
- Θ_k is an n-by-n matrix of MA coefficients, where k = 1,...,q and Θ_q is not a matrix containing only zeros.
- Φ₀ and Θ₀ are the AR and MA structural coefficients, respectively.

Generally, the time series y_t and x_t are observable because you have data representing the series. The values of c, δ, β, and the autoregressive matrices Φ_j are not always known, and you typically want to fit these parameters to your data. See `estimate` for ways to estimate unknown parameters or to hold some of them fixed at specified values (equality constraints) during estimation. The innovations ε_t are not observable in data, but they are observable in simulations.
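To make the notation concrete, the following minimal Python sketch evaluates the right side of the VAR(p) equation with a linear time trend for n = 2 and p = 2. It is independent of Econometrics Toolbox; the helper names `matvec` and `var_step` and all numeric values are assumptions chosen for illustration.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def var_step(c, delta, t, Phi, lags, eps):
    """Return y_t = c + delta*t + sum_j Phi[j] * lags[j] + eps.

    Phi  : list of p coefficient matrices [Phi_1, ..., Phi_p]
    lags : list of p vectors [y_{t-1}, ..., y_{t-p}]
    """
    y = [ci + di * t + ei for ci, di, ei in zip(c, delta, eps)]
    for Phi_j, y_lag in zip(Phi, lags):
        y = [yi + zi for yi, zi in zip(y, matvec(Phi_j, y_lag))]
    return y


# Illustrative (assumed) parameter values for n = 2, p = 2.
c = [1.0, 0.5]
delta = [0.0, 0.0]                    # no time trend in this example
Phi = [[[0.5, 0.1], [0.0, 0.3]],      # Phi_1
       [[0.2, 0.0], [0.1, 0.1]]]      # Phi_2
lags = [[2.0, 1.0], [1.0, 3.0]]       # y_{t-1}, y_{t-2}
eps = [0.0, 0.0]                      # innovation set to zero

y_t = var_step(c, delta, 3, Phi, lags, eps)
print(y_t)
```

With the innovation set to zero, the result is the conditional mean of y_t given the two preceding observations.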

Lag Operator Representation

In the preceding table, the models are represented in difference-equation notation. Lag operator notation is an equivalent and more succinct representation of the multivariate linear time series equations.

The lag operator L reduces the time index by one unit: $L y_{t} = y_{t - 1}$. The operator $L^{j}$ reduces the time index by j units: $L^{j} y_{t} = y_{t - j}$.
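The action of the lag operator on a finite sample can be sketched in a few lines of Python. The function name `lag` and the choice to mark unavailable presample values with `None` are assumptions made for this illustration, not toolbox behavior.

```python
def lag(series, j=1):
    """Apply L^j to a series: element t of the result is series[t - j].

    The first j entries have no predecessor in the sample and are None.
    """
    if j < 0:
        raise ValueError("j must be nonnegative")
    n = len(series)
    k = min(j, n)
    return [None] * k + list(series[:n - k])


# A bivariate series observed at four time points (assumed values).
y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]

print(lag(y, 1))
print(lag(y, 2))
```

Applying `lag` twice with j = 1 gives the same result as applying it once with j = 2, which mirrors the identity L(L y_t) = L^2 y_t.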

In lag operator form, the equation for an SVARMAX(p, q) model is:

$(Φ_{0} - \sum_{j = 1}^{p} Φ_{j} L^{j}) y_{t} = c + β x_{t} + (Θ_{0} + \sum_{k = 1}^{q} Θ_{k} L^{k}) ε_{t} .$

The equation is expressed more succinctly in this form:

$Φ (L) y_{t} = c + β x_{t} + Θ (L) ε_{t},$

where

$Φ (L) = Φ_{0} - \sum_{j = 1}^{p} Φ_{j} L^{j}$

and

$Θ (L) = Θ_{0} + \sum_{k = 1}^{q} Θ_{k} L^{k} .$
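As a worked special case, consider a reduced-form VAR(2) model with no exogenous series, so that $Φ_{0} = I_{n}$, $Θ_{0} = I_{n}$, q = 0, and β = 0 (settings chosen here for illustration). The lag operator form then expands back to the difference equation:

$Φ (L) y_{t} = (I_{n} - Φ_{1} L - Φ_{2} L^{2}) y_{t} = y_{t} - Φ_{1} y_{t - 1} - Φ_{2} y_{t - 2} = c + ε_{t},$

which rearranges to $y_{t} = c + Φ_{1} y_{t - 1} + Φ_{2} y_{t - 2} + ε_{t}$, the VAR(p) form in the preceding table with p = 2.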

Stable and Invertible Models

A multivariate AR polynomial is stable if

$\det (I_{n} - Φ_{1} z - Φ_{2} z^{2} - \dots - Φ_{p} z^{p}) \neq 0 \text{ for } | z | \leq 1.$

With all innovations equal to zero, this condition implies that the VAR process converges to the steady-state value $(I_{n} - Φ_{1} - \dots - Φ_{p})^{-1} c$ as t approaches infinity (for more details, see [1], Ch. 2).
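For a VAR(1) model, the stability condition is equivalent to every eigenvalue of Φ_1 lying strictly inside the unit circle. The following Python sketch checks this for a bivariate model and iterates the zero-innovation recursion, which settles at the fixed point (I_n − Φ_1)^(-1) c. The function names and numeric values are assumptions for illustration; the closed-form eigenvalue and Cramer's rule formulas apply only to the 2-by-2 case.

```python
import cmath


def eigenvalues_2x2(A):
    """Eigenvalues of a 2-by-2 matrix from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def is_stable_var1(Phi):
    """True when all eigenvalues of Phi_1 have modulus below 1, which is
    equivalent to det(I - Phi_1 z) != 0 for |z| <= 1."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(Phi))


def steady_state(c, Phi):
    """Solve (I - Phi_1) mu = c for a 2-by-2 system by Cramer's rule."""
    a, b = 1.0 - Phi[0][0], -Phi[0][1]
    d, e = -Phi[1][0], 1.0 - Phi[1][1]
    det = a * e - b * d
    return [(c[0] * e - b * c[1]) / det, (a * c[1] - d * c[0]) / det]


# Illustrative (assumed) values.
c = [1.0, 0.5]
Phi = [[0.5, 0.1], [0.2, 0.3]]

print(is_stable_var1(Phi))

# Iterate y_t = c + Phi_1 y_{t-1} with all innovations set to zero.
y = [10.0, -10.0]
for _ in range(200):
    y = [c[i] + Phi[i][0] * y[0] + Phi[i][1] * y[1] for i in range(2)]

mu = steady_state(c, Phi)
print(y, mu)
```

For p > 1, the same check applies to the eigenvalues of the np-by-np companion matrix of the model, which requires a general eigenvalue routine rather than the 2-by-2 formula used here.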

A multivariate MA polynomial is invertible if

$\det (I_{n} + Θ_{1} z + Θ_{2} z^{2} + \dots + Θ_{q} z^{q}) \neq 0 \text{ for } | z | \leq 1.$

This condition implies that the pure VAR representation of the VMA process is stable (for more details, see [1], Ch. 11).
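To illustrate, a VMA(1) model y_t = ε_t + Θ_1 ε_{t-1} with c = 0 can be rewritten as an infinite-order VAR with coefficients Π_j = −(−Θ_1)^j. The Python sketch below computes a truncated set of these coefficients and shows that they decay when Θ_1 is invertible and grow otherwise. The function names and matrices are assumptions for illustration. The toolbox function `arma2ar` handles conversions of this kind for general models, and this sketch is not its implementation.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def vma1_to_var(Theta, num_lags):
    """Coefficients Pi_1, ..., Pi_num_lags of the truncated VAR form of
    y_t = eps_t + Theta eps_{t-1}, using Pi_j = -(-Theta)^j."""
    neg_theta = [[-x for x in row] for row in Theta]
    power = neg_theta                       # holds (-Theta)^j
    coeffs = []
    for _ in range(num_lags):
        coeffs.append([[-x for x in row] for row in power])
        power = matmul(power, neg_theta)
    return coeffs


def max_abs(A):
    """Largest absolute entry of a matrix."""
    return max(abs(x) for row in A for x in row)


# Assumed triangular examples, so the eigenvalues are the diagonal entries.
Theta_inv = [[0.5, 0.2], [0.0, 0.4]]    # eigenvalues 0.5 and 0.4
Theta_non = [[1.5, 0.2], [0.0, 0.4]]    # eigenvalue 1.5 is outside the unit circle

sizes_inv = [max_abs(P) for P in vma1_to_var(Theta_inv, 20)]
sizes_non = [max_abs(P) for P in vma1_to_var(Theta_non, 20)]
print(sizes_inv[0], sizes_inv[-1])
print(sizes_non[0], sizes_non[-1])
```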

A VARMA model is stable if its AR polynomial is stable. Similarly, a VARMA model is invertible if its MA polynomial is invertible.

Models with exogenous inputs (for example, VARMAX models) have no well-defined notion of stability or invertibility. An exogenous input can destabilize a model.

Models with Regression Component

Incorporate feedback from exogenous predictors, or study their linear associations with the response series, by including a regression component in a multivariate linear time series model. In order of increasing complexity, applications of such models include:

- Modeling the effects of an intervention, in which case the exogenous series is an indicator variable.
- Modeling the contemporaneous linear associations between a subset of exogenous series and each response. Applications include CAPM analysis and studying the effects of prices of items on their demand. These applications are examples of seemingly unrelated regression (SUR). For more details, see Implement Seemingly Unrelated Regression and Estimate Capital Asset Pricing Model Using SUR.
- Modeling the linear associations between contemporaneous and lagged exogenous series and the response as part of a distributed lag model. Applications include determining how a change in monetary growth affects real gross domestic product (GDP) and gross national income (GNI).
- Modeling any combination of SUR and distributed lag models that includes the lagged effects of responses; such models are also known as simultaneous equation models.

The general equation for a VARX(p) model is

$y_{t} = c + δ t + β x_{t} + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + ε_{t}$

where

- x_t is an m-by-1 vector of observations from m exogenous variables at time t. The vector x_t can contain lagged exogenous series.
- β is an n-by-m matrix of regression coefficients. Row j of β contains the regression coefficients in the equation of response series j for all exogenous variables. Column k of β contains the regression coefficients among the response series equations for exogenous variable k.

The following equation shows the system with the regression component expanded:

$\begin{bmatrix} y_{1, t} \\ y_{2, t} \\ \vdots \\ y_{n, t} \end{bmatrix} = c + δ t + \begin{bmatrix} x_{1, t} β (1, 1) + \dots + x_{m, t} β (1, m) \\ x_{1, t} β (2, 1) + \dots + x_{m, t} β (2, m) \\ \vdots \\ x_{1, t} β (n, 1) + \dots + x_{m, t} β (n, m) \end{bmatrix} + \sum_{j = 1}^{p} Φ_{j} y_{t - j} + ε_{t} .$
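The role of β can be illustrated with a small intervention example. The Python sketch below evaluates the conditional mean of a VARX(1) model with n = 2 responses and m = 2 exogenous variables, where the first exogenous variable is an indicator that switches from 0 to 1. The function names and all numeric values are assumptions for illustration. The zero in row 1, column 2 of β excludes the second exogenous variable from the first response equation.

```python
def matvec(A, v):
    """Multiply an n-by-m matrix (list of rows) by a length-m vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def varx1_mean(c, delta, beta, Phi1, t, x_t, y_prev):
    """Conditional mean of y_t in a VARX(1) model:
    c + delta*t + beta*x_t + Phi1*y_prev (innovation omitted)."""
    reg = matvec(beta, x_t)      # regression component, n-by-1
    ar = matvec(Phi1, y_prev)    # autoregressive component, n-by-1
    return [ci + di * t + ri + ai
            for ci, di, ri, ai in zip(c, delta, reg, ar)]


# Assumed values: n = 2 responses, m = 2 exogenous variables.
c = [0.0, 0.0]
delta = [0.0, 0.0]
Phi1 = [[0.5, 0.0], [0.0, 0.5]]
beta = [[2.0, 0.0],     # response 1 depends only on exogenous variable 1
        [1.0, 3.0]]     # response 2 depends on both exogenous variables
y_prev = [4.0, 2.0]

before = varx1_mean(c, delta, beta, Phi1, 1, [0.0, 1.0], y_prev)  # indicator off
after = varx1_mean(c, delta, beta, Phi1, 1, [1.0, 1.0], y_prev)   # indicator on
shift = [a - b for a, b in zip(after, before)]
print(before, after, shift)
```

Switching the indicator on shifts the conditional mean by column 1 of β, which is the contemporaneous effect of the intervention on each response.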

VAR Model Workflow

This workflow describes how to analyze multivariate time series by using Econometrics Toolbox VAR model functionality. If you believe the response series are cointegrated, use VEC model functionality instead (see `vecm`).

1. Load, preprocess, and partition the data set. For more details, see Format Multivariate Time Series Data.
2. Create a `varm` model object that characterizes a VAR model. A `varm` model object is a MATLAB® variable containing properties that describe the model, such as the AR polynomial degree p, the response dimensionality n, and coefficient values. `varm` must be able to infer n and p from your specifications; n and p are not estimable. You can update the lag structure of the AR polynomial after creating a VAR model, but you cannot change n.
   `varm` enables you to create these types of models:
   - Fully specified model in which all parameters, including coefficients and the innovations covariance matrix, are numeric values. Create this type of model when economic theory specifies the values of all parameters in the model, or you want to experiment with parameter settings. After creating a fully specified model, you can pass the model to all object functions except `estimate`.
   - Model template in which n and p are known values, but all coefficients and the innovations covariance matrix are unknown, estimable parameters. Properties corresponding to estimable parameters are composed of NaN values. Pass a model template and data to `estimate` to obtain an estimated (fully specified) VAR model. Then, you can pass the estimated model to any other object function.
   - Partially specified model template in which some parameters are known, and others are unknown and estimable. If you pass a partially specified model and data to `estimate`, MATLAB treats the known parameter values as equality constraints during optimization, and estimates the unknown values. A partially specified model is well suited to these tasks:
     - Remove lags from the model by setting the corresponding coefficients to zero.
     - Associate a subset of predictors with a response variable by setting to zero the regression coefficients of predictors you do not want in the response equation.
   For more details, see Create VAR Model.
3. For models with unknown, estimable parameters, fit the model to data. See VAR Model Estimation Overview and `estimate`.
4. Find an appropriate AR polynomial degree by iterating steps 2 and 3. See Select Appropriate Lag Order.
5. Analyze the fitted model. This step can involve:
   - Determining whether response series Granger-cause other response series in the system (see `gctest`).
   - Examining the stability of a fitted model (see VAR Model Estimation Overview).
   - Calculating impulse responses, which are forecasts based on an assumed change in an input to a time series.
   - Forecasting the responses by computing either minimum mean square error forecasts or Monte Carlo forecasts.
   - Comparing model forecasts to holdout data. For an example, see VAR Model Case Study.

Your application does not have to involve all the steps in this workflow, and you can iterate some of the steps. For example, you might not have any data, but want to simulate responses from a fully specified model.
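The simulation-only case can be sketched without any data. The following Python example simulates a fully specified bivariate VAR(1) model with Gaussian innovations, then compares minimum mean square error forecasts with Monte Carlo forecasts. It does not use the toolbox object functions; the function names, parameter values, seeds, and number of paths are all assumptions for illustration.

```python
import math
import random


def simulate_var1(c, Phi, Sigma, y0, num_obs, seed=0):
    """Simulate num_obs observations from a bivariate VAR(1) model
    with Gaussian innovations that have covariance matrix Sigma."""
    rng = random.Random(seed)
    # Cholesky factor of the 2-by-2 covariance matrix.
    l11 = math.sqrt(Sigma[0][0])
    l21 = Sigma[1][0] / l11
    l22 = math.sqrt(Sigma[1][1] - l21 * l21)
    path, y = [], list(y0)
    for _ in range(num_obs):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        eps = [l11 * z1, l21 * z1 + l22 * z2]
        y = [c[i] + Phi[i][0] * y[0] + Phi[i][1] * y[1] + eps[i]
             for i in range(2)]
        path.append(y)
    return path


def forecast_var1(c, Phi, y_last, horizon):
    """MMSE forecasts: iterate the model with future innovations
    replaced by their mean of zero."""
    out, y = [], list(y_last)
    for _ in range(horizon):
        y = [c[i] + Phi[i][0] * y[0] + Phi[i][1] * y[1] for i in range(2)]
        out.append(y)
    return out


# Fully specified model (assumed values).
c = [1.0, 0.5]
Phi = [[0.5, 0.1], [0.2, 0.3]]
Sigma = [[1.0, 0.3], [0.3, 0.5]]

path = simulate_var1(c, Phi, Sigma, [0.0, 0.0], 100, seed=1)
mmse = forecast_var1(c, Phi, path[-1], 5)

# Monte Carlo forecast: average the 5-step-ahead value over many paths.
num_paths = 2000
totals = [0.0, 0.0]
for s in range(num_paths):
    end = simulate_var1(c, Phi, Sigma, path[-1], 5, seed=1000 + s)[-1]
    totals = [totals[0] + end[0], totals[1] + end[1]]
monte_carlo = [v / num_paths for v in totals]

print(mmse[-1], monte_carlo)
```

The Monte Carlo average approaches the MMSE forecast as the number of simulated paths grows, because the model is linear and the innovations have mean zero.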

References

[1] Lütkepohl, H. New Introduction to Multiple Time Series Analysis. Berlin: Springer, 2005.