Vector Autoregression (VAR) Models
A vector autoregression (VAR) model is a multivariate time series model containing a system of n equations of n distinct, stationary response variables as linear functions of lagged responses and other terms. VAR models are also characterized by their degree p; each equation in a VAR(p) model contains p lags of all variables in the system.
VAR models belong to a class of multivariate linear time series models called vector autoregression moving average (VARMA) models. Although Econometrics Toolbox™ provides functionality to conduct a comprehensive analysis of a VAR(p) model (from model estimation to forecasting and simulation), the toolbox provides limited support for other models in the VARMA class.
In general, multivariate linear time series models are well suited for:
Modeling the movements of several stationary time series simultaneously.
Measuring the delayed effects among the response variables in the system.
Measuring the effects of exogenous series on variables in the system. For example, determine whether the presence of a recently imposed tariff significantly affects several econometric series.
Generating simultaneous forecasts of the response variables.
Types of Stationary Multivariate Time Series Models
This table contains forms of multivariate linear time series models and describes their supported functionality in Econometrics Toolbox.
| Model | Abbreviation | Equation | Supported Functionality |
| --- | --- | --- | --- |
| Vector autoregression | VAR(p) | $y_{t}=c+\sum_{j=1}^{p}\Phi_{j}y_{t-j}+\epsilon_{t}$ | Represent the model by using a varm model object |
| Vector autoregression with a linear time trend | VAR(p) | $y_{t}=c+\delta t+\sum_{j=1}^{p}\Phi_{j}y_{t-j}+\epsilon_{t}$ | Represent the model by using a varm model object |
| Vector autoregression with exogenous series | VARX(p) | $y_{t}=c+\delta t+\beta x_{t}+\sum_{j=1}^{p}\Phi_{j}y_{t-j}+\epsilon_{t}$ | Represent the model by using a varm model object |
| Vector moving average | VMA(q) | $y_{t}=c+\sum_{k=1}^{q}\Theta_{k}\epsilon_{t-k}+\epsilon_{t}$ | |
| Vector autoregression moving average | VARMA(p, q) | $y_{t}=c+\sum_{j=1}^{p}\Phi_{j}y_{t-j}+\sum_{k=1}^{q}\Theta_{k}\epsilon_{t-k}+\epsilon_{t}$ | |
| Structural vector autoregression moving average | SVARMA(p, q) | $\Phi_{0}y_{t}=c+\sum_{j=1}^{p}\Phi_{j}y_{t-j}+\sum_{k=1}^{q}\Theta_{k}\epsilon_{t-k}+\Theta_{0}\epsilon_{t}$ | Same support as for VARMA models |
The following variables appear in the equations:
y_{t} is the n-by-1 vector of distinct response time series variables at time t.
c is an n-by-1 vector of constant offsets in each equation.
Φ_{j} is an n-by-n matrix of AR coefficients, where j = 1,...,p and Φ_{p} is not a matrix containing only zeros.
x_{t} is an m-by-1 vector of values corresponding to m exogenous variables or predictors. In addition to the lagged responses, exogenous variables are unmodeled inputs to the system. Each exogenous variable appears in all response equations by default.
β is an n-by-m matrix of regression coefficients. Row j contains the coefficients in the equation of response variable j, and column k contains the coefficients of exogenous variable k among all equations.
δ is an n-by-1 vector of linear time-trend values.
ε_{t} is an n-by-1 vector of random Gaussian innovations, each with a mean of 0 and collectively an n-by-n covariance matrix Σ. For t ≠ s, ε_{t} and ε_{s} are independent.
Θ_{k} is an n-by-n matrix of MA coefficients, where k = 1,...,q and Θ_{q} is not a matrix containing only zeros.
Φ_{0} and Θ_{0} are the AR and MA structural coefficients, respectively.
Generally, the time series y_{t} and x_{t} are observable because you have data representing the series. The values of c, δ, β, and the autoregressive matrices Φ_{j} are not always known; you typically fit these parameters to your data. See estimate for ways to estimate unknown parameters and for how to hold some of them fixed to specific values (set equality constraints) during estimation. The innovations ε_{t} are not observable in data, but they are available in simulations.
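To make these definitions concrete, the following sketch simulates a bivariate VAR(1) process with made-up coefficient values. This is a NumPy illustration of the difference equation, not Econometrics Toolbox code:

```python
import numpy as np

# Bivariate VAR(1): y_t = c + Phi1 @ y_{t-1} + eps_t (illustrative values)
c = np.array([1.0, 0.5])                  # n-by-1 constant offsets
Phi1 = np.array([[0.5, 0.1],
                 [0.2, 0.3]])             # n-by-n AR coefficient matrix
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])            # n-by-n innovations covariance

rng = np.random.default_rng(0)
T, n = 500, 2
Y = np.zeros((T, n))
for t in range(1, T):
    eps = rng.multivariate_normal(np.zeros(n), Sigma)  # Gaussian innovations
    Y[t] = c + Phi1 @ Y[t - 1] + eps

# For a stable VAR(1), the process mean is (I - Phi1)^{-1} c
mu = np.linalg.solve(np.eye(n) - Phi1, c)
print(mu)                         # theoretical mean
print(Y[100:].mean(axis=0))       # sample mean, approaches mu for long series
```

Because the coefficient matrix here is stable, the simulated paths fluctuate around the process mean rather than drifting.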
Lag Operator Representation
In the preceding table, the models are represented in difference-equation notation. Lag operator notation is an equivalent and more succinct representation of the multivariate linear time series equations.
The lag operator L reduces the time index by one unit: Ly_{t} = y_{t–1}. The operator L^{j} reduces the time index by j units: L^{j}y_{t} = y_{t–j}.
In lag operator form, the equation for a SVARMAX(p, q) model is:
$$\left({\Phi}_{0}-{\displaystyle \sum _{j=1}^{p}{\Phi}_{j}{L}^{j}}\right){y}_{t}=c+\beta {x}_{t}+\left({\Theta}_{0}+{\displaystyle \sum _{k=1}^{q}{\Theta}_{k}{L}^{k}}\right){\epsilon}_{t}.$$
The equation is expressed more succinctly in this form:
$$\Phi (L){y}_{t}=c+\beta {x}_{t}+\Theta (L){\epsilon}_{t},$$
where
$$\Phi (L)={\Phi}_{0}-{\displaystyle \sum _{j=1}^{p}{\Phi}_{j}{L}^{j}}$$
and
$$\Theta (L)={\Theta}_{0}+{\displaystyle \sum _{k=1}^{q}{\Theta}_{k}{L}^{k}}.$$
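For example, for a nonstructural VAR(2) model with no exogenous series (Φ_{0} = Θ_{0} = I_{n} and q = 0), the lag operator form reduces to

$$\left({I}_{n}-{\Phi}_{1}L-{\Phi}_{2}{L}^{2}\right){y}_{t}=c+{\epsilon}_{t}.$$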
Stable and Invertible Models
A multivariate AR polynomial is stable if
$$\mathrm{det}\left({I}_{n}-{\Phi}_{1}z-{\Phi}_{2}{z}^{2}-\cdots -{\Phi}_{p}{z}^{p}\right)\ne 0\text{ for }\left|z\right|\le 1.$$
With all innovations equal to zero, this condition implies that the VAR process converges to its unconditional mean (I_{n} − Φ_{1} − ... − Φ_{p})^{−1}c as t approaches infinity (for more details, see [1], Ch. 2).
A multivariate MA polynomial is invertible if
$$\mathrm{det}\left({I}_{n}+{\Theta}_{1}z+{\Theta}_{2}{z}^{2}+\cdots +{\Theta}_{q}{z}^{q}\right)\ne 0\text{ for }\left|z\right|\le 1.$$
This condition implies that the pure VAR representation of the VMA process is stable (for more details, see [1], Ch. 11).
A VARMA model is stable if its AR polynomial is stable. Similarly, a VARMA model is invertible if its MA polynomial is invertible.
Models with exogenous inputs (for example, VARMAX models) have no well-defined notion of stability or invertibility. An exogenous input can destabilize a model.
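One numerical way to check the stability condition: all eigenvalues of the VAR(p) companion matrix must lie strictly inside the unit circle, which is equivalent to the determinant condition above. A NumPy sketch with illustrative coefficients (not toolbox code):

```python
import numpy as np

def is_stable(Phi_list):
    """Check VAR stability via the companion matrix: all eigenvalues must
    lie strictly inside the unit circle, which is equivalent to
    det(I - Phi_1 z - ... - Phi_p z^p) != 0 for |z| <= 1."""
    p = len(Phi_list)
    n = Phi_list[0].shape[0]
    companion = np.zeros((n * p, n * p))
    companion[:n, :] = np.hstack(Phi_list)    # top block row: [Phi_1 ... Phi_p]
    companion[n:, :-n] = np.eye(n * (p - 1))  # subdiagonal identity blocks
    return bool(np.all(np.abs(np.linalg.eigvals(companion)) < 1))

# Stable VAR(1): eigenvalues of Phi_1 are inside the unit circle
print(is_stable([np.array([[0.5, 0.1], [0.2, 0.3]])]))  # True
# Unstable VAR(1): a multivariate random walk has unit roots
print(is_stable([np.eye(2)]))                            # False
```

The companion form stacks p lags into one first-order system, so the scalar intuition (AR coefficients inside the unit circle) carries over to any lag order.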
Models with Regression Component
Incorporate feedback from exogenous predictors, or study their linear associations with the response series, by including a regression component in a multivariate linear time series model. In order of increasing complexity, applications that use such models include:
Modeling the effects of an intervention, which implies that the exogenous series is an indicator variable.
Modeling the contemporaneous linear associations between a subset of exogenous series to each response. Applications include CAPM analysis and studying the effects of prices of items on their demand. These applications are examples of seemingly unrelated regression (SUR). For more details, see Implement Seemingly Unrelated Regression and Estimate Capital Asset Pricing Model Using SUR.
Modeling the linear associations between contemporaneous and lagged exogenous series and the response as part of a distributed lag model. Applications include determining how a change in monetary growth affects real gross domestic product (GDP) and gross national income (GNI).
Any combination of SUR and the distributed lag model that includes the lagged effects of responses, also known as simultaneous equation models.
The general equation for a VARX(p) model is
$${y}_{t}=c+\delta t+\beta {x}_{t}+{\displaystyle \sum _{j=1}^{p}{\Phi}_{j}{y}_{t-j}}+{\epsilon}_{t}$$
where
x_{t} is an m-by-1 vector of observations from m exogenous variables at time t. The vector x_{t} can contain lagged exogenous series.
β is an n-by-m matrix of regression coefficients. Row j of β contains the regression coefficients in the equation of response series j for all exogenous variables. Column k of β contains the regression coefficients among the response series equations for exogenous variable k. This figure shows the system with an expanded regression component:
$$\left[\begin{array}{c}{y}_{1,t}\\ {y}_{2,t}\\ \vdots \\ {y}_{n,t}\end{array}\right]=c+\delta t+\left[\begin{array}{c}{x}_{1,t}\beta (1,1)+\cdots +{x}_{m,t}\beta (1,m)\\ {x}_{1,t}\beta (2,1)+\cdots +{x}_{m,t}\beta (2,m)\\ \vdots \\ {x}_{1,t}\beta (n,1)+\cdots +{x}_{m,t}\beta (n,m)\end{array}\right]+{\displaystyle \sum _{j=1}^{p}{\Phi}_{j}{y}_{t-j}}+{\epsilon}_{t}.$$
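The expanded regression component is an ordinary matrix-vector product. A NumPy sketch with made-up dimensions (n = 2 responses, m = 3 exogenous variables; not toolbox code) verifies the element-wise expansion shown above:

```python
import numpy as np

n, m = 2, 3
rng = np.random.default_rng(1)
Beta = rng.normal(size=(n, m))   # n-by-m regression coefficients
x_t = rng.normal(size=m)         # m exogenous observations at time t

# Row j of Beta @ x_t is x_{1,t}*beta(j,1) + ... + x_{m,t}*beta(j,m),
# i.e. the regression term in the equation of response series j.
reg = Beta @ x_t
manual = np.array([sum(x_t[k] * Beta[j, k] for k in range(m))
                   for j in range(n)])
print(np.allclose(reg, manual))  # True
```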
VAR Model Workflow
This workflow describes how to analyze multivariate time series by using Econometrics Toolbox VAR model functionality. If you believe the response series are cointegrated, use VEC model functionality instead (see vecm).
1. Load, preprocess, and partition the data set. For more details, see Multivariate Time Series Data Formats.
2. Create a varm model object that characterizes a VAR model. A varm model object is a MATLAB® variable containing properties that describe the model, such as the AR polynomial degree p, the response dimensionality n, and coefficient values. varm must be able to infer n and p from your specifications; n and p are not estimable. You can update the lag structure of the AR polynomial after creating a VAR model, but you cannot change n. varm enables you to create these types of models:
   - Fully specified model, in which all parameters, including coefficients and the innovations covariance matrix, are numeric values. Create this type of model when economic theory specifies the values of all parameters in the model, or when you want to experiment with parameter settings. After creating a fully specified model, you can pass the model to all object functions except estimate.
   - Model template, in which n and p are known values, but all coefficients and the innovations covariance matrix are unknown, estimable parameters. Properties corresponding to estimable parameters are composed of NaN values. Pass a model template and data to estimate to obtain an estimated (fully specified) VAR model. Then, you can pass the estimated model to any other object function.
   - Partially specified model template, in which some parameters are known, and others are unknown and estimable. If you pass a partially specified model and data to estimate, MATLAB treats the known parameter values as equality constraints during optimization and estimates the unknown values. A partially specified model is well suited to these tasks:
     - Removing lags from the model by setting the corresponding coefficients to zero.
     - Associating a subset of predictors with a response variable by setting to zero the regression coefficients of predictors you do not want in the response equation.
   For more details, see Create VAR Model.
3. For models with unknown, estimable parameters, fit the model to data. See Fitting Models to Data and estimate.
4. Find an appropriate AR polynomial degree by iterating steps 2 and 3. See Select Appropriate Lag Order.
5. Analyze the fitted model. This step can involve:
   - Determining whether response series Granger-cause other response series in the system (see gctest).
   - Calculating impulse responses, which are forecasts based on an assumed change in an input to a time series.
   - Forecasting from the VAR model by obtaining either minimum mean square error forecasts or Monte Carlo forecasts.
   - Comparing model forecasts to holdout data. For an example, see VAR Model Case Study.
Your application does not have to involve all the steps in this workflow, and you can iterate some of the steps. For example, you might not have any data, but want to simulate responses from a fully specified model.
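The workflow above operates on varm objects in MATLAB. For intuition only, here is a bare-bones NumPy sketch of the core of the estimation and forecasting steps for a VAR(1): each equation is fit by least squares on a constant and the lagged responses, and the one-step minimum mean square error forecast is the conditional mean. All data and coefficients are simulated for illustration; this is not toolbox code:

```python
import numpy as np

rng = np.random.default_rng(2)
c_true = np.array([1.0, 0.5])
Phi_true = np.array([[0.5, 0.1], [0.2, 0.3]])

# Simulate data from a known stable VAR(1)
T, n = 2000, 2
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = c_true + Phi_true @ Y[t - 1] + rng.normal(size=n)

# Estimation: regress y_t on [1, y_{t-1}] equation by equation
X = np.hstack([np.ones((T - 1, 1)), Y[:-1]])      # (T-1)-by-(1+n) design
coef, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)  # (1+n)-by-n
c_hat, Phi_hat = coef[0], coef[1:].T              # recovered c and Phi_1

# Forecasting: one-step MMSE forecast is the conditional mean
y_forecast = c_hat + Phi_hat @ Y[-1]
print(np.round(Phi_hat, 2))   # close to Phi_true for a long sample
```

Multi-step forecasts follow by iterating the fitted difference equation; Monte Carlo forecasts instead draw innovations and simulate forward many times.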
References
[1] Lütkepohl, H. New Introduction to Multiple Time Series Analysis. Berlin: Springer, 2005.
Related Topics
 Multivariate Time Series Data Formats
 Vector Autoregression (VAR) Model Creation
 VAR Model Estimation
 Fit VAR Model to Simulated Data
 Fit VAR Model of CPI and Unemployment Rate
 Estimate Capital Asset Pricing Model Using SUR
 VAR Model Forecasting, Simulation, and Analysis
 Forecast VAR Model
 Forecast VAR Model Using Monte Carlo Simulation
 Simulate Responses of Estimated VARX Model
 VAR Model Case Study