# egcitest

Engle-Granger cointegration test

## Syntax

``h = egcitest(Y)``
``[h,pValue,stat,cValue] = egcitest(Y)``
``StatTbl = egcitest(Tbl)``
``[___] = egcitest(___,Name=Value)``
``[___,reg1,reg2] = egcitest(___)``

## Description

`h = egcitest(Y)` returns the rejection decision `h` of the Engle-Granger cointegration test, which assesses the null hypothesis of no cointegration among the variables in the multivariate time series `Y`. `egcitest` forms test statistics by regressing the response data `Y(:,1)` onto the predictor data `Y(:,2:end)`.

`[h,pValue,stat,cValue] = egcitest(Y)` also returns the p-value `pValue`, test statistic `stat`, and critical value `cValue` of the test.

`StatTbl = egcitest(Tbl)` returns the table `StatTbl` containing variables for the test results, statistics, and settings from conducting the Engle-Granger cointegration test on the variables of the table or timetable `Tbl`. The response variable in the regression is the first table variable, and all other variables are the predictor variables. To select a different response variable for the regression, use the `ResponseVariable` name-value argument. To select different predictor variables, use the `PredictorVariables` name-value argument.

`[___] = egcitest(___,Name=Value)` uses additional options specified by one or more name-value arguments, using any input-argument combination in the previous syntaxes. `egcitest` returns the output-argument combination for the corresponding input arguments.

Some options control the number of tests to conduct. The following conditions apply when `egcitest` conducts multiple tests:

• `egcitest` treats each test as separate from all other tests.

• If you specify `Y`, all outputs are vectors.

• If you specify `Tbl`, each row of `StatTbl` contains the results of the corresponding test.

For example, `egcitest(Tbl,ResponseVariable="GDP",Alpha=0.025,Lags=[0 1])` chooses `GDP` as the response variable from the table `Tbl` and conducts two tests at a level of significance of 0.025. The first test includes no lags in the residual regression, and the second test includes one lag in the residual regression.
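As a minimal sketch of the multiple-test behavior with matrix input, the following call runs two tests that differ only in the number of lags. It assumes the `Data_Canada` data set used in the examples on this page and the `Name=Value` syntax (R2021a or later); the variable `numTests` is introduced here for illustration only.

```matlab
% Sketch: two tests on the Canadian interest rate series, one per element of Lags.
load Data_Canada
Y = Data(:,3:end);      % columns: INT_S (response), INT_M and INT_L (predictors)

% Alpha is a scalar, so it applies to both tests.
% Lags is a two-element vector, so egcitest conducts two tests.
[h,pValue,stat,cValue] = egcitest(Y,Alpha=0.025,Lags=[0 1]);

% Each output contains one element per test.
numTests = numel(h)
```

Because the tests are treated as separate, element `k` of each output describes test `k` only.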

`[___,reg1,reg2] = egcitest(___)` additionally returns the following structures of regression statistics, which are required to form the test statistic:

• `reg1` – Cointegrating regression statistics

• `reg2` – Residual regression statistics
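A short sketch of how the two structures might be inspected after a test on matrix data. It assumes the `Data_Canada` data set used in the examples on this page; the variable names `names1`, `coeff1`, and `res1` are illustrative, and the field names come from the field list in the Output Arguments section.

```matlab
% Sketch: request the regression structures as the fifth and sixth outputs.
load Data_Canada
Y = Data(:,3:end);                  % INT_S (response), INT_M and INT_L (predictors)

[h,~,~,~,reg1,reg2] = egcitest(Y);

% Cointegrating regression: coefficient names and estimates
% (with the default form, a constant followed by the two predictor coefficients).
names1 = reg1.names(:);
coeff1 = reg1.coeff(:);

% Residuals of the cointegrating regression, which the residual regression tests.
res1 = reg1.res;

% Residual regression: effective sample size after lagging and differencing.
reg2.size
```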

## Examples

Test a multivariate time series for cointegration using the default values of the Engle-Granger cointegration test. Input the time series data as a numeric matrix.

Load data of Canadian inflation and interest rates `Data_Canada.mat`, which contains the series in the matrix `Data`.

```
load Data_Canada
series'
```
```
ans = 5x1 cell
    {'(INF_C) Inflation rate (CPI-based)'         }
    {'(INF_G) Inflation rate (GDP deflator-based)'}
    {'(INT_S) Interest rate (short-term)'         }
    {'(INT_M) Interest rate (medium-term)'        }
    {'(INT_L) Interest rate (long-term)'          }
```

Test the interest rate series for cointegration by using the Engle-Granger cointegration test. Use default options and return the rejection decision.

`h = egcitest(Data(:,3:end))`
```
h = logical
   0
```

`egcitest` uses the $\tau$ test by default, and it fails to reject the null hypothesis (`h = 0`) of no cointegration among the interest rate series.

Load data of Canadian inflation and interest rates `Data_Canada.mat`.

`load Data_Canada`

Test the interest rate series for cointegration by using the Engle-Granger cointegration test. Use default options and return the rejection decision, $\mathit{p}$-value, $\tau$-test statistic, and critical value.

`[h,pValue,stat,cValue] = egcitest(Data(:,3:end))`
```
h = logical
   0

pValue = 0.0526
stat = -3.9321
cValue = -3.9563
```

Conduct the Engle-Granger cointegration test on a multivariate time series using default options, which use the first table variable as the response and all other table variables as predictors, and include a constant term in the cointegrating regression. Return a table of test results.

Load data of Canadian inflation and interest rates `Data_Canada.mat`. Convert the table `DataTable` to a timetable.

```
load Data_Canada
dates = datetime(dates,12,31);                    % year-end dates
TT = table2timetable(DataTable,RowTimes=dates);
TT.Observations = [];                             % drop the observation labels
```

Conduct the Engle-Granger cointegration test by passing the timetable to `egcitest` and using default options. For the cointegrating regression, `egcitest` uses the CPI-based inflation rate as the response variable and all other variables in the timetable as predictors.

`StatTbl = egcitest(TT)`
```
StatTbl=1×9 table
                h       pValue       stat      cValue     Lags    Alpha     Test     CReg      RReg  
              _____    _________    _______    _______    ____    _____    ______    _____    _______

    Test 1    true     0.0023851    -6.2491    -4.7673     0      0.05     {'t1'}    {'c'}    {'ADF'}
```

`StatTbl` is a table of test results. Each row corresponds to a test, and the columns contain the rejection decision, the corresponding $\mathit{p}$-value, the test statistic and critical value, and the specified test options. In this case, the test rejects the null hypothesis in favor of the alternative of cointegration among all the table variables.

By default, `egcitest` includes all input table variables in the cointegration test. To select a response variable for the cointegrating regression, set the `ResponseVariable` option. To select predictor variables, set the `PredictorVariables` option.

Load data of Canadian inflation and interest rates `Data_Canada.mat`. Convert the table `DataTable` to a timetable of the interest rate series only.

```
load Data_Canada
dates = datetime(dates,12,31);                                   % year-end dates
idxINT = contains(DataTable.Properties.VariableNames,"INT");     % interest rate variables
TT = table2timetable(DataTable(:,idxINT),RowTimes=dates);
TT.Observations = [];                                            % drop the observation labels
```

Plot the interest rate series.

```
figure
plot(TT.Time,TT.Variables)
legend(series(idxINT),Location="northwest")
grid on
```

Reproduce row 1 of Table II in [3] by testing for cointegration, specifying the default variable assignments for the cointegrating regression and deterministic terms (response variable ${\mathit{y}}_{1}$ is `INT_S`, the other interest rates ${\mathit{y}}_{2}$ and ${\mathit{y}}_{3}$ are predictors, and the model has a constant $\mathit{c}$), and specifying the $\tau$ and $\mathit{z}$ tests. Return the cointegrating regression statistics.

```
[StatTbl,reg] = egcitest(TT,Test=["t1" "t2"]);
StatTbl
```
```
StatTbl=2×9 table
                h       pValue      stat      cValue     Lags    Alpha     Test     CReg      RReg  
              _____    ________    _______    _______    ____    _____    ______    _____    _______

    Test 1    false    0.052627    -3.9321    -3.9563     0      0.05     {'t1'}    {'c'}    {'ADF'}
    Test 2    true     0.020157    -25.454    -22.115     0      0.05     {'t2'}    {'c'}    {'ADF'}
```

The $\tau$ test (`Test 1`) fails to reject the null hypothesis, but the $\mathit{z}$ test (`Test 2`) rejects the null hypothesis in favor of the presence of cointegration.

Using the regression statistics from the $\mathit{z}$ test, plot the estimated cointegrating relation ${\mathit{y}}_{1}-\left[\begin{array}{cc}{\mathit{y}}_{2}& {\mathit{y}}_{3}\end{array}\right]\left[\begin{array}{c}{\mathit{b}}_{1}\\ {\mathit{b}}_{2}\end{array}\right]-\mathit{Xa}$, where $\mathit{Xa}=\mathit{c}$.

```
c = reg(2).coeff(1);      % estimated constant from the z test regression
b = reg(2).coeff(2:3);    % estimated predictor coefficients
figure
plot(TT.Time,TT.Variables*[1; -b] - c)
grid on
```

## Input Arguments

`Y` – Data representing observations of a multivariate time series $y_t$, specified as a `numObs`-by-`numDims` numeric matrix. Each column of `Y` corresponds to a variable, and each row corresponds to an observation. The test regresses the response variable `Y(:,1)` on the predictor variables `Y(:,2:end)`.

Data Types: `double`

`Tbl` – Data representing observations of a multivariate time series $y_t$, specified as a table or timetable with `numObs` rows. Each row of `Tbl` is an observation.

The test regresses the response variable, which is the first variable in `Tbl`, on the predictor variables, which are all other variables in `Tbl`. To select a different response variable for the regression, use the `ResponseVariable` name-value argument. To select different predictor variables, use the `PredictorVariables` name-value argument. The selected variables must be numeric.

Note

`egcitest` removes, from the specified data, all observations (rows) containing at least one missing value, represented by a `NaN` value.
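The effect of this rule can be sketched on a small made-up matrix. This is not the function's internal code, only an illustration of listwise deletion; the data and the names `hasMissing` and `Yclean` are hypothetical.

```matlab
% Hypothetical data: 5 observations of 2 variables, with NaN values in rows 2 and 4.
Y = [1.0 2.0
     NaN 2.1
     1.2 2.3
     1.3 NaN
     1.5 2.6];

% Flag every row that contains at least one NaN, then keep the remaining rows.
hasMissing = any(isnan(Y),2);
Yclean = Y(~hasMissing,:);

% Number of observations that remain for the test.
numObs = size(Yclean,1)
```

Here three of the five rows remain, so a single missing value in one series removes that time point from every series.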

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `egcitest(Tbl,ResponseVariable="GDP",Alpha=0.025,Lags=[0 1])` chooses `GDP` as the response variable from the table `Tbl` and conducts two tests at a level of significance of 0.025. The first test includes no lags in the residual regression, and the second test includes one lag in the residual regression.

`CReg` – Cointegrating regression form, specified as the name of a form, or a string vector or cell vector of form names.

In general, the cointegrating regression is

$y_1 = Xa + Y_2 b + \epsilon$

where $y_1$ is the response variable, $Y_2$ contains the predictor variables, and $X$ is a design matrix for the optional deterministic terms with coefficients $a$, which can include a constant, linear time trend, and quadratic time trend. This table contains the supported forms and their names.

| Form Name | Description |
| --- | --- |
| `"nc"` | The regression does not include $X$; no constant or trends. |
| `"c"` | $X$ contains a variable for the constant, but not for the trends. |
| `"ct"` | $X$ contains variables for the constant and the linear time trend. |
| `"ctt"` | $X$ contains variables for the constant, linear time trend, and quadratic time trend. |

`egcitest` conducts a separate test for each form name in `CReg`.

Example: `CReg=["ct" "ctt"]` includes a constant and linear time trend terms in the cointegrating regression for the first test, and then includes all three deterministic terms in the cointegrating regression for the second test.
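Written out term by term, the four forms correspond to the following regressions. The symbols $a_0$, $a_1$, $a_2$ (the elements of $a$) and the time index $t$ are notation assumed here for illustration; they do not appear in the function interface.

$$
\begin{aligned}
\texttt{"nc"}: &\quad y_1 = Y_2 b + \epsilon \\
\texttt{"c"}: &\quad y_1 = a_0 + Y_2 b + \epsilon \\
\texttt{"ct"}: &\quad y_1 = a_0 + a_1 t + Y_2 b + \epsilon \\
\texttt{"ctt"}: &\quad y_1 = a_0 + a_1 t + a_2 t^2 + Y_2 b + \epsilon
\end{aligned}
$$

In each case $Xa$ is the sum of the deterministic terms, so $a$ has 0, 1, 2, or 3 elements, respectively.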

Data Types: `char` | `string` | `cell`

`CVec` – Cointegrating-regression coefficient equality constraints, specified as a numeric vector `[a; b]` or a cell vector of such numeric vectors.

`a` contains the equality constraints of the deterministic terms in the cointegrating regression. The length of `a` is 0, 1, 2, or 3, depending on the corresponding value of the `CReg` name-value argument. The order of the coefficients in `a` is constant, linear trend, and quadratic trend.

`b` contains the `numDims` − 1 equality constraints for the coefficients of the predictor variables in $Y_2$, one per predictor.

Specify `NaN` entries to estimate the corresponding coefficient in the regression.

When `CVec` is completely specified (does not contain any `NaN` values), `egcitest` does not perform the cointegrating regression.

By default, `CVec` is a completely unspecified cointegrating vector (completely composed of `NaN` values). Consequently, `egcitest` estimates all coefficients.

`egcitest` conducts a separate test for each set of equality constraints in `CVec`.

Example: `egcitest(Tbl,CVec=[2 NaN NaN])` fixes the constant in the cointegrating regression to `2` and estimates the coefficients of the two predictor variables in `Tbl`.

Example: `egcitest(Tbl,CVec={[2 NaN NaN]; nan(3,1)})`, for the first test, fixes the constant in the cointegrating regression to `2` and estimates the coefficients of the two predictor variables in `Tbl`, and for the second test, estimates all coefficients.

Example: `egcitest(Tbl,CReg="ctt",CVec=[2 0.5 0.25 NaN NaN])` fixes the constant to `2`, the linear trend to `0.5`, and the quadratic trend to `0.25`, and estimates the coefficients of the two predictor variables in `Tbl`.
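The following sketch shows how the length of the constraint vector follows from the regression form and the number of predictors. It assumes the `Data_Canada` data set used in the examples on this page; the constraint value `1` on the second coefficient is hypothetical, chosen only to illustrate a partially specified vector, and the names `numDet`, `numPreds`, and `cvec` are illustrative.

```matlab
% Sketch: build a partially specified [a; b] vector for CReg="c".
load Data_Canada
Y = Data(:,3:end);             % INT_S (response), INT_M and INT_L (predictors)

numDet   = 1;                  % CReg="c" has one deterministic term (the constant)
numPreds = size(Y,2) - 1;      % one coefficient per predictor variable

% Hypothetical constraint: fix the INT_M coefficient at 1 and estimate
% the constant and the INT_L coefficient (NaN means "estimate").
cvec = [NaN; 1; NaN];          % [a; b], length numDet + numPreds

[h,pValue] = egcitest(Y,CReg="c",CVec=cvec);
```

A vector whose length does not match the form in `CReg` and the number of predictors does not describe a valid set of constraints, so it can help to compute the expected length before constructing the vector.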

Data Types: `double` | `cell`

`RReg` – Residual regression form, specified as the name of a form, or a string vector or cell vector of form names.

| Form Name | Description |
| --- | --- |
| `"adf"` | Augmented Dickey-Fuller test (`adftest`) of residuals from the cointegrating regression |
| `"pp"` | Phillips-Perron test (`pptest`) of residuals from the cointegrating regression |

`egcitest` computes test statistics by calling `adftest` and `pptest` with the setting `Model="AR"`. This setting requires residuals from appropriately demeaned and detrended data, which is specified by the cointegrating-regression form `CReg`.

`egcitest` conducts a separate test for each form name in `RReg`.

Example: `RReg=["adf" "pp"]` performs the augmented Dickey-Fuller test for the residual regression of the first test, and then performs the Phillips-Perron test for the residual regression of the second test.

Data Types: `char` | `string` | `cell`

`Lags` – Number of lags in the residual regression, specified as a nonnegative integer or vector of nonnegative integers. The meaning of `Lags` depends on the value of the `RReg` name-value argument. For more details, see the `Lags` argument of the `adftest` and `pptest` functions.

`egcitest` conducts a separate test for each element in `Lags`.

Example: `Lags=[0 1]` includes no lags in the residual regression for the first test, and then includes one lag for the residual regression for the second test.

Data Types: `double`

`Test` – Test statistic type from the residual regression, specified as a test name, or a string vector or cell vector of test names. This table contains the supported test names.

| Test Name | Description |
| --- | --- |
| `"t1"` | τ test |
| `"t2"` | z test |

For more details, see the `Test` argument of the `adftest` and `pptest` functions.

`egcitest` conducts a separate test for each element in `Test`.

Example: `Test=["t1" "t2"]` computes the τ test from the residual regression for the first test, and then computes the z test from the residual regression for the second test.

Data Types: `char` | `cell` | `string`

`Alpha` – Nominal significance level for the hypothesis test, specified as a numeric scalar between `0.001` and `0.999` or a numeric vector of such values.

`egcitest` conducts a separate test for each value in `Alpha`.

Example: `Alpha=[0.01 0.05]` uses a level of significance of `0.01` for the first test, and then uses a level of significance of `0.05` for the second test.

Data Types: `double`

`ResponseVariable` – Variable in `Tbl` to use for the response in the cointegrating regression, specified as a string vector or cell vector of character vectors containing variable names in `Tbl.Properties.VariableNames`, or an integer or logical vector representing the indices of names. The selected variables must be numeric.

`egcitest` uses the same specified response variable for all tests.

Example: `ResponseVariable="GDP"`

Data Types: `double` | `logical` | `char` | `cell` | `string`

`PredictorVariables` – Variables in `Tbl` to use for the predictors in the cointegrating regression, specified as a string vector or cell vector of character vectors containing variable names in `Tbl.Properties.VariableNames`, or an integer or logical vector representing the indices of names. The selected variables must be numeric.

`egcitest` uses the same specified predictors for all tests.

By default, `egcitest` uses all variables in `Tbl` that are not specified by the `ResponseVariable` name-value argument.

Example: `PredictorVariables=["UN" "CPI"]`

Example: `PredictorVariables=[true true false false]` or `PredictorVariables=[1 2]` selects the first and second table variables.

Data Types: `double` | `logical` | `char` | `cell` | `string`

Note

• When `egcitest` conducts multiple tests, the function applies all single settings (scalars or character vectors) to each test.

• All vector-valued specifications that control the number of tests must have equal length.

• If you specify the matrix `Y` and any value is a row vector, all outputs are row vectors.

• A lagged and differenced time series has a reduced sample size. Absent presample values, if the test series $y_t$ is defined for $t = 1,\ldots,T$, the lagged series $y_{t-k}$ is defined for $t = k+1,\ldots,T$. The first difference applied to the lagged series $y_{t-k}$ further reduces the time base to $k+2,\ldots,T$. With $p$ lagged differences, the common time base is $p+2,\ldots,T$ and the effective sample size is $T-(p+1)$.
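The sample-size arithmetic in the last item can be checked with a few lines. The values of `T` and `p` are hypothetical, and the names `tBase` and `effSize` are illustrative.

```matlab
% Sketch: effective sample size with p lagged differences and no presample values.
T = 100;                 % hypothetical number of observations
p = 2;                   % hypothetical number of lagged differences

tBase   = (p+2):T;       % common time base: p+2,...,T
effSize = numel(tBase);  % number of usable observations

% The count matches the closed form T-(p+1).
isequal(effSize,T-(p+1))
```

With these values the time base runs from 4 to 100, which leaves 97 usable observations.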

## Output Arguments

`h` – Test rejection decisions, returned as a logical scalar or vector with length equal to the number of tests. `egcitest` returns `h` when you supply the input `Y`.

• Values of `1` indicate rejection of the null hypothesis in favor of the alternative of cointegration.

• Values of `0` indicate failure to reject the null hypothesis.

`pValue` – Test statistic p-values, returned as a numeric scalar or vector with length equal to the number of tests. `egcitest` returns `pValue` when you supply the input `Y`.

The p-values are left-tailed probabilities.

`stat` – Test statistics, returned as a numeric scalar or vector with length equal to the number of tests. `egcitest` returns `stat` when you supply the input `Y`.

The `RReg` and `Test` settings of a particular test determine the test statistic. For more details, see `adftest` and `pptest`.

`cValue` – Critical values, returned as a numeric scalar or vector with length equal to the number of tests. `egcitest` returns `cValue` when you supply the input `Y`. The critical values are for left-tailed probabilities.

Because `egcitest` estimates the residuals (that is, residuals are unobserved), critical values are different from those used in `adftest` or `pptest` (unless the cointegrating vector is completely specified by the `CVec` setting). `egcitest` loads tables of critical values from the file `Data_EGCITest.mat`, and then linearly interpolates test critical values from the tables. Critical values in the tables derive from methods described in [3].
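Because the test is left-tailed, the rejection decision can be read from either the statistic or the p-value. The sketch below applies both comparisons to the values reported for the interest rate series in the Examples section; the names `alpha`, `rejectByStat`, and `rejectByPValue` are illustrative.

```matlab
% Values reported in the Examples section for the tau test on the interest rates.
stat   = -3.9321;
cValue = -3.9563;
pValue = 0.0526;
alpha  = 0.05;                       % significance level shown in the StatTbl output

% Left-tailed test: reject when the statistic falls below the critical value,
% or equivalently when the p-value falls below the significance level.
rejectByStat   = stat < cValue
rejectByPValue = pValue < alpha
```

Both comparisons are false, which agrees with the reported decision `h = 0`: the statistic lies just above the critical value, so the test narrowly fails to reject.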

`StatTbl` – Test summary, returned as a table with variables for the outputs `h`, `pValue`, `stat`, and `cValue`, and with a row for each test. `egcitest` returns `StatTbl` when you supply the input `Tbl`.

`StatTbl` contains variables for the test settings specified by `Lags`, `Alpha`, `Test`, `CReg`, and `RReg`.

`reg1` – Regression statistics from the cointegrating regression, returned as a structure array with the number of records equal to the number of tests.

Each element of `reg1` has the fields in this table. You can access a field using dot notation, for example, `reg1(3).coeff` contains the coefficient estimates of the third test.

| Field | Description |
| --- | --- |
| `num` | Length of input series with `NaN`s removed |
| `size` | Effective sample size, adjusted for lags and difference |
| `names` | Regression coefficient names |
| `coeff` | Estimated coefficient values |
| `se` | Estimated coefficient standard errors |
| `Cov` | Estimated coefficient covariance matrix |
| `tStats` | t statistics of coefficients and p-values |
| `FStat` | F statistic and p-value |
| `yMu` | Mean of the lag-adjusted input series |
| `ySigma` | Standard deviation of the lag-adjusted input series |
| `yHat` | Fitted values of the lag-adjusted input series |
| `res` | Regression residuals |
| `DWStat` | Durbin-Watson statistic |
| `SSR` | Regression sum of squares |
| `SSE` | Error sum of squares |
| `SST` | Total sum of squares |
| `MSE` | Mean square error |
| `RMSE` | Standard error of the regression |
| `RSq` | $R^2$ statistic |
| `aRSq` | Adjusted $R^2$ statistic |
| `LL` | Loglikelihood of data under Gaussian innovations |
| `AIC` | Akaike information criterion |
| `BIC` | Bayesian (Schwarz) information criterion |
| `HQC` | Hannan-Quinn information criterion |

`reg2` – Regression statistics from the residual regression, returned as a structure array with the number of records equal to the number of tests.

`reg2` has the same fields as `reg1`.

## Tips

• To draw valid inferences from the test, determine a suitable value for `Lags`. For more details, see the `adftest` Tips and the `pptest` Tips.

• Samples with fewer than approximately 20 to 40 observations (depending on the dimension of the data, `numDims`) can yield unreliable critical values, and therefore unreliable inferences. See [3].

• If a test result suggests that the time series are cointegrated, you can use the residuals as data for the error-correction term in a VEC representation of the variables. Follow this procedure:

1. Extract the residuals from the `reg1` output (`reg1.res`).

2. Estimate autoregressive model components using the `estimate` function of `varm`, and treat the extracted residual series as exogenous for estimation.
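The two steps might look like the sketch below, which is one possible arrangement rather than a prescribed workflow. It assumes the `Data_Canada` data set used in the examples on this page, that no rows were removed for missing values (so the residuals align with the rows of `Y`), and a first-order model in differences; the lag order and the names `ecLag`, `dY`, `Mdl`, and `EstMdl` are illustrative choices.

```matlab
% Sketch: use the cointegrating-regression residuals as an exogenous
% error-correction term when estimating a VAR in first differences.
load Data_Canada
Y = Data(:,3:end);                        % INT_S, INT_M, INT_L

% Step 1: extract the residuals from the cointegrating regression.
[~,~,~,~,reg1] = egcitest(Y,Test="t2");
res = reg1.res(:);

% Align the lagged residual e(t-1) with the difference y(t) - y(t-1).
dY    = diff(Y);                          % numObs-1 rows
ecLag = res(1:end-1);                     % numObs-1 rows

% Step 2: estimate a VAR(1) in differences with the lagged residual as
% exogenous data (the lag order 1 is an assumption for this sketch).
Mdl    = varm(size(Y,2),1);
EstMdl = estimate(Mdl,dY,X=ecLag);
```

In this arrangement, the estimated coefficients on the exogenous series, stored in `EstMdl.Beta`, play the role of the adjustment speeds toward the cointegrating relation.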

## Alternative Functionality

### App

The Econometric Modeler app enables you to conduct the Engle-Granger cointegration test.

## References

[1] Engle, R. F. and C. W. J. Granger. "Co-Integration and Error-Correction: Representation, Estimation, and Testing." Econometrica. Vol. 55, 1987, pp. 251–276.

[2] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[3] MacKinnon, J. G. "Numerical Distribution Functions for Unit Root and Cointegration Tests." Journal of Applied Econometrics. Vol. 11, 1996, pp. 601–618.

## Version History

Introduced in R2011a