kpsstest

KPSS test for stationarity

Syntax

h = kpsstest(y)

[h,pValue,stat,cValue] = kpsstest(y)

StatTbl = kpsstest(Tbl)

[___] = kpsstest(___,Name=Value)

[___,reg] = kpsstest(___)

Description

h = kpsstest(y) returns the rejection decision from conducting the Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test for a unit root in the input univariate time series.

[h,pValue,stat,cValue] = kpsstest(y) also returns the p-value pValue, test statistic stat, and critical value cValue of the test.

StatTbl = kpsstest(Tbl) returns a table containing variables for the test results, statistics, and settings from conducting the KPSS test for a unit root in the last variable of the input table or timetable Tbl. To select a different variable in Tbl to test, use the DataVariable name-value argument.

[___] = kpsstest(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. kpsstest returns the output argument combination for the corresponding input arguments.

Some options control the number of tests to conduct. The following conditions apply when kpsstest conducts multiple tests:

kpsstest treats each test as separate from all other tests.
If you specify y, all outputs are vectors.
If you specify Tbl, each row of StatTbl contains the results of the corresponding test.

For example, kpsstest(Tbl,DataVariable="GDP",Alpha=0.025,Lags=[0 1]) conducts two tests, at a level of significance of 0.025, for the presence of a unit root in the variable GDP of the table Tbl. The first test includes 0 autocovariance lags in the Newey-West estimator of the long-run variance and the second test includes 1 autocovariance lag.
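
The same rules apply when the input is a vector. The following is a minimal sketch using a simulated random walk (hypothetical data, not one of the data sets on this page). It conducts two tests by passing a two-element Lags vector together with a scalar Alpha, and then counts the elements of the output.

```matlab
rng(1)                          % fix the seed so the simulated data is reproducible
y = cumsum(randn(200,1));       % hypothetical random-walk series

% Lags has two elements, so kpsstest conducts two tests.
% The scalar Alpha applies to both tests.
[h,pValue,stat,cValue] = kpsstest(y,Lags=[0 1],Alpha=0.025);

numTests = numel(h)             % one element per test
```

Each of pValue, stat, and cValue has the same number of elements as h.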

[___,reg] = kpsstest(___) additionally returns reg, a structure of regression statistics for the hypothesis test.

Examples

Conduct KPSS Test on Vector of Data

Test a time series for a unit root using the default options of kpsstest. Input the time series data as a numeric vector.

Load the Nelson-Plosser macroeconomic series data set. Plot the real gross national product (RGNP).

load Data_NelsonPlosser
rgnp = DataTable.GNPR;
dt = datetime(dates,ConvertFrom="datenum");

plot(dt,rgnp)
title("Real Gross National Product")

Figure contains an axes object. The axes object with title Real Gross National Product contains an object of type line.

The series exhibits exponential growth.

Linearize the RGNP series.

linRGNP = log(rgnp);

Assess the null hypothesis of the KPSS test, which is that the series is trend stationary. Use default options.

h = kpsstest(linRGNP)

h = logical
   1

h = 1 indicates that, at a 5% level of significance, the test rejects the null hypothesis that the linearized Real GNP series is trend stationary, which suggests that the series is unit root nonstationary.

Return Test p-Value and Decision Statistics

Load the Nelson-Plosser macroeconomic series data set, and linearize the RGNP series.

load Data_NelsonPlosser
linRGNP = log(DataTable.GNPR);

Assess the null hypothesis that the series is trend stationary. Return the test decision, $p$-value, test statistic, and critical value.

[h,pValue,stats,cValue] = kpsstest(linRGNP)

h = logical
   1

pValue = 
0.0100

stats = 
0.6299

cValue = 
0.1460

Conduct KPSS Test on Table Variable

Test whether a time series, which is one variable in a table, is trend stationary using the default options.

Load the Nelson-Plosser macroeconomic series data set, which contains annual measurements of macroeconomic variables in the table DataTable. Linearize the RGNP series by applying the log transformation, and store the result in DataTable.

load Data_NelsonPlosser
DataTable.LinRGNP = log(DataTable.GNPR);
DataTable.Properties.VariableNames{end}

ans = 
'LinRGNP'

Test the null hypothesis that the linearized RGNP series is trend stationary.

StatTbl = kpsstest(DataTable)

StatTbl=1×7 table
                h      pValue     stat      cValue    Lags    Alpha    Trend
              _____    ______    _______    ______    ____    _____    _____

    Test 1    true      0.01     0.62989    0.146      0      0.05     true

kpsstest returns test results and settings in the table StatTbl, where variables correspond to test results (h, pValue, stat, and cValue) and settings (Lags, Alpha, Trend), and rows correspond to individual tests (in this case, kpsstest conducts one test).

By default, kpsstest tests the last variable in the table. To select a variable from an input table to test, set the DataVariable option.
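
As a sketch of the DataVariable option, the following code selects the linearized RGNP series once by name and once by its position in the table, and then compares the two result tables. The variable names byName, byIndex, and idx are introduced here for illustration only.

```matlab
load Data_NelsonPlosser
DataTable.LinRGNP = log(DataTable.GNPR);

% Select the series by variable name.
byName = kpsstest(DataTable,DataVariable="LinRGNP");

% Select the same series by its position among the table variables.
idx = find(strcmp(DataTable.Properties.VariableNames,"LinRGNP"));
byIndex = kpsstest(DataTable,DataVariable=idx);

sameResults = isequal(byName,byIndex)
```

Selecting by name is usually more robust, because an index changes when variables are added to or removed from the table.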

Specify Lags for Newey-West Estimator by Testing Up

Conduct multiple tests on the linearized RGNP series that reproduce the first row of the second half of Table 5 in [2].

Load the Nelson-Plosser macroeconomic series data set, which contains annual measurements of macroeconomic variables in the table DataTable. Apply the log transformation to all variables in the table.

load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);
LogDT.Properties.VariableNames{end}

ans = 
'log_SP'

varfun applies log to all variables in DataTable, prepends log_ to all transformed variable names, and stores the result in the table LogDT. The final variable is the log of the stock price index series (SP).

Assess the null hypothesis that the linearized RGNP series is trend stationary over a range of lags. Specify the variable name of the linearized RGNP series log_GNPR.

lags = (0:8);
StatTbl = kpsstest(LogDT,DataVariable="log_GNPR",Lags=lags)

StatTbl=9×7 table
                h       pValue      stat      cValue    Lags    Alpha    Trend
              _____    ________    _______    ______    ____    _____    _____

    Test 1    true         0.01    0.62989    0.146      0      0.05     true 
    Test 2    true         0.01    0.33666    0.146      1      0.05     true 
    Test 3    true         0.01    0.24209    0.146      2      0.05     true 
    Test 4    true       0.0169     0.1976    0.146      3      0.05     true 
    Test 5    true     0.027579    0.17291    0.146      4      0.05     true 
    Test 6    true      0.04015    0.15782    0.146      5      0.05     true 
    Test 7    true     0.048417     0.1479    0.146      6      0.05     true 
    Test 8    false     0.05886    0.14122    0.146      7      0.05     true 
    Test 9    false    0.066757    0.13695    0.146      8      0.05     true

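The tests corresponding to 0 $\leq$ lags $\leq$ 2 return the minimum tabulated $p$-value of 0.01, so their actual $p$-values are at most 0.01. For all lags below 7, the tests reject the null hypothesis at the default 5% level, which suggests that log RGNP is unit root nonstationary rather than trend stationary. For lags 7 and 8, the tests fail to reject the null hypothesis, so the conclusion is sensitive to the number of lags.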
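
Because StatTbl is an ordinary table, the lags at which the decision changes can be extracted with logical indexing. This is a small sketch; the names rejectLags and keepLags are introduced here for illustration.

```matlab
load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);
StatTbl = kpsstest(LogDT,DataVariable="log_GNPR",Lags=0:8);

rejectLags = StatTbl.Lags(StatTbl.h)'     % lags at which the test rejects the null
keepLags = StatTbl.Lags(~StatTbl.h)'      % lags at which the test fails to reject
```

Given the table displayed in this example, rejectLags contains the lags 0 through 6 and keepLags contains the lags 7 and 8.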

Select Newey-West Estimator Lags Using Sample Size

Test whether the wage series in the manufacturing sector (1900–1970) has a unit root. Use the advice in [2] to select the number of lags in the Newey-West estimator of the long-run variance.

Load the Nelson-Plosser macroeconomic data set. Remove all rows of the table in which the wage series WN has a missing value.

load Data_NelsonPlosser
[DataTable,idx] = rmmissing(DataTable,DataVariables="WN");
dt = dates(~idx);

Compute the effective sample size $T$ and its square root, where the latter is approximately the number of lags recommended for the Newey-West estimator.

T = height(DataTable);
sqrtT = sqrt(T);
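
The square root of T is rarely an integer, so some rounding rule is needed before the value can be used for Lags. The sketch below shows one hypothetical rule, which this page does not prescribe: bracket the square root by one lag on each side. It assumes T = 71, which is what annual data for 1900 to 1970 gives when no other observations are missing.

```matlab
T = 71;                  % assumed sample size: annual observations, 1900 to 1970
sqrtT = sqrt(T);         % approximately 8.43

% Hypothetical rule: one lag below floor(sqrtT) through one lag above ceil(sqrtT).
lagRange = (floor(sqrtT)-1):(ceil(sqrtT)+1)
```

Under that assumption, lagRange is 7:10.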

Plot the wage series.

plot(dt,DataTable.WN)
title("Wages")

Figure contains an axes object. The axes object with title Wages contains an object of type line.

The wage series appears to grow exponentially.

Linearize the wages series by applying the log transformation to all variables in the table.

LogDT = varfun(@log,DataTable);
plot(dt,LogDT.log_WN)
title("Log Wages")

Figure contains an axes object. The axes object with title Log Wages contains an object of type line.

The log wage series appears to have a linear trend.

Test the null hypothesis that the log wage series is trend stationary (no unit root) against the alternative hypothesis that the log wage series is difference stationary. Conduct the test by setting a range of lags for the Newey-West estimator around $\sqrt{T}$.

StatTbl = kpsstest(LogDT,DataVariable="log_WN",Lags=7:10)

StatTbl=4×7 table
                h      pValue      stat      cValue    Lags    Alpha    Trend
              _____    ______    ________    ______    ____    _____    _____

    Test 1    false     0.1       0.10678    0.146       7     0.05     true 
    Test 2    false     0.1       0.10074    0.146       8     0.05     true 
    Test 3    false     0.1      0.096634    0.146       9     0.05     true 
    Test 4    false     0.1      0.094058    0.146      10     0.05     true

All tests fail to reject the null hypothesis that the log wages series is trend stationary.

The reported $p$-values equal 0.1, the largest tabulated value, so the actual $p$-values are at least 0.1. The software compares the test statistic to critical values and interpolates $p$-values from tables in [2].

Inspect Regression Statistics

Load the Nelson-Plosser macroeconomic series data set. Apply the log transformation to all variables in the table.

load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);

Assess the null hypothesis that the linearized RGNP series is trend stationary. Use the Trend option to conduct the test with (true) and without (false) a deterministic time trend term in the response model. Return the regression statistics.

[~,reg] = kpsstest(LogDT,DataVariable="log_GNPR",Trend=[true false]);

reg is a structure array of length 2 with fields that store the OLS regression results. Each element corresponds to a test.

Compare the coefficient estimates.

withTrend = reg(1).coeff

withTrend = 2×1

    4.5834
    0.0310

woTrend = reg(2).coeff

woTrend = 
5.5595

For the first test, the response model includes a trend term, so the regression coefficients withTrend contain the model intercept under the null hypothesis (4.5834) and the coefficient of the time trend (0.0310). For the second test, the response model includes only an intercept, so woTrend is the intercept estimate 5.5595.

Display the coefficient standard errors for the first test.

reg(1).se

ans = 2×1

    0.0344
    0.0010

The Lags option includes autocovariance lags in the Newey-West estimator of the long-run variance. Therefore, the option does not affect the estimated OLS coefficients, standard errors, or MSE.

Conduct a KPSS test for each lag from 0 through 4. Compare the standard OLS and the Newey-West estimates.

lags = 0:4;
[~,regLags] = kpsstest(LogDT,DataVariable="log_GNPR",Lags=lags);

coeffs = table(regLags.coeff,VariableNames="Lags_"+lags, ...
    RowNames=["Intercept" "Trend"]);
se = table(regLags.se,VariableNames="Lags_"+lags, ...
    RowNames=["SE_Intercept" "SE_Trend"]);
mse = table(regLags.MSE,VariableNames="Lags_"+lags, ...
    RowNames="MSE");
nw = table(regLags.NWEst,VariableNames="Lags_"+lags, ...
    RowNames="NWVar");
[coeffs; se; mse; nw]

ans=6×5 table
                      Lags_0        Lags_1        Lags_2        Lags_3        Lags_4  
                    __________    __________    __________    __________    __________

    Intercept           4.5834        4.5834        4.5834        4.5834        4.5834
    Trend             0.030988      0.030988      0.030988      0.030988      0.030988
    SE_Intercept       0.03443       0.03443       0.03443       0.03443       0.03443
    SE_Trend        0.00095035    0.00095035    0.00095035    0.00095035    0.00095035
    MSE               0.017933      0.017933      0.017933      0.017933      0.017933
    NWVar             0.017354       0.03247      0.045154      0.055321      0.063222

Input Arguments

`y` — Univariate time series data
numeric vector

Univariate time series data, specified as a numeric vector. Each element of y represents an observation.

Data Types: double

`Tbl` — Time series data
table | timetable

Time series data, specified as a table or timetable. Each row of Tbl is an observation.

Specify a single series (variable) to test by using the DataVariable argument. The selected variable must be numeric.

Note

kpsstest removes missing observations, represented by NaN values, from the input series.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: kpsstest(Tbl,DataVariable="GDP",Alpha=0.025,Lags=[0 1]) conducts two tests, at a level of significance of 0.025, for the presence of a unit root in the variable GDP of the table Tbl. The first test includes 0 autocovariance lags in the Newey-West estimator of the long-run variance and the second test includes 1 autocovariance lag.

`Lags` — Number of autocovariance lags
0 (default) | nonnegative integer | vector of nonnegative integers

Number of autocovariance lags to include in the Newey-West estimator of the long-run variance, specified as a nonnegative integer or vector of nonnegative integers. If Lags(j) > 0, kpsstest includes lags 1 through Lags(j) in the estimator for test j.

kpsstest conducts a separate test for each element in Lags.

Example: Lags=0:2 includes zero lagged autocovariance terms in the Newey-West estimator for the first test, the lag 1 autocovariance term for the second test, and autocovariance lags 1 and 2 in the third test.

Data Types: double

`Trend` — Flag for including deterministic trend term δt
`true` (default) | `false` | logical vector

Flag for including deterministic trend δt in the model, specified as a logical scalar or vector.

kpsstest conducts a separate test for each element in Trend.

Example: Trend=false excludes δt from the response model for all tests.

Data Types: logical

`Alpha` — Significance level
0.05 (default) | numeric scalar | numeric vector

Significance level for the hypothesis test, specified as a numeric scalar or vector with entries between 0.01 and 0.10.

kpsstest conducts a separate test for each element in Alpha.

Example: Alpha=[0.01 0.05] uses a level of significance of 0.01 for the first test, and then uses a level of significance of 0.05 for the second test.

Data Types: double

`DataVariable` — Variable in `Tbl` to test
last variable (default) | string scalar | character vector | integer | logical vector

Variable in Tbl to test, specified as a string scalar or character vector containing a variable name in Tbl.Properties.VariableNames, or an integer or logical vector representing the index of a name. The selected variable must be numeric.

Example: DataVariable="GDP"

Example: DataVariable=[false true false false] or DataVariable=2 tests the second table variable.

Data Types: double | logical | char | string

Note

When kpsstest conducts multiple tests, the function applies all single settings (scalars or character vectors) to each test.
All vector-valued specifications that control the number of tests must have equal length.
If you specify the vector y and any value is a row vector, all outputs are row vectors.
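
Because vector-valued settings must have equal length, testing every combination of two settings requires expanding them into matching vectors first. The following sketch does this with ndgrid for three lag counts and both trend choices. The variable names are introduced here for illustration, and Tbl in the comment is a placeholder.

```matlab
% Hypothetical settings to combine: lags 0, 1, 2 and trend on (1) or off (0).
[lagGrid,trendGrid] = ndgrid(0:2,[1 0]);

lags  = lagGrid(:)';              % [0 1 2 0 1 2]
trend = logical(trendGrid(:)');   % [1 1 1 0 0 0] as logical values

% Both vectors have six elements, so a call such as
%   StatTbl = kpsstest(Tbl,Lags=lags,Trend=trend)
% would conduct six tests, one for each combination.
```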

Output Arguments

`h` — Test rejection decisions
logical scalar | logical vector

Test rejection decisions, returned as a logical scalar or vector with length equal to the number of tests. kpsstest returns h when you supply the input y.

Values of 1 indicate rejection of the trend-stationary null hypothesis in favor of the unit root alternative.
Values of 0 indicate failure to reject the trend-stationary null hypothesis.

`pValue` — Test statistic p-values
numeric scalar | numeric vector

Test statistic p-values, returned as a numeric scalar or vector with length equal to the number of tests. kpsstest returns pValue when you supply the input y.

The p-values are right-tail probabilities.

When test statistics are outside tabulated critical values, kpsstest returns maximum (0.10) or minimum (0.01) p-values.
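
A returned p-value of exactly 0.01 or 0.10 is therefore a bound rather than a point estimate. As a sketch, the following code flags such values, using the p-values copied from the nine-lag example table on this page. The names atMin, atMax, and interpolated are introduced here for illustration.

```matlab
% p-values copied from the "Specify Lags for Newey-West Estimator by Testing Up" example.
pValue = [0.01 0.01 0.01 0.0169 0.027579 0.04015 0.048417 0.05886 0.066757];

atMin = pValue <= 0.01;            % actual p-value is at most 0.01
atMax = pValue >= 0.10;            % actual p-value is at least 0.10
interpolated = ~(atMin | atMax);   % p-value lies inside the tabulated range

numAtMin = nnz(atMin)
numAtMax = nnz(atMax)
```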

`stat` — Test statistics
numeric scalar | numeric vector

Test statistics, returned as a numeric scalar or vector with length equal to the number of tests. kpsstest returns stat when you supply the input y.

kpsstest computes test statistics by using an ordinary least squares (OLS) regression (for more details, see KPSS Test).

If you set Trend=false, kpsstest regresses y on an intercept.
Otherwise, kpsstest regresses y on an intercept and trend term.

`cValue` — Critical values
numeric scalar | numeric vector

Critical values, returned as a numeric scalar or vector with length equal to the number of tests. kpsstest returns cValue when you supply the input y.

Critical values are for right-tail probabilities.

`StatTbl` — Test summary
table

Test summary, returned as a table with variables for the outputs h, pValue, stat, and cValue, and with a row for each test. kpsstest returns StatTbl when you supply the input Tbl.

StatTbl contains variables for the test settings specified by Lags, Alpha, and Trend.

`reg` — Regression statistics
structure array

Regression statistics for OLS estimation of the coefficients in the model, returned as a structure array with the number of records equal to the number of tests.

Each element of reg has the fields in this table. You can access a field using dot notation, for example, reg(1).coeff contains the coefficient estimates of the first test.

Field	Description
`num`	Length of input series with `NaN`s removed
`size`	Effective sample size T, adjusted for lags
`names`	Regression coefficient names
`coeff`	Estimated coefficient values
`se`	Estimated coefficient standard errors
`Cov`	Estimated coefficient covariance matrix
`tStats`	t statistics of coefficients and p-values
`FStat`	F statistic and p-value
`yMu`	Mean of the lag-adjusted input series
`ySigma`	Standard deviation of the lag-adjusted input series
`yHat`	Fitted values of the lag-adjusted input series
`res`	Regression residuals
`autoCov`	Estimated residual autocovariances
`NWEst`	Newey-West estimate of the long-run variance
`DWStat`	Durbin-Watson statistic
`SSR`	Regression sum of squares
`SSE`	Error sum of squares
`SST`	Total sum of squares
`MSE`	Mean square error
`RMSE`	Standard error of the regression
`RSq`	R² statistic
`aRSq`	Adjusted R² statistic
`LL`	Loglikelihood of data under Gaussian innovations
`AIC`	Akaike information criterion
`BIC`	Bayesian (Schwarz) information criterion
`HQC`	Hannan-Quinn information criterion

More About

Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) Test

The KPSS test assesses the null hypothesis that a univariate time series is trend stationary against the alternative that it is a nonstationary unit root process.

The test uses the structural model

$\begin{array}{l} y_{t} = c_{t} + \delta t + u_{1 t} \\ c_{t} = c_{t - 1} + u_{2 t}, \end{array}$

where

δ is the trend coefficient (see the Trend argument).
u_1t is a stationary process.
u_2t is an independent and identically distributed process with mean 0 and variance σ².

The null hypothesis is that σ² = 0, which implies that the random walk term (c_t) is constant and acts as the model intercept. The alternative hypothesis is that σ² > 0, which introduces the unit root in the random walk.
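
To make the two hypotheses concrete, the following sketch simulates one series from the structural model under each of them. All parameter values are hypothetical, and u_1t is taken to be Gaussian white noise, which is only one of many stationary processes the model allows.

```matlab
rng(0)                            % fix the seed for reproducibility
T = 200;                          % hypothetical sample size
t = (1:T)';
delta = 0.03;                     % hypothetical trend coefficient
c0 = 4.5;                         % hypothetical initial level c_0
u1 = 0.1*randn(T,1);              % stationary errors u_1t (white noise here)

% Null hypothesis: sigma^2 = 0, so c_t stays at c0 and y is trend stationary.
yNull = c0 + delta*t + u1;

% Alternative: sigma^2 > 0, so c_t is a random walk and y has a unit root.
sigma = 0.2;                      % hypothetical standard deviation of u_2t
c = c0 + cumsum(sigma*randn(T,1));
yAlt = c + delta*t + u1;
```

Passing yNull and yAlt to kpsstest is one way to see how the test behaves on series whose data-generating process is known.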

An OLS regression of y_t onto X_t yields the residual series {e_t}, where X_t has one of the following forms:

X_t = 1 for all t when Trend is false.
X_t = [1 t] when Trend is true.

The test statistic is

$\frac{\sum_{t = 1}^{T} S_{t}^{2}}{s^{2} T^{2}},$

where

T is the effective sample size.
s² is the Newey-West estimate of the long-run variance.
S_t = e₁ + e₂ + … + e_t is the partial sum of the residuals through time t.
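
The statistic can be computed by hand with base MATLAB functions. The sketch below does so for a four-point toy series (hypothetical data chosen so that the arithmetic is easy to check). It takes the partial sums of the OLS residuals and uses the Bartlett-weighted long-run variance estimator described in [2]. The internal computations of kpsstest might differ in details, so treat this as an illustration of the formula, not a reimplementation of the function.

```matlab
y = [1; 2; 3; 4];        % hypothetical toy series
numLags = 0;             % autocovariance lags in the long-run variance estimate
includeTrend = false;    % false: intercept only; true: intercept and time trend

T = numel(y);
t = (1:T)';
if includeTrend
    X = [ones(T,1) t];
else
    X = ones(T,1);
end

e = y - X*(X\y);         % OLS residuals e_t
S = cumsum(e);           % partial sums of the residuals

% Long-run variance estimate with Bartlett weights 1 - j/(numLags + 1).
s2 = (e'*e)/T;
for j = 1:numLags
    w = 1 - j/(numLags + 1);
    s2 = s2 + 2*w*(e(j+1:end)'*e(1:end-j))/T;
end

stat = sum(S.^2)/(s2*T^2)
```

For this toy series, the residuals are -1.5, -0.5, 0.5, and 1.5, the squared partial sums add up to 8.5, and s2 is 1.25, so stat is 8.5/(1.25 × 16) = 0.425.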

Tips

To draw valid inferences from a KPSS test, you must determine a suitable value for the Lags argument. The following methods can determine a suitable number of lags:
- Begin with a small number of lags, and then evaluate the sensitivity of the results by adding more lags.
- Kwiatkowski et al. [2] suggest that a number of lags on the order of $\sqrt{T}$, where T is the effective sample size, is often satisfactory under both the null and the alternative.
For consistency of the Newey-West estimator, the number of lags must approach infinity as the sample size increases.
With a specific testing strategy in mind, determine the value of the Trend argument by the growth characteristics of the input time series.
- If the input series grows, include a trend term by setting Trend to true (default). This setting provides a reasonable comparison of a trend stationary null and a unit root process with drift.
- If a series does not exhibit long-term growth characteristics, exclude a trend term by setting Trend to false.

Algorithms

Test statistics follow nonstandard distributions under the null, even asymptotically. Kwiatkowski et al. [2] use Monte Carlo simulations, for models with and without a trend, to tabulate asymptotic critical values for a standard set of significance levels between 0.01 and 0.1. kpsstest interpolates critical values and p-values from these tables.
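
The following sketch illustrates the table lookup. It makes two assumptions that this page does not confirm: the critical values are the ones commonly quoted from [2] for the model with a trend, and the interpolation between table entries is linear. The statistic 0.1976 comes from the lag 3 test in the "Specify Lags for Newey-West Estimator by Testing Up" example.

```matlab
% Assumed asymptotic critical values for the model with a trend, as commonly
% quoted from Kwiatkowski et al. [2]. Check them against the paper before use.
alphaTab = [0.10 0.05 0.025 0.01];       % significance levels
cvTrend  = [0.119 0.146 0.176 0.216];    % matching critical values

stat = 0.1976;                           % statistic from the lag 3 test

% Clamp the statistic to the tabulated range so that values outside the
% table map to the boundary p-values 0.10 and 0.01.
statClamped = min(max(stat,cvTrend(1)),cvTrend(end));

% Linear interpolation between neighboring entries (assumed scheme).
pApprox = interp1(cvTrend,alphaTab,statClamped)

% The same table, read in the other direction, gives a critical value.
cApprox = interp1(alphaTab,cvTrend,0.05)
```

Under these assumptions, pApprox is 0.0169 and cApprox is 0.146, which agree with the pValue and cValue entries reported for that test in the example table.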

References

[1] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[2] Kwiatkowski, D., P. C. B. Phillips, P. Schmidt, and Y. Shin. "Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root." Journal of Econometrics. Vol. 54, 1992, pp. 159–178.

[3] Newey, W. K., and K. D. West. "A Simple, Positive Semidefinite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix." Econometrica. Vol. 55, 1987, pp. 703–708.

Version History

Introduced in R2009b
