infer
Infer vector error-correction (VEC) model innovations
Description
Tbl2 = infer(Mdl,Tbl1) returns the table or timetable Tbl2 containing the multivariate residuals from evaluating the fully specified VEC(p – 1) model Mdl at the response variables in the table or timetable of data Tbl1. (since R2022b)
infer selects the variables in Mdl.SeriesNames or all variables in Tbl1. To select different response variables in Tbl1 at which to evaluate the model, use the ResponseVariables name-value argument.
___ = infer(___,Name,Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. infer returns the output argument combination for the corresponding input arguments. For example, infer(Mdl,Y,Y0=PS,X=Exo) computes the residuals of the VEC(p – 1) model Mdl at the matrix of response data Y, and specifies the matrix of presample response data PS and the matrix of exogenous predictor data Exo.
Supply all input data using the same data type. Specifically:
If you specify the numeric matrix Y, optional data sets must be numeric arrays and you must use the appropriate name-value argument. For example, to specify a presample, set the Y0 name-value argument to a numeric matrix of presample data.
If you specify the table or timetable Tbl1, optional data sets must be tables or timetables, respectively, and you must use the appropriate name-value argument. For example, to specify a presample, set the Presample name-value argument to a table or timetable of presample data.
Examples
Infer VEC Model Innovations From Matrix of Response Data
Consider a VEC model for the following seven macroeconomic series, and then fit the model to a matrix of response data.
Gross domestic product (GDP)
GDP implicit price deflator
Paid compensation of employees
Nonfarm business sector hours of all persons
Effective federal funds rate
Personal consumption expenditures
Gross private domestic investment
Suppose that a cointegrating rank of 4 and one short-run term are appropriate, that is, consider a VEC(1) model.
Load the Data_USEconVECModel data set.
load Data_USEconVECModel
For more information on the data set and variables, enter Description at the command line.
Determine whether the data needs to be preprocessed by plotting the series on separate plots.
figure
tiledlayout(2,2)
nexttile
plot(FRED.Time,FRED.GDP)
title("Gross Domestic Product")
ylabel("Index")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.GDPDEF)
title("GDP Deflator")
ylabel("Index")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.COE)
title("Paid Compensation of Employees")
ylabel("Billions of $")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.HOANBS)
title("Nonfarm Business Sector Hours")
ylabel("Index")
xlabel("Date")

figure
tiledlayout(2,2)
nexttile
plot(FRED.Time,FRED.FEDFUNDS)
title("Federal Funds Rate")
ylabel("Percent")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.PCEC)
title("Consumption Expenditures")
ylabel("Billions of $")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.GPDI)
title("Gross Private Domestic Investment")
ylabel("Billions of $")
xlabel("Date")
Stabilize all series, except the federal funds rate, by applying the log transform. Scale the resulting series by 100 so that all series are on the same scale.
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
Create a VEC(1) model using the shorthand syntax. Specify the variable names.
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames
Mdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model with Linear Time Trend"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [7×1 vector of NaNs]
              Adjustment: [7×4 matrix of NaNs]
           Cointegration: [7×4 matrix of NaNs]
                  Impact: [7×7 matrix of NaNs]
   CointegrationConstant: [4×1 vector of NaNs]
      CointegrationTrend: [4×1 vector of NaNs]
                ShortRun: {7×7 matrix of NaNs} at lag [1]
                   Trend: [7×1 vector of NaNs]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix of NaNs]

Mdl is a vecm model object. All properties containing NaN values correspond to parameters to be estimated given data.
Estimate the model by supplying a matrix of data. Use default options.
EstMdl = estimate(Mdl,FRED.Variables)
EstMdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [14.1329 8.77841 -7.20359 ... and 4 more]'
              Adjustment: [7×4 matrix]
           Cointegration: [7×4 matrix]
                  Impact: [7×7 matrix]
   CointegrationConstant: [-28.6082 109.555 -77.0912 ... and 1 more]'
      CointegrationTrend: [4×1 vector of zeros]
                ShortRun: {7×7 matrix} at lag [1]
                   Trend: [7×1 vector of zeros]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix]

EstMdl is an estimated vecm model object. It is fully specified because all parameters have known values. By default, estimate imposes the constraints of the H1 Johansen VEC model form by removing the cointegrating trend and linear trend terms from the model. Parameter exclusion from estimation is equivalent to imposing equality constraints to zero.
Infer innovations (the residuals from the model fit) from the estimated model. Supply the matrix of in-sample data.
E = infer(EstMdl,FRED.Variables);
E is a 238-by-7 matrix of inferred innovations. Columns correspond to the variable names in EstMdl.SeriesNames.
Alternatively, you can return residuals when you call estimate by supplying an output variable in the fourth position.
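A minimal sketch of that alternative follows. It assumes Econometrics Toolbox and the Data_USEconVECModel data set are available, and it repeats the preprocessing on a numeric copy of the data so that it runs on its own. The names Y, logCols, and E2 are illustrative and are not part of the example above.

```matlab
% Sketch: request residuals as the fourth output of estimate.
% Assumes Econometrics Toolbox and the Data_USEconVECModel data set.
load Data_USEconVECModel
Y = FRED.Variables;                    % numeric copy of the seven series
logCols = [1 2 3 4 6 7];               % every series except FEDFUNDS (column 5)
Y(:,logCols) = 100*log(Y(:,logCols));

Mdl = vecm(7,4,1);                     % 7 series, rank 4, 1 short-run lag
[EstMdl,~,~,E2] = estimate(Mdl,Y);     % skip the second and third outputs

% E2 has one column per response series. By default, estimate uses the
% first EstMdl.P observations as a presample, so E2 has numobs - P rows.
size(E2)
```

Requesting the residuals from estimate avoids a second pass over the data when you need only the in-sample residuals of the fit.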
Plot the residuals on separate plots. Synchronize the residuals with the dates by removing the first EstMdl.P dates.
idx = FRED.Time((EstMdl.P + 1):end);
titles = "Residuals: " + EstMdl.SeriesNames;
figure
tiledlayout(2,2)
for j = 1:4
    nexttile
    plot(idx,E(:,j))
    hold on
    yline(0,"r--")
    hold off
    title(titles(j))
end

figure
tiledlayout(2,2)
for j = 5:7
    nexttile
    plot(idx,E(:,j))
    hold on
    yline(0,"r--")
    hold off
    title(titles(j))
end
The residuals corresponding to the federal funds rate exhibit heteroscedasticity.
Infer VEC Model Innovations From Timetable of Response Data
Since R2022b
Consider a VEC model for the following seven macroeconomic series, and then fit the model to a timetable of response data. This example is based on Infer VEC Model Innovations From Matrix of Response Data.
Load and Preprocess Data
Load the Data_USEconVECModel data set. Apply the same log transform and scaling as in the matrix example.
load Data_USEconVECModel
DTT = FRED;
DTT.GDP = 100*log(DTT.GDP);
DTT.GDPDEF = 100*log(DTT.GDPDEF);
DTT.COE = 100*log(DTT.COE);
DTT.HOANBS = 100*log(DTT.HOANBS);
DTT.PCEC = 100*log(DTT.PCEC);
DTT.GPDI = 100*log(DTT.GPDI);
Prepare Timetable for Estimation
When you plan to supply a timetable directly to estimate, you must ensure it has all the following characteristics:
All selected response variables are numeric and do not contain any missing values.
The timestamps in the Time variable are regular, and they are ascending or descending.
Remove all missing values from the table.
DTT = rmmissing(DTT);
numobs = height(DTT)
numobs = 240
DTT does not contain any missing values.
Determine whether the sampling timestamps have a regular frequency and are sorted.
areTimestampsRegular = isregular(DTT,"quarters")
areTimestampsRegular = logical
0
areTimestampsSorted = issorted(DTT.Time)
areTimestampsSorted = logical
1
areTimestampsRegular = 0 indicates that the timestamps of DTT are irregular. areTimestampsSorted = 1 indicates that the timestamps are sorted. Macroeconomic series in this example are timestamped at the end of the month. This quality induces an irregularly measured series.
Remedy the time irregularity by shifting all dates to the first day of the quarter.
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
DTT is regular with respect to time.
Create Model Template for Estimation
Create a VEC(1) model using the shorthand syntax. Specify the variable names.
Mdl = vecm(7,4,1);
Mdl.SeriesNames = DTT.Properties.VariableNames;
Mdl is a vecm model object. All properties containing NaN values correspond to parameters to be estimated given data.
Fit Model to Data
Estimate the model by supplying the timetable of data DTT. By default, because the number of variables in Mdl.SeriesNames is the number of variables in DTT, estimate fits the model to all the variables in DTT.
EstMdl = estimate(Mdl,DTT);
EstMdl is an estimated vecm model object.
Compute Residuals
Infer innovations (the residuals from the model fit) from the estimated model. Supply the timetable of in-sample data DTT. By default, because the number of variables in Mdl.SeriesNames is the number of variables in DTT, infer selects all the variables in DTT from which to compute residuals.
Tbl = infer(EstMdl,DTT);
head(Tbl)
Time           GDP       GDPDEF    COE       HOANBS    FEDFUNDS    PCEC      GPDI      GDP_Residuals    GDPDEF_Residuals    COE_Residuals    HOANBS_Residuals    FEDFUNDS_Residuals    PCEC_Residuals    GPDI_Residuals
___________    ______    ______    ______    ______    ________    ______    ______    _____________    ________________    _____________    ________________    __________________    ______________    ______________
01-Jul-1957    617.44    281.55    558.01    399.59    3.47        566.71    437.32    0.12076          0.090979            -0.31114         -0.47341            -0.013177             0.14899           1.1764
01-Oct-1957    616.48    281.61    557.48    397.5     2.98        567.26    426.27    -2.4005          -0.39287            -2.1158          -2.1552             -0.86464              -0.89017          -12.289
01-Jan-1958    614.93    282.68    556.15    395.21    1.2         567.09    420.02    -2.0142          0.92195             -1.5874          -1.1852             -1.3247               -0.72797          -4.4964
01-Apr-1958    615.87    282.97    556.03    393.76    0.93        568.09    417.59    0.2131           -0.39586            -0.22658         -0.070487           -0.24993              0.17697           -0.31486
01-Jul-1958    618.76    283.57    558.99    394.95    1.76        569.81    427.67    2.0866           0.45876             2.4738           1.9098              0.98197               1.0195            9.119
01-Oct-1958    621.54    284.04    560.84    396.43    2.42        571.11    438.2     0.68671          0.053454            0.48556          0.63518             0.23659               -0.21548          4.2428
01-Jan-1959    623.66    284.31    563.55    398.35    2.8         573.62    442.12    0.39546          -0.066055           0.97292          1.0224              -0.054929             0.86153           0.68805
01-Apr-1959    626.19    284.46    565.91    400.24    3.39        575.54    449.31    0.24314          -0.22217            0.33889          0.4216              -0.20457              0.26963           -0.15985
size(Tbl)
ans = 1×2
238 14
Tbl is a 238-by-14 timetable of in-sample data in DTT and estimated model residuals. Residual variable names are appended with _Residuals, for example, GDP_Residuals.
Alternatively, you can return residuals when you call estimate by supplying an output variable in the fourth position.
Infer Innovations from Model Containing Regression Component
Since R2022b
Consider the model and data in Infer VEC Model Innovations From Matrix of Response Data.
Load Data
Load the Data_USEconVECModel data set.
load Data_USEconVECModel
The Data_Recessions data set contains the beginning and ending serial dates of recessions. Load this data set. Convert the matrix of date serial numbers to a datetime array.
load Data_Recessions
dtrec = datetime(Recessions,ConvertFrom="datenum");
Preprocess Data
Remove the exponential trend from the series, and then scale them by a factor of 100.
DTT = FRED;
DTT.GDP = 100*log(DTT.GDP);
DTT.GDPDEF = 100*log(DTT.GDPDEF);
DTT.COE = 100*log(DTT.COE);
DTT.HOANBS = 100*log(DTT.HOANBS);
DTT.PCEC = 100*log(DTT.PCEC);
DTT.GPDI = 100*log(DTT.GPDI);
Create a dummy variable that identifies periods in which the U.S. was in a recession or worse. Specifically, the variable should be 1 if DTT.Time occurs during a recession, and 0 otherwise. Include the variable with the data in DTT.
isin = @(x)(any(dtrec(:,1) <= x & x <= dtrec(:,2)));
DTT.IsRecession = double(arrayfun(isin,DTT.Time));
Prepare Timetable for Estimation
Remove all missing values from the table.
DTT = rmmissing(DTT);
To make the series regular, shift all dates to the first day of the quarter.
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
DTT is regular with respect to time.
Create Model Template for Estimation
Create a VEC(1) model using the shorthand syntax. Assume that the appropriate cointegration rank is 4. You do not have to specify the presence of a regression component when creating the model. Specify the variable names.
Mdl = vecm(7,4,1);
Mdl.SeriesNames = DTT.Properties.VariableNames(1:end-1);
Fit Model to Data
Estimate the model using the entire sample. Specify the predictor identifying whether the observation was measured during a recession.
EstMdl = estimate(Mdl,DTT,PredictorVariables="IsRecession");
Compute Residuals
Infer innovations from the estimated model. Supply the predictor data. Return the loglikelihood objective function value.
[Tbl,logL] = infer(EstMdl,DTT,PredictorVariables="IsRecession");
head(Tbl)
Time           GDP       GDPDEF    COE       HOANBS    FEDFUNDS    PCEC      GPDI      IsRecession    GDP_Residuals    GDPDEF_Residuals    COE_Residuals    HOANBS_Residuals    FEDFUNDS_Residuals    PCEC_Residuals    GPDI_Residuals
___________    ______    ______    ______    ______    ________    ______    ______    ___________    _____________    ________________    _____________    ________________    __________________    ______________    ______________
01-Jul-1957    617.44    281.55    558.01    399.59    3.47        566.71    437.32    1              1.1766           0.1075              0.3528           0.15201             0.50983               0.75164           5.1297
01-Oct-1957    616.48    281.61    557.48    397.5     2.98        567.26    426.27    1              -1.2589          -0.375              -1.3979          -1.479              -0.29912              -0.23854          -8.014
01-Jan-1958    614.93    282.68    556.15    395.21    1.2         567.09    420.02    1              -1.2841          0.93338             -1.1283          -0.7527             -0.96303              -0.31126          -1.7628
01-Apr-1958    615.87    282.97    556.03    393.76    0.93        568.09    417.59    0              -0.30176         -0.40391            -0.55035         -0.37547            -0.50497              -0.11691          -2.2427
01-Jul-1958    618.76    283.57    558.99    394.95    1.76        569.81    427.67    0              1.872            0.4554              2.3388           1.7826              0.87564               0.89695           8.3152
01-Oct-1958    621.54    284.04    560.84    396.43    2.42        571.11    438.2     0              0.74477          0.054362            0.52207          0.66957             0.26535               -0.18234          4.4602
01-Jan-1959    623.66    284.31    563.55    398.35    2.8         573.62    442.12    0              0.52785          -0.063984           1.0562           1.1008              0.01065               0.93709           1.1838
01-Apr-1959    626.19    284.46    565.91    400.24    3.39        575.54    449.31    0              0.40825          -0.21958            0.44272          0.5194              -0.12278              0.36387           0.45836
logL
logL = -1.4656e+03
Tbl is a 238-by-15 timetable of in-sample data in DTT and inferred innovations (variable names appended with _Residuals).
Plot the residuals on separate plots. Because Tbl contains only the rows that follow the first Mdl.P presample dates, the times in Tbl.Time are already synchronized with the residuals.
idx = endsWith(Tbl.Properties.VariableNames,"_Residuals");
resnames = Tbl.Properties.VariableNames(idx);
titles = "Residuals: " + EstMdl.SeriesNames;
figure
tiledlayout(2,2)
for j = 1:4
    nexttile
    plot(Tbl.Time,Tbl{:,resnames(j)})
    hold on
    yline(0,"r--")
    hold off
    title(titles(j))
end

figure
tiledlayout(2,2)
for j = 5:7
    nexttile
    plot(Tbl.Time,Tbl{:,resnames(j)})
    hold on
    yline(0,"r--")
    hold off
    title(titles(j))
end
The residuals corresponding to the federal funds rate exhibit heteroscedasticity.
Input Arguments
Y — Response data
numeric matrix | numeric array
Response data, specified as a numobs-by-numseries numeric matrix or a numobs-by-numseries-by-numpaths numeric array.
numobs is the sample size. numseries is the number of response series (Mdl.NumSeries). numpaths is the number of response paths.
Rows correspond to observations, and the last row contains the latest observation. Y represents the continuation of the presample response series in Y0.
Columns must correspond to the response variable names in Mdl.SeriesNames.
Pages correspond to separate, independent numseries-dimensional paths. Among all pages, responses in a particular row occur at the same time.
Data Types: double
Tbl1 — Time series data
table | timetable
Since R2022b
Time series data containing observed response variables yt and, optionally, predictor variables xt for a model with a regression component, specified as a table or timetable with numvars variables and numobs rows.
Each selected response variable is a numobs-by-numpaths numeric matrix, and each selected predictor variable is a numeric vector. Each row is an observation, and measurements in each row occur simultaneously. You can optionally specify numseries response variables by using the ResponseVariables name-value argument, and you can specify numpreds predictor variables by using the PredictorVariables name-value argument.
Paths (columns) within a particular response variable are independent, but path j of each variable corresponds to path j of every other variable, for j = 1,…,numpaths.
If Tbl1 is a timetable, it must represent a sample with a regular datetime time step (see isregular), and the datetime vector Tbl1.Time must be ascending or descending.
If Tbl1 is a table, the last row contains the latest observation.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: infer(Mdl,Y,Y0=PS,X=Exo) computes the residuals of the VEC(p – 1) model Mdl at the matrix of response data Y, and specifies the matrix of presample response data PS and the matrix of exogenous predictor data Exo.
ResponseVariables — Variables to select from Tbl1 to treat as response variables yt
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2022b
Variables to select from Tbl1 to treat as response variables yt, specified as one of the following data types:
String vector or cell vector of character vectors containing numseries variable names in Tbl1.Properties.VariableNames
A length numseries vector of unique indices (integers) of variables to select from Tbl1.Properties.VariableNames
A length numvars logical vector, where ResponseVariables(j) = true selects variable j from Tbl1.Properties.VariableNames, and sum(ResponseVariables) is numseries
The selected variables must be numeric vectors (single path) or matrices (columns represent multiple independent paths) of the same width, and cannot contain missing values (NaN).
If the number of variables in Tbl1 matches Mdl.NumSeries, the default specifies all variables in Tbl1. If the number of variables in Tbl1 exceeds Mdl.NumSeries, the default matches variables in Tbl1 to names in Mdl.SeriesNames.
Example: ResponseVariables=["GDP" "CPI"]
Example: ResponseVariables=[true false true false] or ResponseVariables=[1 3] selects the first and third table variables as the response variables.
Data Types: double | logical | char | cell | string
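The three selection forms describe the same choice in different ways. The following sketch is not toolbox code: it resolves each form against a hypothetical list of variable names, varNames, that stands in for Tbl1.Properties.VariableNames, and confirms that all three pick the same variables.

```matlab
% Hypothetical variable names standing in for Tbl1.Properties.VariableNames.
varNames = {'GDP','CPI','M1SL','UNRATE'};

% Three ways to describe the same selection of response variables.
byName  = {'GDP','M1SL'};              % variable names
byIndex = [1 3];                       % unique integer indices
byMask  = [true false true false];     % logical vector of length numvars

% Resolve each form to integer indices into varNames.
[~,idxFromName] = ismember(byName,varNames);
idxFromIndex    = byIndex;
idxFromMask     = find(byMask);

% All three forms pick the first and third variables.
sameSelection = isequal(idxFromName,idxFromIndex,idxFromMask)

% For the logical form, the number of true entries is numseries.
numseries = sum(byMask)
```

The same reasoning applies to the PresampleResponseVariables and PredictorVariables arguments, which accept the same forms.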
Y0 — Presample responses
numeric matrix | numeric array
Presample responses that provide initial values for the model Mdl, specified as a numpreobs-by-numseries numeric matrix or a numpreobs-by-numseries-by-numprepaths numeric array. Use Y0 only when you supply a numeric array of response data Y.
numpreobs is the number of presample observations. numprepaths is the number of presample response paths.
Each row is a presample observation, and measurements in each row, among all pages, occur simultaneously. The last row contains the latest presample observation. Y0 must have at least Mdl.P rows. If you supply more rows than necessary, infer uses the latest Mdl.P observations only.
Each column corresponds to the response series associated with the respective response series in Y.
Pages correspond to separate, independent paths.
If Y0 is a matrix, infer applies it to each path (page) in Y. Therefore, all paths in Y derive from common initial conditions.
Otherwise, infer applies Y0(:,:,j) to Y(:,:,j). Y0 must have at least numpaths pages, and infer uses only the first numpaths pages.
By default, infer uses the first Mdl.P observations, for example, Y(1:Mdl.P,:), as a presample. This action reduces the effective sample size.
Data Types: double
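The effect of the default presample on the sample size can be sketched with plain array indexing. The sizes below are illustrative (they mirror the 240 observations and P = 2 of the examples on this page), the data are simulated, and the variable names are hypothetical. The sketch does not call infer.

```matlab
% Sketch of the default presample split, using simulated data.
P = 2;                          % stands in for Mdl.P of a VEC(1) model
numobs = 240;
numseries = 7;
Y = randn(numobs,numseries);

% Default behavior described above: the first P rows act as the presample.
Y0default  = Y(1:P,:);
Yeffective = Y((P+1):end,:);

% The effective sample has numobs - P rows.
effectiveRows = size(Yeffective,1)
```

When you instead pass a separate presample through Y0, no rows of Y are consumed as initial values.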
Presample — Presample data
table | timetable
Since R2022b
Presample data that provides initial values for the model Mdl, specified as a table or timetable, the same type as Tbl1, with numprevars variables and numpreobs rows.
Each row is a presample observation, and measurements in each row, among all paths, occur simultaneously. numpreobs must be at least Mdl.P. If you supply more rows than necessary, infer uses the latest Mdl.P observations only.
Each variable is a numpreobs-by-numprepaths numeric matrix. Variables correspond to the response series associated with the respective response variable in Tbl1. To control presample variable selection, see the optional PresampleResponseVariables name-value argument.
For each variable, columns are separate, independent paths.
If variables are vectors, infer applies them to each path in Tbl1 to produce the corresponding residuals in Tbl2. Therefore, all response paths derive from common initial conditions.
Otherwise, for each variable ResponseK and each path j, infer applies Presample.ResponseK(:,j) to produce Tbl2.ResponseK(:,j). Variables must have at least numpaths columns, and infer uses only the first numpaths columns.
If Presample is a timetable, all the following conditions must be true:
Presample must represent a sample with a regular datetime time step (see isregular).
The inputs Tbl1 and Presample must be consistent in time such that Presample immediately precedes Tbl1 with respect to the sampling frequency and order.
The datetime vector of sample timestamps Presample.Time must be ascending or descending.
If Presample is a table, the last row contains the latest presample observation.
By default, infer uses the first or earliest Mdl.P observations in Tbl1 as a presample, and then it evaluates the model at the remaining numobs – Mdl.P observations. This action reduces the effective sample size.
PresampleResponseVariables — Variables to select from Presample to use for presample response data
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2022b
Variables to select from Presample to use for presample data, specified as one of the following data types:
String vector or cell vector of character vectors containing numseries variable names in Presample.Properties.VariableNames
A length numseries vector of unique indices (integers) of variables to select from Presample.Properties.VariableNames
A length numvars logical vector, where PresampleResponseVariables(j) = true selects variable j from Presample.Properties.VariableNames, and sum(PresampleResponseVariables) is numseries
The selected variables must be numeric vectors (single path) or matrices (columns represent multiple independent paths) of the same width, and cannot contain missing values (NaN).
PresampleResponseVariables does not need to contain the same names as in Tbl1; infer uses the data in selected variable PresampleResponseVariables(j) as a presample for the response variable corresponding to ResponseVariables(j).
The default specifies the same response variables as those selected from Tbl1 (see ResponseVariables).
Example: PresampleResponseVariables=["GDP" "CPI"]
Example: PresampleResponseVariables=[true false true false] or PresampleResponseVariables=[1 3] selects the first and third table variables for presample data.
Data Types: double | logical | char | cell | string
X — Predictor data xt
numeric matrix
Predictor data xt for the regression component in the model, specified as a numeric matrix containing numpreds columns. Use X only when you supply a numeric array of response data Y.
numpreds is the number of predictor variables (size(Mdl.Beta,2)).
Each row corresponds to an observation, and measurements in each row occur simultaneously. The last row contains the latest observation. X must have at least as many observations as Y. If you supply more rows than necessary, infer uses only the latest observations. infer does not use the regression component in the presample period.
If you specify a numeric array for a presample by using Y0, X must have at least numobs rows (see Y).
Otherwise, X must have at least numobs – Mdl.P observations to account for the default presample removal from Y.
Each column is an individual predictor variable. All predictor variables are present in the regression component of each response equation.
infer applies X to each path (page) in Y; that is, X represents one path of observed predictors.
By default, infer excludes the regression component, regardless of its presence in Mdl.
Data Types: double
PredictorVariables — Variables to select from Tbl1 to treat as exogenous predictor variables xt
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2022b
Variables to select from Tbl1 to treat as exogenous predictor variables xt, specified as one of the following data types:
String vector or cell vector of character vectors containing numpreds variable names in Tbl1.Properties.VariableNames
A length numpreds vector of unique indices (integers) of variables to select from Tbl1.Properties.VariableNames
A length numvars logical vector, where PredictorVariables(j) = true selects variable j from Tbl1.Properties.VariableNames, and sum(PredictorVariables) is numpreds
The selected variables must be numeric vectors and cannot contain missing values (NaN).
By default, infer excludes the regression component, regardless of its presence in Mdl.
Example: PredictorVariables=["M1SL" "TB3MS" "UNRATE"]
Example: PredictorVariables=[true false true false] or PredictorVariables=[1 3] selects the first and third table variables to supply the predictor data.
Data Types: double | logical | char | cell | string
Note
NaN values in Y, Y0, and X indicate missing values. infer removes missing values from the data by list-wise deletion.
1. If Y is a 3-D array, then infer horizontally concatenates the pages of Y to form a numobs-by-numpaths*numseries matrix.
2. If a regression component is present, then infer horizontally concatenates X to Y to form a numobs-by-(numpaths*numseries + numpreds) matrix. infer assumes that the last rows of each series occur at the same time.
3. infer removes any row that contains at least one NaN from the concatenated data.
4. infer applies steps 1 and 3 to the presample paths in Y0.
This process ensures that the inferred output innovations of each path are the same size and are based on the same observation times. In the case of missing observations, the results obtained from multiple paths of Y can differ from the results obtained from each path individually.
This type of data reduction reduces the effective sample size.
infer issues an error when any table or timetable input contains missing values.
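The list-wise deletion described in this note can be sketched with base MATLAB array operations. This is an illustration of the idea under small, made-up sizes, not the toolbox implementation. The names W, keep, Yclean, and Xclean are hypothetical.

```matlab
% Sketch of list-wise deletion across paths and predictors (not toolbox code).
numobs = 6; numseries = 2; numpaths = 3; numpreds = 1;
Y = reshape(1:(numobs*numseries*numpaths),numobs,numseries,numpaths);
X = (1:numobs)';
Y(2,1,3) = NaN;      % missing response in row 2 of path 3
X(5) = NaN;          % missing predictor in row 5

% Concatenate the pages of Y horizontally, then append X.
W = [reshape(Y,numobs,numseries*numpaths) X];

% Drop every row that contains at least one NaN.
keep = ~any(isnan(W),2);
Yclean = Y(keep,:,:);
Xclean = X(keep,:);

rowsKept = find(keep)'
```

A single missing value in one path removes that row from every path, which is why results from multiple paths can differ from results obtained path by path.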
Output Arguments
E — Inferred multivariate innovations series
numeric matrix | numeric array
Inferred multivariate innovations series, returned as either a numeric matrix, or as a numeric array that contains columns and pages corresponding to Y. infer returns E only when you supply a matrix of response data Y.
If you specify Y0, then E has numobs rows (see Y).
Otherwise, E has numobs – Mdl.P rows to account for the presample removal.
Tbl2 — Inferred multivariate innovations series
table | timetable
Since R2022b
Inferred multivariate innovations series and other variables, returned as a table or timetable, the same data type as Tbl1. infer returns Tbl2 only when you supply the input Tbl1.
Tbl2 contains the inferred innovation paths E from evaluating the model Mdl at the paths of selected response variables Y, and it contains all variables in Tbl1. infer names the innovation variable corresponding to variable ResponseJ in Tbl1 ResponseJ_Residuals. For example, if one of the selected response variables for estimation in Tbl1 is GDP, Tbl2 contains a variable for the residuals in the response equation of GDP with the name GDP_Residuals.
If you specify presample response data, Tbl2 and Tbl1 have the same number of rows, and their rows correspond. Otherwise, because infer removes initial observations from Tbl1 for the required presample by default, Tbl2 has numobs – Mdl.P rows to account for that removal.
If Tbl1 is a timetable, Tbl1 and Tbl2 have the same row order, either ascending or descending.
logL — Loglikelihood objective function value
numeric scalar | numeric vector
Loglikelihood objective function value, returned as a numeric scalar or a numpaths-element numeric vector. logL(j) corresponds to the response path in Y(:,:,j) or the path (column) j of the selected response variables of Tbl1.
Algorithms
Suppose Y, Y0, and X are the response, presample response, and predictor data specified by the numeric data inputs in Y, Y0, and X, or the selected variables from the input tables or timetables Tbl1 and Presample.
infer infers innovations by evaluating the VEC model Mdl at the supplied data.
infer uses this process to determine the time origin t0 of models that include linear time trends.
If you do not specify Y0, then t0 = 0.
Otherwise, infer sets t0 to size(Y0,1) – Mdl.P. Therefore, the times in the trend component are t = t0 + 1, t0 + 2,..., t0 + numobs, where numobs is the effective sample size (size(Y,1) after infer removes missing values). This convention is consistent with the default behavior of model estimation in which estimate removes the first Mdl.P responses, reducing the effective sample size. Although infer explicitly uses the first Mdl.P presample responses in Y0 to initialize the model, the total number of observations in Y0 and Y (excluding missing values) determines t0.
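The time-origin rule can be worked through with small, made-up sizes. The sketch below only reproduces the arithmetic stated above. It does not call infer, and the sizes and variable names are hypothetical.

```matlab
% Sketch of the time-origin rule described above (hypothetical sizes).
P = 2;                 % stands in for Mdl.P
numobs = 10;           % effective sample size after removing missing values

% Case 1: no presample is specified, so t0 = 0.
t0NoPresample = 0;
tNoPresample = t0NoPresample + (1:numobs);

% Case 2: a presample Y0 with 5 rows is specified, so t0 = size(Y0,1) - P.
Y0 = zeros(5,3);
t0WithPresample = size(Y0,1) - P;
tWithPresample = t0WithPresample + (1:numobs);

% First and last trend times in the second case.
firstAndLast = [tWithPresample(1) tWithPresample(end)]
```

With exactly P presample rows, the second case reduces to t0 = 0, so the trend times match those of the default behavior.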
References
[1] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.
[2] Johansen, S. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press, 1995.
[3] Juselius, K. The Cointegrated VAR Model. Oxford: Oxford University Press, 2006.
[4] Lütkepohl, H. New Introduction to Multiple Time Series Analysis. Berlin: Springer, 2005.
Version History
Introduced in R2017b
R2022b: infer accepts input data in tables and timetables, and returns results in tables and timetables
In addition to accepting input data in numeric arrays, infer accepts input data in tables and timetables. infer chooses default series on which to operate, but you can use the following name-value arguments to select variables.
ResponseVariables specifies the response series names in the input data from which residuals are inferred.
PredictorVariables specifies the predictor series names in the input data for a model regression component.
Presample specifies the input table or timetable of presample response data.
PresampleResponseVariables specifies the response series names from Presample.