bj

Estimate Box-Jenkins polynomial model using time-domain data

Syntax

sys = bj(tt,[nb
nc nd nf nk])

sys = bj(u,y,[nb nc nd nf nk])

sys = bj(data,[nb nc nd nf nk])

sys = bj(___, Name,Value)

sys = bj(tt, init_sys)

sys = bj(u,y,init_sys)

sys = bj(___, opt)

[sys,ic] = bj(___)

Description

Box-Jenkins (BJ) models are a special configuration of polynomial models that provide completely independent parameterization for the dynamics and noise using rational polynomial functions. BJ models, which are always discrete-time models, can be estimated only from time-domain data. Use BJ models when the noise is primarily a measurement disturbance rather than an input disturbance. The BJ structure provides additional flexibility for modeling the noise.

Estimate Box-Jenkins Model

sys = bj(tt,[nb nc nd nf nk]) estimates a Box-Jenkins polynomial model sys using the data contained in the variables of timetable tt. The software uses the first Nu variables as inputs and the next Ny variables as outputs, where Nu and Ny are determined from the dimensions of the specified polynomial orders.

sys is represented by the equation

$y (t) = \sum_{i = 1}^{n u} \frac{B_{i} (q)}{F_{i} (q)} u_{i} (t - n k_{i}) + \frac{C (q)}{D (q)} e (t)$

Here, y(t) is the output, u(t) is the input, and e(t) is the error.

The components of [nb nc nd nf nk] define the orders of the polynomials used for estimation. For more information about the Box-Jenkins model structure, see Box-Jenkins Model Structure.

To select specific input and output channels from tt, use name-value syntax to set 'InputName' and 'OutputName' to the corresponding timetable variable names.

sys = bj(u,y,[nb nc nd nf nk]) uses the time-domain input and output signals in the comma-separated matrices u,y. The software assumes that the data sample time is 1 second. To change the sample time, set Ts using name-value syntax.

sys = bj(data,[nb nc nd nf nk]) uses the time-domain data in the iddata object data. Use this syntax especially when you want to take advantage of the additional information, such as data sample time or experiment labeling, that data objects provide.

example

sys = bj(___, Name,Value) specifies model structure attributes using additional options specified by one or more name-value arguments. You can use this syntax with any of the previous input-argument combinations.

Configure Initial Parameters

sys = bj(tt, init_sys) uses the polynomial model init_sys to configure the initial parameterization of sys for estimation using the timetable tt.

sys = bj(u,y,init_sys) uses the matrix data u,y for estimation.

sys = bj(u,y,init_sys) uses the data object data, for estimation.

Specify Additional Estimation Options

sys = bj(___, opt) incorporates an option set opt that specifies options such as handling of initial conditions, regularization, and numerical search method to use for estimation.

example

Return Estimated Initial Conditions

[sys,ic] = bj(___) returns the estimated initial conditions as an initialCondition object. Use this syntax if you plan to simulate or predict the model response using the same estimation input data and then compare the response with the same estimation output data. Incorporating the initial conditions yields a better match during the first part of the simulation.

Examples

collapse all

Identify SISO Box-Jenkins Model

Open Live Script

Estimate the parameters of a single-input, single-output Box-Jenkins model from measured data.

load sdata1 tt1;
nb = 2;
nc = 2;
nd = 2;
nf = 2;
nk = 1;
sys = bj(tt1,[nb nc nd nf nk]);

sys is a discrete-time idpoly model with estimated coefficients. The order of sys is as described by nb, nc, nd, nf, and nk.

Compare the model output to the estimation data.

compare(sys,tt1)

Figure contains an axes object. The axes object with ylabel y contains 2 objects of type line. These objects represent Validation data (y), sys: 70.75%.

Use getpvec to obtain the estimated parameters and getcov to obtain the covariance associated with the estimated parameters.

Estimate Multi-Input, Single-Output Box-Jenkins Model

Open Live Script

Estimate the parameters of a multi-input, single-output Box-Jenkins model from measured data.

Load the data tt8, which is in the form of a timetable containing three input channels and one output channel. Separate tt8 into estimation and validation portions.

load sdata8.mat tt8
tt8e = tt8(1:250,:);
tt8v = tt8(251:end,:);

Specify the orders and delays. Then, estimate the model sys.

nb = [2 1 1];
nc = 1;
nd = 1;
nf = [2 1 2] + 1;
nk = [1 0 2];
sys=bj(tt8e,[nb nc nd nf nk]);

Compare the output of sys with the validation data.

compare(tt8v,sys)

Figure contains an axes object. The axes object with ylabel y contains 2 objects of type line. These objects represent Validation data (y), sys: 81.23%.

Estimate Box-Jenkins Model Using Regularization

Open Live Script

Estimate a regularized BJ model by converting a regularized ARX model.

Load data.

load regularizationExampleData.mat m0simdata;

Estimate an unregularized BJ model of order 30.

m1 = bj(m0simdata(1:150),[15 15 15 15 1]);

Estimate a regularized BJ model by determining Lambda value by trial and error.

opt = bjOptions;
opt.Regularization.Lambda = 1;
m2 = bj(m0simdata(1:150),[15 15 15 15 1],opt);

Obtain a lower-order BJ model by converting a regularized ARX model followed by order reduction.

opt1 = arxOptions;
[L,R] = arxRegul(m0simdata(1:150),[30 30 1]);
opt1.Regularization.Lambda = L;
opt1.Regularization.R = R;
m0 = arx(m0simdata(1:150),[30 30 1],opt1);
mr = idpoly(balred(idss(m0),7));

Compare the model outputs against data.

opt2 = compareOptions('InitialCondition','z');
compare(m0simdata(150:end),m1,m2,mr,opt2);

Figure contains an axes object. The axes object with ylabel y1 contains 4 objects of type line. These objects represent Validation data (y1), m1: 52.83%, m2: 57.59%, mr: 64.91%.

Configure Estimation Options

Open Live Script

Estimate the parameters of a single-input, single-output Box-Jenkins model while configuring some estimation options.

Generate estimation data.

B = [0 1 0.5];
C = [1 -1 0.2];
D = [1 1.5 0.7];
F = [1 -1.5 0.7];
sys0 = idpoly(1,B,C,D,F,0.1);
e = randn(200,1);
u = idinput(200); 
y = sim(sys0,[u e]);

data is a single-input, single-output data set created by simulating a known model.

Estimate initial Box-Jenkins model.

nb = 2;
nc = 2;
nd = 2;
nf = 2;
nk = 1;
init_sys = bj(u,y,[2 2 2 2 1]);

Create an estimation option set to refine the parameters of the estimated model.

opt = bjOptions;
opt.Display = 'on';
opt.SearchOptions.MaxIterations = 50;

opt is an estimation option set that configures the estimation to iterate 50 times at most and display the estimation progress.

Re-estimate the model parameters using the estimation option set.

sys = bj(u,y,init_sys,opt);

sys is estimated using init_sys for the initial parameterization for the polynomial coefficients.

To view the estimation result, enter sys.Report.

Estimate MIMO Box-Jenkins Model

Open Live Script

Estimate a multi-input, multi-output Box-Jenkins model from estimated data.

Load measured data.

load iddata1 z1
load iddata2 z2
data = [z1 z2(1:300)];

data contains the measured data for two inputs and two outputs.

Estimate the model.

 nb = [2 2; 3 4];  
 nc = [2;2]; 
 nd = [2;2]; 
 nf = [1 0; 2 2];
 nk = [1 1; 0 0];
 sys = bj(data,[nb nc nd nf nk]);

The polynomial order coefficients contain one row for each output.

sys is a discrete-time idpoly model with two inputs and two outputs.

Obtain Initial Conditions

Open Live Script

Load the data.

load iddata1ic z1i

Estimate a second-order Box-Jenkins model sys and return the initial conditions in ic.

nb = 2;
nc = 2;
nd = 2;
nf = 2;
nk = 1;
[sys,ic] = bj(z1i,[nb nc nd nf nk]);
ic

ic = 
  initialCondition with properties:

     A: [4×4 double]
    X0: [4×1 double]
     C: [0.8744 0.5426 0.4647 -0.5285]
    Ts: 0.1000

ic is an initialCondition object that encapsulates the free response of sys, in state-space form, to the initial state vector in X0. You can incorporate ic when you simulate sys with the z1i input signal and compare the response with the z1i output signal.

Input Arguments

collapse all

`tt` — Timetable-based estimation data
timetable | cell array of timetables.

Estimation data, specified as a timetable that uses a regularly spaced time vector. tt contains variables representing input and output channels. For multiexperiment data, tt is a cell array of timetables of length Ne, where Ne is the number of experiments.

The software determines the number of input and output channels to use for estimation from the dimensions of the specified polynomial orders. The input/output channel selection depends on whether the 'InputName' and 'OutputName' name-value arguments are specified.

If 'InputName' and 'OutputName' are not specified, then the software uses the first Nu variables of tt as inputs and the next Ny variables of tt as outputs.
If 'InputName' and 'OutputName' are specified, then the software uses the specified variables. The number of specified input and output names must be consistent with Nu and Ny.
For functions that can estimate a time series model, where there are no inputs, 'InputName' does not need to be specified.

For more information about working with estimation data types, see Data Domains and Data Types in System Identification Toolbox.

`u`, `y` — Matrix-based estimation data
matrices | cell array of matrices

Estimation data, specified for SISO systems as a comma-separated pair of N_s-by-1 real-valued matrices that contain uniformly sampled input and output time-domain signal values. Here, N_s is the number of samples.

For MIMO systems, specify u,y as an input/output matrix pair with the following dimensions:

u — N_s-by-N_u, where N_u is the number of inputs.
y — N_s-by-N_y, where N_y is the number of outputs.

For multiexperiment data, specify u,y as a pair of 1-by-N_e cell arrays, where N_e is the number of experiments. The sample times of all the experiments must match.

For time series data, which contains only outputs and no inputs, specify [],y.

For more information about working with estimation data types, see Data Domains and Data Types in System Identification Toolbox.

`data` — Estimation data
`iddata` object

Estimation data, specified as an iddata object that contains time-domain input and output signal values.

You cannot use frequency-domain data for estimating bj models.

`[nb nc nd nf nk]` — Model orders and delays
vector of matrices

Vector of matrices with nonnegative integers that contain the orders and delays of the Box-Jenkins model, as the following list describes.

nb — Order of the B polynomial + 1, specified as an Ny-by-Nu matrix, where Ny is the number of outputs and Nu is the number of inputs.
nc — Order of the C polynomial + 1, specified as an Ny-by-1 matrix.
nd — Order of the D polynomial + 1, specified as an Ny-by-1 matrix.
nf — Order of the F polynomial + 1, specified as an Ny-by-Nu matrix.
nk — Input delay in units of samples, specified as an Nu-by-Ny matrix.

`init_sys` — Linear system
`idpoly` model | linear model | structure

Polynomial model that configures the initial parameterization of sys, specified as an idpoly model with the Box-Jenkins structure that has only B, C, D and F polynomials active.

bj uses the parameters and constraints defined in init_sys as the initial guess for estimating sys.

Use the Structure property of init_sys to configure initial guesses and constraints for B(q), F(q), C(q) and D(q).

To specify an initial guess for, say, the C(q) term of init_sys, set init_sys.Structure.C.Value to the guess value.

To specify constraints for, say, the B(q) term of init_sys:

set init_sys.Structure.B.Minimum to the minimum B(q) coefficient values
set init_sys.Structure.B.Maximum to the maximum B(q) coefficient values
set init_sys.Structure.B.Free to indicate which B(q) coefficients are free for estimation

You can similarly specify the initial guess and constraints for the other polynomials.

`opt` — Estimation options
`bjOptions` option set

Estimation options, specified as an bjOptions option set. Options specified by opt include:

Estimation objective
Handling of initial conditions
Numerical search method and the associated options

Name-Value Arguments

collapse all

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'InputDelay',1

`InputName` — Input channel names
string | character vector | string array | cell array of character vectors

Input channel names, specified as a string, character vector, string array, or cell array of character vectors.

If you are using a timetable for the data source, the names in InputName must be a subset of the timetable variables.

Example: sys = bj(tt,__,'InputName',["u1" "u2"]) selects the variables u1 and u2 as the input channels from the timetable tt to use for the estimation.

`OutputName` — Output channel names
string | character vector | string array | cell array of character vectors

Output channel names, specified as a string, character vector, string array, or cell array of character vectors.

If you are using a timetable for the data source, the names in OutputName must be a subset of the timetable variables.

Example: sys = bj(tt,__,'OutputName',["y1" "y3"]) selects the variables y1 and y3 as the output channels from the timetable tt to use for the estimation.

`Ts` — Sample time
`1` (default) | positive scalar

Sample time, specified as the comma-separated pair consisting of 'Ts' and the sample time in the units specified by TimeUnit. When you use matrix-based data (u,y), you must specify Ts if you require a sample time other than the assumed sample time of 1 second.

To obtain the data sample time for a timetable tt, use the timetable property tt.Properties.Timestep.

Example: bj(umat1,ymat1,___,'Ts',0.08) computes a model with sample time of 0.08 seconds.

`InputDelay` — Input delays
0 (default) | positive integer vector | integer scalar

Input delays for each input channel, specified as the comma-separated pair consisting of 'InputDelay' and a numeric vector.

For continuous-time models, specify 'InputDelay' in the time units stored in the TimeUnit property.
For discrete-time models, specify 'InputDelay' in integer multiples of the sample time Ts. For example, setting 'InputDelay' to 3 specifies a delay of three sampling periods.

For a system with N_u inputs, set InputDelay to an N_u-by-1 vector. Each entry of this vector is a numerical value that represents the input delay for the corresponding input channel.

To apply the same delay to all channels, specify 'InputDelay' as a scalar.

`IODelay` — Transport delays
0 (default) | scalar | numeric array

Transport delays for each input/output pair, specified as the comma-separated pair consisting of 'IODelay' and a numeric array.

For continuous-time models, specify 'IODelay' in the time units stored in the TimeUnit property.
For discrete-time models, specify 'IODelay' in integer multiples of the sample time Ts. For example, setting 'IODelay' to 4 specifies a transport delay of four sampling periods.

For a system with N_u inputs and N_y outputs, set 'IODelay' to an N_y-by-N_u matrix. Each entry is an integer value representing the transport delay for the corresponding input/output pair.

To apply the same delay to all channels, specify 'IODelay' as a scalar.

You can specify 'IODelay' as an alternative to the nk value. Doing so simplifies the model structure by reducing the number of leading zeros in the B polynomial. In particular, you can represent max(nk-1,0) leading zeros as input/output delays using 'IODelay' instead.

`IntegrateNoise` — Use integrators in noise channel
`false(Ny,1)` (default) | logical scalar | logical vector

Flag to use integrators in the noise channels, specified as false(Ny,1), a logical scalar, or a logical vector of length Ny, where Ny is the number of outputs.

Setting IntegrateNoise to true for a particular output results in the model:

$y (t) = \frac{B (q)}{F (q)} u (t - n k) + \frac{C (q)}{D (q)} \frac{e (t)}{1 - q^{- 1}}$

Here, $\frac{1}{1 - q^{- 1}}$ is the integrator in the noise channel,e(t).

Output Arguments

collapse all

`sys` — Box-Jenkins polynomial model
`idpoly` object

Box-Jenkins polynomial model that fits the estimation data, returned as a discrete-time idpoly object. This model is created using the specified model orders, delays, and estimation options.

Information about the estimation results and options used is stored in the Report property of the model. Report has the following fields:

Report Field	Description
`Status`	Summary of the model status, which indicates whether the model was created by construction or obtained by estimation
`Method`	Estimation command used
`InitialCondition`	Handling of initial conditions during model estimation, returned as one of the following values: `'zero'` — The initial conditions were set to zero. `'estimate'` — The initial conditions were treated as independent estimation parameters. `'backcast'` — The initial conditions were estimated using the best least squares fit. This field is especially useful to view how the initial conditions were handled when the `InitialCondition` option in the estimation option set is `'auto'`.
`Fit`	Quantitative assessment of the estimation, returned as a structure. See Loss Function and Model Quality Metrics for more information on these quality metrics. The structure has these fields. `FitPercent` — Normalized root mean squared error (NRMSE) measure of how well the response of the model fits the estimation data, expressed as the percentage fitpercent = 100(1-NRMSE) `LossFcn` — Value of the loss function when the estimation completes `MSE` — Mean squared error (MSE) measure of how well the response of the model fits the estimation data `FPE` — Final prediction error for the model `AIC` — Raw Akaike Information Criteria (AIC) measure of model quality `AICc` — Small-sample-size corrected AIC `nAIC` — Normalized AIC `BIC` — Bayesian Information Criteria (BIC)
`Parameters`	Estimated values of model parameters
`OptionsUsed`	Option set used for estimation. If no custom options were configured, this is a set of default options. See `bjOptions` for more information.
`RandState`	State of the random number stream at the start of estimation. Empty, `[]`, if randomization was not used during estimation. For more information, see `rng`.
`DataUsed`	Attributes of the data used for estimation, returned as a structure with the following fields. `Name` — Name of the data set `Type` — Data type `Length` — Number of data samples `Ts` — Sample time `InterSample` — Input intersample behavior, returned as one of the following values: `'zoh'` — A zero-order hold maintains a piecewise-constant input signal between samples. `'foh'` — A first-order hold maintains a piecewise-linear input signal between samples. `'bl'` — Band-limited behavior specifies that the continuous-time input signal has zero power above the Nyquist frequency. `InputOffset` — Offset removed from time-domain input data during estimation. For nonlinear models, it is `[]`. `OutputOffset` — Offset removed from time-domain output data during estimation. For nonlinear models, it is `[]`.
`Termination`	Termination conditions for the iterative search used for prediction error minimization, returned as a structure with these fields. `WhyStop` — Reason for terminating the numerical search `Iterations` — Number of search iterations performed by the estimation algorithm `FirstOrderOptimality` — $\infty$ -norm of the gradient search vector when the search algorithm terminates `FcnCount` — Number of times the objective function was called `UpdateNorm` — Norm of the gradient search vector in the last iteration. Omitted when the search method is `'lsqnonlin'` or `'fmincon'`. `LastImprovement` — Criterion improvement in the last iteration, expressed as a percentage. Omitted when the search method is `'lsqnonlin'` or `'fmincon'`. `Algorithm` — Algorithm used by `'lsqnonlin'` or `'fmincon'` search method. Omitted when other search methods are used. For estimation methods that do not require numerical search optimization, the `Termination` field is omitted.

For more information on using Report, see Estimation Report.

`ic` — Initial conditions
`initialCondition` object | object array of `initialCondition` values

Estimated initial conditions, returned as an initialCondition object or an object array of initialCondition values.

For a single-experiment data set, ic represents, in state-space form, the free response of the transfer function model (A and C matrices) to the estimated initial states (x₀).
For a multiple-experiment data set with N_e experiments, ic is an object array of length N_e that contains one set of initialCondition values for each experiment.

If bj returns ic values of 0 and the you know that you have non-zero initial conditions, set the 'InitialCondition' option in bjOptions to 'estimate' and pass the updated option set to bj. For example:

opt = bjOptions('InitialCondition','estimate')
[sys,ic] = bj(data,[nb nc nd nf nk],opt)

The default 'auto' setting of 'InitialCondition' uses the 'zero' method when the initial conditions have a negligible effect on the overall estimation-error minimization process. Specifying 'estimate' ensures that the software estimates values for ic.

For more information, see initialCondition. For an example of using this argument, see Obtain Initial Conditions.

More About

collapse all

Box-Jenkins Model Structure

The general Box-Jenkins model structure is:

$y (t) = \sum_{i = 1}^{n u} \frac{B_{i} (q)}{F_{i} (q)} u_{i} (t - n k_{i}) + \frac{C (q)}{D (q)} e (t)$

where nu is the number of input channels.

The orders of Box-Jenkins model are defined as follows:

$\begin{matrix} n b : B (q) = b_{1} + b_{2} q^{- 1} + ... + b_{n b} q^{- n b + 1} \\ n c : C (q) = 1 + c_{1} q^{- 1} + ... + c_{n c} q^{- n c} \\ n d : D (q) = 1 + d_{1} q^{- 1} + ... + d_{n d} q^{- n d} \\ n f : F (q) = 1 + f_{1} q^{- 1} + ... + f_{n f} q^{- n f} \end{matrix}$

Alternatives

To estimate a continuous-time model, use:

tfest — returns a transfer function model
ssest — returns a state-space model
bj to first estimate a discrete-time model and convert it a continuous-time model using d2c.

References

[1] Ljung, L. System Identification: Theory for the User, Upper Saddle River, NJ, Prentice-Hall PTR, 1999.

Version History

Introduced before R2006a

expand all

R2022b: Time-domain estimation data is accepted in the form of timetables and matrices

Most estimation, validation, analysis, and utility functions now accept time-domain input/output data in the form of a single timetable that contains both input and output data or a pair of matrices that contain the input and output data separately. These functions continue to accept iddata objects as a data source as well, for both time-domain and frequency-domain data.

R2020b: Estimate and apply initial conditions

Initial-condition estimation was added for all estimated linear models, including transfer function and polynomial models. This capability eliminates the need to first convert models into state-space models to obtain and account for initial conditions when simulating the model. For more information, see the [sys,ic] = bj(__) syntax and the ic output argument descriptions.

R2018a: Advanced Options are deprecated for `SearchOptions` when `SearchMethod` is `'lsqnonlin'`

Specification of lsqnonlin- related advanced options are deprecated, including the option to invoke parallel processing when estimating using the lsqnonlin search method, or solver, in Optimization Toolbox™.

bj

Syntax

Description

Estimate Box-Jenkins Model

Configure Initial Parameters

Specify Additional Estimation Options

Return Estimated Initial Conditions

Examples

Identify SISO Box-Jenkins Model

Estimate Multi-Input, Single-Output Box-Jenkins Model

Estimate Box-Jenkins Model Using Regularization

Configure Estimation Options

Estimate MIMO Box-Jenkins Model

Obtain Initial Conditions

Input Arguments

tt — Timetable-based estimation data timetable | cell array of timetables.

u, y — Matrix-based estimation data matrices | cell array of matrices

data — Estimation data iddata object

[nb nc nd nf nk] — Model orders and delays vector of matrices

init_sys — Linear system idpoly model | linear model | structure

opt — Estimation options bjOptions option set

Name-Value Arguments

InputName — Input channel names string | character vector | string array | cell array of character vectors

OutputName — Output channel names string | character vector | string array | cell array of character vectors

Ts — Sample time 1 (default) | positive scalar

InputDelay — Input delays 0 (default) | positive integer vector | integer scalar

IODelay — Transport delays 0 (default) | scalar | numeric array

IntegrateNoise — Use integrators in noise channel false(Ny,1) (default) | logical scalar | logical vector

Output Arguments

sys — Box-Jenkins polynomial model idpoly object

ic — Initial conditions initialCondition object | object array of initialCondition values

More About

Box-Jenkins Model Structure

Alternatives

References

Version History

R2022b: Time-domain estimation data is accepted in the form of timetables and matrices

R2020b: Estimate and apply initial conditions

R2018a: Advanced Options are deprecated for SearchOptions when SearchMethod is 'lsqnonlin'

See Also

Topics

`tt` — Timetable-based estimation data
timetable | cell array of timetables.

`u`, `y` — Matrix-based estimation data
matrices | cell array of matrices

`data` — Estimation data
`iddata` object

`[nb nc nd nf nk]` — Model orders and delays
vector of matrices

`init_sys` — Linear system
`idpoly` model | linear model | structure

`opt` — Estimation options
`bjOptions` option set

`InputName` — Input channel names
string | character vector | string array | cell array of character vectors

`OutputName` — Output channel names
string | character vector | string array | cell array of character vectors

`Ts` — Sample time
`1` (default) | positive scalar

`InputDelay` — Input delays
0 (default) | positive integer vector | integer scalar

`IODelay` — Transport delays
0 (default) | scalar | numeric array

`IntegrateNoise` — Use integrators in noise channel
`false(Ny,1)` (default) | logical scalar | logical vector

`sys` — Box-Jenkins polynomial model
`idpoly` object

`ic` — Initial conditions
`initialCondition` object | object array of `initialCondition` values

R2018a: Advanced Options are deprecated for `SearchOptions` when `SearchMethod` is `'lsqnonlin'`