mlecov
Asymptotic covariance of maximum likelihood estimators
Syntax
Description
returns an approximation to the asymptotic covariance matrix of the maximum
likelihood estimators of the parameters for a distribution specified by the
custom probability density function acov
= mlecov(params
,data
,'pdf',pdf
)pdf
. The output
acov
is a p-by-p
matrix, where p is the number of parameters in
params
.
mlecov
computes a finite difference approximation to the
Hessian of the loglikelihood at the maximum likelihood estimates
params
, given the observed data
, and
returns the negative inverse of that Hessian.
specifies options using one or more name-value arguments in addition to any of
the input argument combinations in previous syntaxes. For example, you can
specify the censored data and frequency of observations.acov
= mlecov(___,Name,Value
)
Examples
Custom Probability Density Function (pdf)
Load the sample data.
load carbig
The vector Weight
contains the weights of 406 cars.
Define a custom function that returns the pdf of a lognormal distribution. Save the file in your current folder as lognormpdf.m
.
function newpdf = lognormpdf(data,mu,sigma)
newpdf = exp((-(log(data)-mu).^2)/(2*sigma^2))./(data*sigma*sqrt(2*pi));
Estimate the parameters mu
and sigma
of the custom distribution.
[phat,pci] = mle(Weight,'pdf',@lognormpdf,'Start',[4.5 0.3])
phat = 1×2
7.9600 0.2804
pci = 2×2
7.9327 0.2611
7.9872 0.2997
Compute the approximate covariance matrix of the parameter estimates.
acov = mlecov(phat,Weight,'pdf',@lognormpdf)
acov = 2×2
10-3 ×
0.1937 -0.0000
-0.0000 0.0968
Estimate the standard errors of the estimates.
se = sqrt(diag(acov))'
se = 1×2
0.0139 0.0098
The standard error of the estimates of mu and sigma are 0.0139 and 0.0098, respectively.
Recalculate the confidence intervals pci
from the standard error se
by using the Wald method (normal approximation).
alpha = 0.05; probs = [alpha/2; 1-alpha/2]; pci2 = norminv(repmat(probs,1,numel(phat)),[phat; phat],[se; se])
pci2 = 2×2
7.9327 0.2611
7.9872 0.2997
Custom Log Probability Density Function (pdf)
Define a custom function that returns the log pdf of a beta distribution. Save the file in your current folder as betalogpdf.m
.
function logpdf = betalogpdf(x,a,b)
logpdf = (a-1)*log(x)+(b-1)*log(1-x)-betaln(a,b);
Generate sample data from a beta distribution with parameters 1.23 and 3.45, and estimate the parameters using the simulated data.
rng('default') % For reproducibility x = betarnd(1.23,3.45,25,1); phat = mle(x,'Distribution','beta')
phat = 1×2
1.1213 2.7182
Compute the approximate covariance matrix of the parameter estimates.
acov = mlecov(phat,x,'logpdf',@betalogpdf)
acov = 2×2
0.0810 0.1646
0.1646 0.6074
Custom Negative Loglikelihood Function
Load the sample data.
load('readmissiontimes.mat')
The data includes ReadmissionTime
, which has readmission times for 100 patients. This data is simulated.
Define a custom negative loglikelihood function for a Poisson distribution with the parameter lambda
, where 1/lambda
is the mean of the distribution. You must define the function to accept a logical vector of censorship information and an integer vector of data frequencies, even if you do not use these values in the custom function.
custnloglf = @(lambda,data,cens,freq) ... - length(data)*log(lambda) + sum(lambda*data,'omitnan');
Estimate the parameter of the custom distribution and specify its initial parameter value (Start
name-value argument).
phat = mle(ReadmissionTime,'nloglf',custnloglf,'Start',0.05)
phat = 0.1462
Compute the variance of the parameter estimate.
acov = mlecov(phat,ReadmissionTime,'nloglf',custnloglf)
acov = 2.1374e-04
Compute the standard error.
sqrt(acov)
ans = 0.0146
Specify Right-Censored Data
Load the sample data.
load('readmissiontimes.mat');
The data includes ReadmissionTime
, which has readmission times for 100 patients. The column vector Censored
contains the censorship information for each patient, where 1 indicates a right-censored observation, and 0 indicates that the exact readmission time is observed. This data is simulated.
Define a custom log probability density function (pdf) and log survival function for a Weibull distribution with the scale parameter lambda
and the shape parameter k
. When the data contains censored observations, you must pass both the log pdf and log survival function to mle
and mlecov
.
custlogpdf = @(data,lambda,k) ...
log(k) - k*log(lambda) + (k-1)*log(data) - (data/lambda).^k;
custlogsf = @(data,lambda,k) - (data/lambda).^k;
Estimate the parameters of the custom distribution for the censored sample data. Specify the initial parameter values (Start
name-value argument) for the custom distribution.
phat = mle(ReadmissionTime,'logpdf',custlogpdf,'logsf',custlogsf, ... 'Start',[1,0.75],'Censoring',Censored)
phat = 1×2
9.2090 1.4223
The scale and shape parameters of the custom distribution are 9.2090 and 1.4223, respectively.
Compute the approximate covariance matrix of the parameter estimates.
acov = mlecov(phat,ReadmissionTime, ... 'logpdf',custlogpdf,'logsf',custlogsf,'Censoring',Censored)
acov = 2×2
0.5653 0.0102
0.0102 0.0163
Input Arguments
params
— Parameter estimates
vector
Parameter estimates, specified as a vector. These parameter estimates must be maximum
likelihood estimates. For example, you can specify parameter estimates
returned by mle
.
Data Types: single
| double
data
— Sample data and censorship information
vector | two-column matrix
Sample data and censorship information used to estimate the distribution
parameters params
, specified as a vector of sample data
or a two-column matrix of sample data and censorship information.
You can specify the censorship information for the sample data by using
either the data
argument or the
Censoring
name-value argument.
mlecov
ignores the
Censoring
argument value if
data
is a two-column matrix.
Specify data
as a vector or a two-column matrix
depending on the censorship types of the observations in
data
.
Fully observed data — Specify
data
as a vector of sample data.Data that contains fully observed, left-censored, or right-censored observations — Specify
data
as a vector of sample data, and specify theCensoring
name-value argument as a vector that contains the censorship information for each observation. TheCensoring
vector can contain 0, –1, and 1, which refer to fully observed, left-censored, and right-censored observations, respectively.Data that includes interval-censored observations — Specify
data
as a two-column matrix of sample data and censorship information. Each row ofdata
specifies the range of possible survival or failure times for each observation, and can have one of these values:[t,t]
— Fully observed att
[–Inf,t]
— Left-censored att
[t,Inf]
— Right-censored att
[t1,t2]
— Interval-censored between[t1,t2]
, wheret1
<t2
mlecov
ignores NaN
values in
data
. Additionally, any NaN
values in the censoring vector (Censoring
) or frequency
vector (Frequency
) cause
mlecov
to ignore the corresponding rows in
data
.
Data Types: single
| double
pdf
— Custom probability density function
function handle | cell array
Custom probability distribution function (pdf), specified as a function handle or a cell array containing a function handle and additional arguments to the function.
The custom function accepts a vector containing sample data, one or more individual distribution parameters, and any additional arguments passed by a cell array as input parameters. The function returns a vector of probability density values.
Example: @newpdf
Data Types: function_handle
| cell
logpdf
— Custom log probability density function
function handle | cell array
Custom log probability density function, specified as a function handle or a cell array containing a function handle and additional arguments to the function.
The custom function accepts a vector containing sample data, one or more individual distribution parameters, and any additional arguments passed by a cell array as input parameters. The function returns a vector of log probability values.
Example: @customlogpdf
Data Types: function_handle
| cell
nloglf
— Custom negative loglikelihood function
function handle | cell array
Custom negative loglikelihood function, specified as a function handle or a cell array containing a function handle and additional arguments to the function.
The custom function accepts the following input arguments, in the order listed in the table.
Input Argument of Custom Function | Description |
---|---|
params | Vector of distribution parameter values
params . |
data | Sample data. The data value is a
vector of sample data or a two-column matrix of sample data
and censorship information. |
cens | Logical vector of censorship information.
nloglf must accept
cens even if you do not use the
Censoring name-value argument. In
this case, you can write nloglf to ignore
cens . |
freq | Integer vector of data frequencies.
nloglf must accept
freq even if you do not use the
Frequency name-value argument. In
this case, you can write nloglf to ignore
freq . |
trunc | Two-element numeric vector of truncation bounds.
nloglf must accept
trunc if you use the
TruncationBounds name-value
argument. |
nloglf
can optionally accept the additional arguments
passed by a cell array as input parameters.
nloglf
returns a scalar negative loglikelihood value
and, optionally, a negative loglikelihood gradient vector (see the
GradObj
field in the Options
name-value argument).
Example: @negloglik
Data Types: function_handle
| cell
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: 'Censoring',cens,'Options',opt
instructs mlecov
to read the censored data information from the vector cens
and
perform according to the new options structure
opt
.
cdf
— Custom cumulative distribution function
function handle | cell array
Custom cumulative distribution function (cdf), specified as a function handle or a cell array containing a function handle and additional arguments to the function.
The custom function accepts a vector containing sample data, one or more individual distribution parameters, and any additional arguments passed by a cell array as input parameters. The function returns a vector of cdf values.
For censored or truncated observations, you must define both
cdf
and pdf
. For fully
observed and untruncated observations, mlecov
does not use cdf
. You can specify the censorship
information by using either data
or
Censoring
and specify the truncation bounds by
using TruncationBounds
.
Example: 'cdf',@newcdf
Data Types: function_handle
| cell
logsf
— Custom log survival function
function handle | cell array
Custom log survival function, specified as a function handle or a cell array containing a function handle and additional arguments to the function.
The custom function accepts a vector containing sample data, one or more individual distribution parameters, and any additional arguments passed by a cell array as input parameters. The function returns a vector of log survival probability values.
For censored or truncated observations, you must define both
logsf
and logpdf
. For
fully observed and untruncated observations,
mlecov
does not use
logsf
. You can specify the censorship
information by using either data
or
Censoring
and specify the truncation bounds by
using TruncationBounds
.
Example: 'logsf',@logsurvival
Data Types: function_handle
| cell
Censoring
— Indicator of censored data
vector of 0s (default) | vector consisting of 0, –1, and 1
Indicator of censored data, specified as a vector consisting of 0, –1,
and 1, which indicate fully observed, left-censored, and right-censored
observations, respectively. Each element of the
Censoring
value indicates the censorship status
of the corresponding observation in data
. The
Censoring
value must have the same size as
data
. The default is a vector of 0s, indicating
all observations are fully observed.
You cannot specify interval-censored observations using this argument.
If the sample data includes interval-censored observations, specify
data
using a two-column matrix.
mlecov
ignores the
Censoring
value if data
is
a two-column matrix.
For censored data, you must define the custom distribution by using
pdf
and cdf
,
logpdf
and logsf
, or
nloglf
.
mlecov
ignores any NaN
values in the censoring vector. Additionally, any NaN
values in data
or the frequency vector
(Frequency
) cause
mlecov
to ignore the corresponding values
in the censoring vector.
Example: 'Censoring',censored
, where
censored
is a vector that contains censorship
information.
Data Types: logical
| single
| double
TruncationBounds
— Truncation bounds
[-Inf,Inf]
(default) | vector of two elements
Frequency
— Frequency of observations
vector of 1s (default) | vector of nonnegative integer counts
Frequency of observations, specified as a vector of nonnegative
integer counts that has the same number of rows as
data
. The j
th element of the
Frequency
value gives the number of times the
j
th row of data
was
observed. The default is a vector of 1s, indicating one observation per
row of data
.
mlecov
ignores any NaN
values in this frequency vector. Additionally, any
NaN
values in data
or the
censoring vector (Censoring
) cause
mlecov
to ignore the corresponding values
in the frequency vector.
Example: 'Frequency',freq
, where
freq
is a vector that contains the observation
frequencies.
Data Types: single
| double
Options
— Numerical options
statset('mlecov')
(default) | structure
Numerical options for the finite difference Hessian calculation,
specified as a structure returned by statset
.
The mlecov
function interprets the following
statset
options.
Field Name | Description |
---|---|
GradObj | Flag indicating whether the function provided
by the |
DerivStep | Relative step size used in the finite
difference for Hessian calculations, specified as a
vector of the same size as
The
default value is |
Example: 'Options',statset('GradObj','on')
Data Types: struct
More About
Censorship Types
mlecov
supports left-censored, right-censored, and interval-censored observations.
Left-censored observation at time
t
— The event occurred before timet
, and the exact event time is unknown.Right-censored observation at time
t
— The event occurred after timet
, and the exact event time is unknown.Interval-censored observation within the interval
[t1,t2]
— The event occurred after timet1
and before timet2
, and the exact event time is unknown.
Double-censored data includes both left-censored and right-censored observations.
Survival Function
The survival function is the probability of survival as a function of time. It is also called the survivor function.
The survival function gives the probability that the survival time of an individual exceeds a certain value. Because the cumulative distribution function F(t) is the probability that the survival time is less than or equal to a given point t in time, the survival function for a continuous distribution S(t) is the complement of the cumulative distribution function: S(t) = 1 – F(t).
Version History
Introduced before R2006a
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list:
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)