How to avoid the Hessian being 0 in fminunc (maximum likelihood estimation)?
Hi, I will try to be as brief as possible.
I am trying to minimize a function, likelihood, given the initial value below.
x0 = [log(104), log(0.999), log(log(166)), log(.499^2), log(.055^2),...
[x,fval,exitflag,output,grad,hessian] = fminunc(@(x) likelihood(D,T,x(1),x(2),x(3),x(4),x(5),x(6),x(7)),x0)
The target function above, likelihood, is as below; the three problematic parts are marked %(1)%, %(2)%, and %(3)%. You don't have to look at the code now.
function LL = likelihood(D,T,aC,aBETA,aMU,aSIGMAU2,aSIGMAE2,M0,M1)
C = exp(aC);
BETA = exp(aBETA);
MU = exp(aMU);
SIGMAU2 = exp(aSIGMAU2);
SIGMAE2 = exp(aSIGMAE2);
result = mccall(T,C,BETA,MU,SIGMAU2,SIGMAE2,M0,M1);
wbar = result.wbar; %(1)%
wbar(wbar<=0) = eps(0);
wbar2 = [wbar, eps(0)];
wbar = wbar2;
sigth = (SIGMAU2 + SIGMAE2)^(0.5);
sigu = (SIGMAU2)^(0.5);
rho = sigu/sigth;
pr = @(t) normcdf(M0+M1*t);
eta = @(t) log(wbar(62+t)) - MU; %(2)%
theta = @(i) log(D(i,3)) - MU;
e_inside = @(t,i) (eta(t) - rho*(sigu/sigth)*theta(i))*(sigu*(1-rho^2)^0.5)^-1;
ue_p = @(t) pr(t)*normcdf(eta(t)/sigu) + 1-pr(t);
em_p = @(t,i) pr(t)*(1 - min(.999999,normcdf(e_inside(t,i))))*sigth^(-1)*normpdf(theta(i)/sigth)*D(i,3)^(-1);
LL = 0;
for i = 1:34
    for t = 1:D(i,2)
        LL = LL + log(ue_p(t));
    end
end
for i = 35:141
    for t = 1:D(i,1)-1
        LL = LL + log(ue_p(t));
    end
    LL = LL + log(em_p(D(i,1),i));
end
LL = LL*(-1); %(3)%
My problem: I have 7 parameters, but in the Hessian matrix that fminunc returns, the entries for the first parameter, C, and the second parameter, BETA, are 0, and so are their cross terms with the other parameters.
I think I know why this happens; it is because my algorithm works like this:
(a) Use all 7 parameters to get result.wbar and save it as wbar. So wbar becomes a fixed vector.
(b) Use that wbar to get eta.
(c) With eta in hand, use 5 parameters (excluding C and BETA) to calculate LL.
So C and BETA affect wbar, which affects eta, which affects LL. But LL doesn't recognize that it is an implicit function of C and BETA, because C and BETA are used in (a) but not in (c). This is why the Hessian is 0 for those two parameters.
How can I revise my algorithm so that LL recognizes C and BETA as implicit variables? I have tried the chain rule, syms, and every other method I could think of for a few days but couldn't resolve this issue. Any feedback or comments would be very much appreciated. Thanks.
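One way to test whether an objective really ignores a parameter is to probe it with central finite differences, since fminunc also estimates derivatives by finite differences when none are supplied. The sketch below is a toy, not the model above: inner is a hypothetical stand-in for the mccall step that produces wbar, and obj is a hypothetical stand-in for likelihood. Because inner is re-evaluated on every call, the parameters that enter only through it still show nonzero sensitivity.

```matlab
% Hypothetical stand-in for the inner solve (the role mccall -> wbar plays).
% It depends on x(1) and x(2) only.
inner = @(x) exp(x(1)) + 0.5*exp(x(2));

% Hypothetical toy objective: x(3) enters directly, while x(1) and x(2)
% enter only through inner(x), mirroring steps (a)-(c).
obj = @(x) (log(inner(x)) - x(3)).^2;

x0 = [0.1, -0.2, 0.3];   % arbitrary test point (an assumption)
h  = 1e-6;               % finite-difference step

% Central-difference gradient, one coordinate at a time.
g = zeros(size(x0));
for k = 1:numel(x0)
    e = zeros(size(x0));
    e(k) = h;
    g(k) = (obj(x0 + e) - obj(x0 - e)) / (2*h);
end

disp(g)   % all three entries are nonzero for this toy objective
```

If the same probe applied to the real likelihood returns exactly 0 for C or BETA, the dependence is being cut somewhere inside the function, for example by a clamp, rather than by the order in which the parameters are used.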
Matt J on 19 May 2022
Operations in the objective function like this,
min(.999999,normcdf(e_inside(t,i)))
are technically illegal, since min(a,b) is not differentiable at a==b, which breaks the assumptions of the Optimization Toolbox's derivative-based solvers (e.g. fminunc).
Moreover, min(0.999000, f(x)) will be constant (and therefore have zero Hessian) in the region f(x) >= 0.999000, which I'm guessing is the reason for what you are seeing.
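The flat region can be seen numerically with a small sketch. Here f is a hypothetical logistic function standing in for the normcdf term, and the test points are assumptions chosen so that one lies below the cap and one above it.

```matlab
% Hypothetical smooth f(x) with values in (0,1), standing in for normcdf(...).
f      = @(x) 1 ./ (1 + exp(-x));
capped = @(x) min(0.999, f(x));

h  = 1e-4;                                                 % finite-difference step
d1 = @(fun, x) (fun(x + h) - fun(x - h)) / (2*h);          % first derivative
d2 = @(fun, x) (fun(x + h) - 2*fun(x) + fun(x - h)) / h^2; % second derivative

x_free = 0;    % f(0)  = 0.5, below the cap
x_sat  = 10;   % f(10) is about 0.99995, above the cap

slope_free = d1(capped, x_free);   % nonzero: the cap is inactive here
slope_sat  = d1(capped, x_sat);    % exactly 0: the function is constant here
curv_sat   = d2(capped, x_sat);    % exactly 0 for the same reason

fprintf('%g %g %g\n', slope_free, slope_sat, curv_sat);
```

Wherever the cap is active, every finite difference of the capped term is exactly zero, so a solver sees no first- or second-order information from it.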