Optimization routine for a function that is not smooth and has many local minima
xueqi
on 18 Feb 2013
Hi, can anyone give me some suggestions about an optimization routine for a function that is not smooth and also has many local minima?
Answers (5)
Alan Weiss
on 19 Feb 2013
If you want to minimize a nonsmooth function, the general recommendation is to try patternsearch. If there are many local minima, the general recommendation is to use a variety of start points. If you have finite bounds on all components, lb and ub, then you can obtain random start points as follows:
x0 = lb + rand(size(lb)).*(ub - lb);
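A minimal multi-start sketch along these lines (assuming an objective handle fun, finite bound vectors lb and ub, and Global Optimization Toolbox's patternsearch; the number of starts is an arbitrary illustration):
nStarts = 20;                                % illustrative; use as many as you can afford
bestf = Inf;
for k = 1:nStarts
    x0 = lb + rand(size(lb)).*(ub - lb);     % random start point in the box
    [x,fval] = patternsearch(fun,x0,[],[],[],[],lb,ub);
    if fval < bestf                          % keep the best local solution found
        bestf = fval;
        bestx = x;
    end
end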
Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
4 Comments
xueqi
on 20 Feb 2013
Hi. These are the results I got when using fmincon... At first I thought my function was not smooth, but now, looking at the first-order optimality measure (very large for several iterations...), I am starting to think that MATLAB just can't find the minimum... Am I right?
sumllh = 1.7499e+003   (printed 9 times)
                                             First-order      Norm of
 Iter F-count            f(x)  Feasibility   optimality       step
    0       9   1.749900e+003   0.000e+000   3.433e+005
sumllh = Inf   (3 times)
sumllh = 398.8533   (9 times)
User objective function returned Inf; trying a new point...
    1      21   3.988533e+002   0.000e+000   8.908e+003   1.242e-001
sumllh = 4.0347e+003, 1.4260e+003, 647.5076, 437.0004, 4.0665e+003, 1.4419e+003, 643.5412, 418.2376
sumllh = 365.8173   (9 times)
    2      38   3.658173e+002   0.000e+000   7.766e+002   7.457e-002
sumllh = 597.1229, 434.6045, 368.9302
sumllh = 356.8667   (9 times)
    3      50   3.568667e+002   0.000e+000   4.334e+002   5.845e-002
sumllh = 889.8367, 516.4006
sumllh = 311.5622   (9 times)
    4      61   3.115622e+002   0.000e+000   4.625e+002   5.322e-001
sumllh = Inf
sumllh = 235.7194   (9 times)
User objective function returned Inf; trying a new point...
    5      71   2.357194e+002   0.000e+000   3.726e+003   3.923e+000
sumllh = Inf
sumllh = 206.5185   (9 times)
User objective function returned Inf; trying a new point...
    6      81   2.065185e+002   0.000e+000   5.962e+002   2.708e+000
sumllh = 183.2263   (9 times)
    7      90   1.832263e+002   0.000e+000   5.587e+002   6.503e-001
sumllh = 248.7550, 190.8685
sumllh = 182.2106   (9 times)
    8     101   1.822106e+002   0.000e+000   5.133e+002   6.867e-002
sumllh = 210.8628, 183.1427
sumllh = 179.3117   (twice)
Alan Weiss
on 20 Feb 2013
It seems that your objective function prints its value every time it is evaluated. The reason there are so many function evaluations between each line of the iterative display is that fmincon takes small perturbations of the current point and evaluates the objective at these new points in order to estimate gradients. You can see that there are many times where a small perturbation of the point leads to an Inf or a big departure from the current function value.
I conclude that your objective function is not finite at points very close to the current point after the first iteration.
Can you put some constraints on your problem so that the objective doesn't get so large? Or can you check to ensure that the objective function is defined correctly?
And one more thing. I see that you have tried to use GA to optimize your function. I strongly recommend that you use patternsearch instead--it is much faster and more reliable.
Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
xueqi
on 20 Feb 2013
Thanks! So you think the problem is not that the function is non-smooth? I will try adding some constraints soon and see how it goes. But please stay with me on this post; I will report what happens and may still need your advice.
xueqi
on 1 Mar 2013
Yes, I find that the value of the function changes dramatically with respect to one variable. Near one particular value (the value I want to estimate) it is concave, but as it moves away from that value the function goes to Inf quickly. I think that is why fmincon can't work correctly. But I can't really add a proper constraint on this variable... What should I do?
Mohsen Davarynejad
on 18 Feb 2013
All the algorithms suitable for black-box optimization can be used. These algorithms, including Genetic algorithms (GA) and Particle swarm optimization (PSO), do not make any assumptions regarding the problem at hand, and thus they neither require the function to be convex, nor do they require the availability of the gradient of the function.
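For instance, a minimal GA call under the poster's setup (an objective handle fun, 8 variables, and the bound vectors lb, ub from the other posts; ga is in Global Optimization Toolbox) might look like:
nvars = 8;                                   % number of unknowns in this problem
[x,fval] = ga(fun,nvars,[],[],[],[],lb,ub);  % bounds only; no other constraints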
5 Comments
xueqi
on 18 Feb 2013
Yes, I tried ga, but it just stopped at a local minimum. So I am wondering whether it is possible to combine MultiStart with ga. But MultiStart does not support ga...
Mohsen Davarynejad
on 18 Feb 2013
The combination is useful when you want to improve the exploitation of GA (when you are close to a good solution and want to get as close as possible to that possibly-local optimum). In your case, it looks like you do not have enough exploration, so the combination is of no use. You can improve the exploration of GA by, for instance, increasing the mutation rate, as sketched below.
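A hedged sketch of one way to do that (the 0.15 rate and population size are arbitrary illustrations; note that mutationuniform does not respect bounds or linear constraints, so it suits unconstrained runs):
% Raise GA's mutation rate to boost exploration (illustrative values)
opts = gaoptimset('MutationFcn',{@mutationuniform, 0.15}, ...
                  'PopulationSize',200);
x = ga(fun,8,[],[],[],[],[],[],[],opts);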
Mohsen Davarynejad
on 18 Feb 2013
Share some information about the problem, its dimensions, and the way you coded it for GA. That may help us give you better suggestions.
xueqi
on 20 Feb 2013
I have 8 independent variables to estimate, and this is the code I use (the ga call is currently commented out in favor of fmincon):
A  = [1 1 1 0 0 0 0 0];
b  = 1;
lb = [0.1;0.1;0.1;0.0001;1.1;0.001;1.1;0.001];
ub = [1;1;1;1;Inf;1;Inf;1];
options = optimset('Algorithm','interior-point','Display','iter','MaxIter',50,'TolFun',1e-1);
[x,fval,exitflag,output,lambda,grad] = fmincon(@beta,[0.2,0.2,0.2,0.01,100,0.1,100,0.1],A,b,[],[],lb,ub,[],options)
% x = ga(@beta,8,A,b,[],[],lb,ub,[],options)
xueqi
on 20 Feb 2013
It just seems to take forever for ga to get a result... My function beta takes nearly 1 minute to evaluate for a single set of values, and when minimizing beta it takes several days even for around 10 iterations. I am not sure this is normal...
Juan Camilo Medina
on 2 Mar 2013
Edited: Juan Camilo Medina
on 2 Mar 2013
Based on my experience, the best way is to use simulated annealing, which is less heuristic than GA and is driven by a Markov chain. fmincon() or any other gradient-based algorithm won't work well if the function is non-smooth.
[x,fval,exitFlag,output] = simulannealbnd(ObjectiveFunction,X0)
you can also try
x = fminsearch(fun,x0)
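For a bound-constrained problem like this one, a hedged sketch of the bounded call (reusing the lb, ub vectors from the other posts) would be:
% Simulated annealing with bound constraints (assumes ObjectiveFunction,
% X0, lb, ub are already defined)
opts = saoptimset('Display','iter');
[x,fval,exitFlag,output] = simulannealbnd(ObjectiveFunction,X0,lb,ub,opts);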
1 Comment
Matt J
on 3 Mar 2013
Edited: Matt J
on 3 Mar 2013
I mentioned this to you in another post, but I'll document it here as well. I think the best way to deal with an ill-behaved objective function is to replace it with a better one. I.e., question your initial reasoning behind this function and see if there are alternative formulations that you haven't considered.
If nothing else, it is suspicious that you think you absolutely must accept the non-smoothness of the function. If you've thought about the problem formulation carefully enough, there are almost always smooth approximations that you can find. The local minima are another matter.
If you can't see an alternative yourself, it may be time to walk us through the development of the function you're thinking of so we can see where you might have taken a wrong turn.
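To make the "smooth approximation" point concrete, here is one classical trick, purely illustrative and not necessarily the right one for this problem: replacing a nonsmooth max(a,b) with the log-sum-exp softmax, which is infinitely differentiable.
% Illustrative smoothing trick only; k is an arbitrary sharpness parameter.
% Larger k tracks max(a,b) more closely but risks overflow in exp
% (subtract max(a,b) before exponentiating in serious use).
k = 50;
smoothmax = @(a,b) log(exp(k*a) + exp(k*b))/k;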
41 Comments
xueqi
on 3 Mar 2013
Thank you! I think my English level limits how clearly I can explain my question... Sorry about that. My objective function is smooth. I have plotted the objective function with respect to each variable, and each plot has a global minimum. The problem is that when the optimization routine tries different points inside the constraints, it easily gets an extremely big number away from the optimum, and I think that is why a value of Inf always shows up... and just because of this the optimization routine fails. I have tried optimizing with respect to each variable separately, and it works only if I set the lower and upper bounds quite near the optimum. P.S. I know where the optimum is currently because I am using simulated data, so I know what the true values should be. But this will not be the case when I begin to deal with the data from the experiment.
Matt J
on 3 Mar 2013
I think you need to write down the mathematical formula for the objective function (and constraints if any), so we can see what we're talking about.
xueqi
on 3 Mar 2013
Edited: xueqi
on 3 Mar 2013
This is the objective function
function [sumllh] = beta1(x)
DD  = xlsread('parameters.xlsx');
rn  = size(DD,1);
edw = DD(:,7);
lb1 = x(1);
lb2 = x(2);
lb3 = x(3);
r   = x(4);
sub = [lb1,lb2,lb3,r];
s1  = x(5);
b1  = x(6);
s2  = x(7);
b2  = x(8);
mu  = maxmin(sub,DD);
save mu
hatmu = xlsread('hatmu.xlsx');   % assign the simulated mu to hatmu
for i = 1:rn
    alpha1(i) = (b1*edw(i)/2 + (1-b1)*mu(i,1))*(s1-1)/edw(i);
    beta1(i)  = (edw(i) - b1*edw(i)/2 - (1-b1)*mu(i,1))*(s1-1)/edw(i);
    alpha2(i) = (b2*edw(i)/2 + (1-b2)*mu(i,2))*(s2-1)/edw(i);
    beta2(i)  = (edw(i) - b2*edw(i)/2 - (1-b2)*mu(i,2))*(s2-1)/edw(i);
    llh1(i) = betacdf((hatmu(i,1)+0.5)/edw(i),alpha1(i),beta1(i)) - ...
              betacdf((hatmu(i,1)-0.5)/edw(i),alpha1(i),beta1(i));
    llh2(i) = betacdf((hatmu(i,2)+0.5)/edw(i),alpha2(i),beta2(i)) - ...
              betacdf((hatmu(i,2)-0.5)/edw(i),alpha2(i),beta2(i));
    llh(i) = llh1(i)*llh2(i);
end
save llh
sumllh_ = sum(log(llh));
if (sumllh_ == -Inf)
    sumllh_ = -1000;
end
% we need to maximize this, but fmincon minimizes, so negate it
sumllh = -sumllh_;
end
xueqi
on 3 Mar 2013
and these are all the constraints on the parameters
A  = [1 1 1 0 0 0 0 0];
b  = 1;
lb = [0.1;0.1;0.1;0.0001;500;0.001;500;0.001];
ub = [1;1;1;1;Inf;0.6;Inf;0.6];
options = optimset('Algorithm','interior-point','Display','iter','MaxIter',20);
x0 = [0.3,0.3,0.3,0.08,2000,0.05,2000,0.05];
[x,fval,exitflag,output,lambda,grad] = fmincon(@beta1,x0,A,b,[],[],lb,ub,[],options)
xueqi
on 3 Mar 2013
The objective function calculates the sum of the log-likelihood. I am assuming the two variables follow independent beta distributions (with some modification). hatmu is the simulated data (or experimental data). maxmin is a function used to calculate the two variables for different values of the parameter vector sub, which contains 4 of the parameters. So basically beta1 calculates the likelihood for those 4 parameters plus another 4 noise parameters. I want to maximize the sum of the log-likelihood to estimate the 8 parameters.
When fmincon searches over different values, it easily hits an extremely small likelihood, so the log-likelihood blows up to Inf.
Matt J
on 3 Mar 2013
I'm still waiting for the formula for the objective function, as opposed to the code.
In any case, I have 2 remarks
- It looks like you are minimizing a loglikelihood when (if maximum likelihood estimation is the goal) you should be minimizing the negative of the loglikelihood.
- The Infs are probably coming from taking the log of the likelihood in regions of low probability. One way to eliminate them would be to minimize minus the likelihood, instead of minus the loglikelihood (see the sketch below).
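A hedged one-line sketch of remark 2, replacing the final step of the objective shown earlier (this trades the -Inf values for possible underflow to 0 when many small terms are multiplied):
sumllh = -prod(llh);   % minimize minus the likelihood, not minus the log-likelihood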
xueqi
on 3 Mar 2013
Edited: xueqi
on 3 Mar 2013
Yes, I am using maximum likelihood estimation, and I negate the sum of the log-likelihood and then minimize it. I am trying your remark 2 now. The formula is
% llh1=betacdf(hatmu1+0.5)-betacdf(hatmu1-0.5);
llh2=betacdf(hatmu2+0.5)-betacdf(hatmu2-0.5)
llh=llh1*llh2;
sumllh = product of llh over all 39 sets of data
Matt J
on 3 Mar 2013
Edited: Matt J
on 3 Mar 2013
I didn't think you were still using FMINCON. Everyone's been telling you to use simulated annealing and pattern search.
In any case, you should be constraining hatmu2>= -0.5, because
- That's the region where probabilities are >0 and so where the solution lies.
- When working with loglikelihoods, instead of likelihoods, that's where the objective function is defined.
Matt J
on 3 Mar 2013
BTW, you've shown llh1 commented out:
% llh1=betacdf(hatmu1+0.5)-betacdf(hatmu1-0.5);
So if this is not the formula for llh1 that you're using, what formula are you using instead?
xueqi
on 3 Mar 2013
Edited: xueqi
on 3 Mar 2013
If I change the log-likelihood to the likelihood, then fmincon just stops at the start point, since at all the points around it the function value equals 0, so the gradient is 0.
Is there any way to improve MATLAB's precision? I think it rounds an extremely small number to 0.
Matt J
on 3 Mar 2013
Edited: Matt J
on 3 Mar 2013
Keep up with my posts! I told you that you need to be using constraints that keep the algorithm away from this region.
Incidentally, is there any reason why you're using a finite difference of betacdf
llh=betacdf(hatmu+0.5)-betacdf(hatmu-0.5);
to approximate the beta distribution, when the density function is given exactly and trivially by
hatmu.^(a-1).*(1-hatmu).^(b-1)/beta(a,b)
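(For what it's worth, the Statistics Toolbox evaluates exactly this density, with the elementwise details handled internally:)
p = betapdf(hatmu,a,b);   % same density as the expression above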
Matt J
on 3 Mar 2013
Edited: Matt J
on 3 Mar 2013
Restating your function's main loop for reference:
for i = 1:rn
    alpha1(i) = (b1*edw(i)/2 + (1-b1)*mu(i,1))*(s1-1)/edw(i);
    beta1(i)  = (edw(i) - b1*edw(i)/2 - (1-b1)*mu(i,1))*(s1-1)/edw(i);
    alpha2(i) = (b2*edw(i)/2 + (1-b2)*mu(i,2))*(s2-1)/edw(i);
    beta2(i)  = (edw(i) - b2*edw(i)/2 - (1-b2)*mu(i,2))*(s2-1)/edw(i);
    llh1(i) = betacdf((hatmu(i,1)+0.5)/edw(i),alpha1(i),beta1(i)) - ...
              betacdf((hatmu(i,1)-0.5)/edw(i),alpha1(i),beta1(i));
    llh2(i) = betacdf((hatmu(i,2)+0.5)/edw(i),alpha2(i),beta2(i)) - ...
              betacdf((hatmu(i,2)-0.5)/edw(i),alpha2(i),beta2(i));
end
xueqi
on 3 Mar 2013
Edited: xueqi
on 3 Mar 2013
- I am trying patternsearch now.
- hatmu2 is >= 0.5 in my data set. Both hatmu1 and hatmu2 are >= 0 and <= edw (a specific number). Is this what you mean here?
- I am using llh1=betacdf(hatmu1+0.5)-betacdf(hatmu1-0.5). Some of the comments are nonsense; I just forgot to delete them...
- patternsearch seems a little bit slow. Is there any way to speed it up?
Matt J
on 3 Mar 2013
Edited: Matt J
on 3 Mar 2013
Now that I can read your function, I can make more complete recommendations.
First, I think you should go back to loglikelihoods.
Second, you also have to make sure that no hatmu makes the betacdf difference zero. Remember, the beta distribution has non-zero probability only for data values between 0 and 1.
Third, it appears that the dependence on your unknown parameters is entirely through alpha1(i), alpha2(i), beta1(i), beta2(i). You need to ensure that the alphas and betas are constrained to be non-negative, which the beta distribution assumes. Since their dependence on the unknowns is linear, you can do this by specifying linear constraints; a fallback sketch follows below.
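If spelling those out as linear constraints proves awkward, a hedged fallback sketch is a nonlinear constraint function; computeAlphasBetas is a hypothetical helper that repeats the alpha/beta loop from the objective:
function [c,ceq] = abConstraint(x)
% Hedged sketch: keep every alpha(i) and beta(i) at or above a margin.
% computeAlphasBetas is a hypothetical helper mirroring the objective's loop.
[a1,b1v,a2,b2v] = computeAlphasBetas(x);
margin = 1;                                   % illustrative lower limit
c = margin - [a1(:); b1v(:); a2(:); b2v(:)];  % fmincon enforces c <= 0
ceq = [];
end
and pass it to fmincon as the nonlcon argument: fmincon(@beta1,x0,A,b,[],[],lb,ub,@abConstraint,options)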
Matt J
on 3 Mar 2013
Edited: Matt J
on 3 Mar 2013
hatmu2 is >= 0.5 in my data set. Both hatmu1 and hatmu2 are >= 0 and <= edw (a specific number). Is this what you mean here?
But that means that hatmu can be equal to or close to edw. When this happens,
(hatmu+0.5)/edw > 1
and the beta density is zero at such values for all parameter choices.
What is the purpose of the 0.5? Are you trying to compute finite difference derivatives of betacdf? As I said before, this is unnecessary and also might be creating problems.
patternsearch seems a little bit slow. Is there any way to speed it up?
Forget patternsearch. Now that I can read your code and understand your problem, I think that FMINCON should be fine, once you apply the constraints that I've described.
xueqi
on 3 Mar 2013
What is the purpose of the 0.5?
Because in the experiment, subjects can only choose integers. For example, if a subject wants to choose 76.4, he is limited by the experiment's requirements and needs to choose 76 instead. So the likelihood for 76 is actually the probability from 76-0.5 to 76+0.5.
beta distribution
I am not using the beta distribution exactly but a modified version, precisely because of the endpoints 0 and edw. For the beta distribution, at the points 0 and 1 the error would be 0, but that is not a proper distribution for the experiment's setup. So I introduced the two extra parameters to solve this problem. I am not quite sure how to explain it now; I will think about it.
Second, you also have to make sure that no hatmu makes the betacdf difference zero
When hatmu=edw, (hatmu(i,1)+0.5)/edw is more than 1. But this is the cdf, not the pdf, so the result should equal 1. And for hatmu=0, (hatmu(i,1)-0.5)/edw < 0, so the result should equal 0. But since llh = betacdf((hatmu(i,1)+0.5)/edw) - betacdf((hatmu(i,1)-0.5)/edw), the total result should not be 0. So I think it is impossible for llh to be exactly 0, though it may be quite near 0.
Matt J
on 3 Mar 2013
OK, well then it seems your Inf values are more likely due to letting alpha(i) and beta(i) run too close to zero (or below). Applying constraints to alpha(i) and beta(i), and maybe also using sqp, as I mentioned before, will probably help.
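(A minimal sketch of that switch, reusing the options variable from the earlier posts:)
options = optimset(options,'Algorithm','sqp');   % then re-run fmincon as before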
xueqi
on 5 Mar 2013
I have looked into this. The likelihood easily comes out equal to 0. In theory it shouldn't. So I think MATLAB is not precise enough: it treats a really small number as 0. Is there any way to improve the precision?
Matt J
on 5 Mar 2013
Edited: Matt J
on 5 Mar 2013
It doesn't sound like a precision problem. If you have a likelihood term that is equal to 0, it means that you are operating in a flat region of BETACDF, which means one of the following is true:
- That (hatmu+0.5)/edw <=0 or equivalently that hatmu<=-0.5,
- That (hatmu-0.5)/edw >= 1 or equivalently that hatmu>=edw+.5
You must ensure that all the hatmu lie in the "legal" open interval (-.5, edw+.5). Otherwise, it's bad data, plain and simple.
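A quick hedged sketch of that data check (hatmu is 39-by-2 and edw 39-by-1 here; bsxfun keeps it working on pre-R2016b releases):
lowOK  = all(hatmu(:) > -0.5);                        % lower bound check
highOK = all(all(bsxfun(@lt, hatmu, edw + 0.5)));     % hatmu < edw + 0.5, row by row
assert(lowOK && highOK, 'Bad data: some hatmu outside (-0.5, edw+0.5)')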
xueqi
on 5 Mar 2013
Oh, but I really don't understand. See, these are my hatmu1 and hatmu2:
hatmu1  hatmu2
  12      8
  13      9
  12      8
  32     25
  11     14
  26     30
  27     28
  27     30
  19     44
  19     42
  17     40
   8     19
  14     22
  14     22
  24     30
  19     18
   8      9
  32     41
   3      3
   4      5
   2      3
   7      7
   4      4
   5      4
   5      4
   5      6
   4      4
  21     15
   5      6
   6      3
   5      3
   4      4
   2     21
   4      4
  40      4
  15     17
   3     20
   5     22
   3     22
Matt J
on 5 Mar 2013
The hatmu numbers you've shown clearly satisfy the lower bound of -.5, but you haven't shown edw, so we don't know about the upper bound of edw+.5
xueqi
on 5 Mar 2013
Edited: xueqi
on 5 Mar 2013
With x1 = [0.3,0.3,0.26,0.08,2000,0.05,2000,0.05] (the first 4 parameters determine mu and the next 4 add noise to mu), I get the values below (beta1 and beta2 were printed with a 1.0e+003 scale factor; they are written out in full here):
  i      mu1      mu2    alpha1   beta1    alpha2   beta2    llh1    llh2
  1   6.8795   5.1597  180.6209  1818.4  147.9594  1851.0  0.0002  0.3898
  2   7.6101   5.7076  194.4948  1804.5  158.3649  1840.6  0.0000  0.1624
  3   7.9938   5.9954  201.7820  1797.2  163.8303  1835.2  0.0210  0.5693
  4  27.2074  21.7659  566.6565  1432.3  463.3202  1535.7  0.0010  0.0740
  5   7.9180   9.5016  200.3422  1798.7  230.4157  1768.6  0.2188  0.0039
  6  22.5482  26.3062  478.1758  1520.8  549.5426  1449.5  0.0465  0.0217
  7  22.5482  26.3062  478.1758  1520.8  549.5426  1449.5  0.0039  0.3372
  8  24.2338  26.9264  510.1867  1488.8  561.3213  1437.7  0.1354  0.0711
  9  17.0101  42.5253  373.0055  1626.0  857.5512  1141.4  0.4009  0.2187
 10  16.3487  40.8717  360.4444  1638.6  826.1484  1172.9  0.2439  0.2933
 11  15.5410  38.8526  345.1069  1653.9  787.8046  1211.2  0.4309  0.3066
 12   6.6654  14.9972  176.5549  1822.4  334.7799  1664.2  0.2913  0.0191
 13  13.0653  20.5312  298.0915  1700.9  439.8723  1559.1  0.2709  0.4106
 14  13.2404  20.8064  301.4169  1697.6  445.0980  1553.9  0.2153  0.3968
 15  24.1087  30.1359  507.8112  1491.2  622.2702  1376.7  0.1532  0.2163
 16  16.1521  16.1521  356.7106  1642.3  356.7106  1642.3  0.1918  0.4319
 17   1.6451   1.6451   81.2166  1917.8   81.2166  1917.8  0.0000  0.0000
 18  20.8231  28.6318  445.4166  1553.6  593.7072  1405.3  0       0
 19   0        0        49.9750  1949.0   49.9750  1949.0  0.4773  0.4773
 20   0.2595   0.1946   54.9032  1944.1   53.6712  1945.3  0.0259  0.0000
 21   0        0        49.9750  1949.0   49.9750  1949.0  0.5176  0.4773
 22   1.5615   1.7350   79.6290  1919.4   82.9238  1916.1  0.0000  0.0000
 23   0        0        49.9750  1949.0   49.9750  1949.0  0.0046  0.0046
 24   0.6682   0.5011   62.6644  1936.3   59.4920  1939.5  0.0008  0.0886
 25   0.4296   0.2685   58.1336  1940.9   55.0741  1943.9  0.0001  0.0273
 26   1.5193   1.5193   78.8267  1920.2   78.8267  1920.2  0.1032  0.0006
 27   0        0        49.9750  1949.0   49.9750  1949.0  0.0046  0.0046
 28  17.5306  10.5184  382.8900  1616.1  249.7240  1749.3  0.0603  0.0044
 29   2.1790   1.4752   91.3545  1907.6   77.9891  1921.0  0.5188  0.0004
 30   1.8606   1.1629   85.3085  1913.7   72.0584  1926.9  0.0055  0.4127
 31   0.2595   0.1946   54.9032  1944.1   53.6712  1945.3  0.0000  0.6666
 32   0.2595   0.1946   54.9032  1944.1   53.6712  1945.3  0.0259  0.0175
 33   0       17.0691   49.9750  1949.0  374.1265  1624.9  0.5176  0.0212
 34   0.2595   0.1946   54.9032  1944.1   53.6712  1945.3  0.0259  0.0175
 35  33.5384   2.2172  686.8855  1312.1   92.0810  1906.9  0.0000  0.4167
 36  10.0319  12.1661  240.4863  1758.5  281.0146  1718.0  0.0006  0.0013
 37   0       12.3521   49.9750  1949.0  284.5482  1714.5  0.4773  0.0000
 38   3.0461  16.9758  107.8224  1891.2  372.3539  1626.6  0.5614  0.0007
 39   0       17.0691   49.9750  1949.0  374.1265  1624.9  0.4773  0.0010
and the resulting sum of log-likelihoods is Inf.
xueqi
on 5 Mar 2013
Edited: xueqi
on 5 Mar 2013
It is not surprising that the sum of log-likelihoods is Inf, since llh1(18)=llh2(18)=0. But there is nothing abnormal about alpha1(18), beta1(18), alpha2(18), beta2(18). Actually, if I insert everything into the formula for calculating llh1(18), I get
llh1(18) = betacdf((32+0.5)/100,445.4166,1553.6) - betacdf((32-0.5)/100,445.4166,1553.6) = 0!
Matt J
on 5 Mar 2013
Edited: Matt J
on 5 Mar 2013
But there is nothing abnormal about alpha1(18), beta1(18), alpha2(18), beta2(18)
They look pretty abnormal to me. With alpha and beta on the order of 1000, you're working with a beta density that is a polynomial of degree 1000+. That's a pretty ill-behaved thing. As I said before, I think you need to constrain your alphas and betas in some way.
Also, I now see that your earlier claim, that the objective function is smooth, is not true.
mu = maxmin(sub,DD)
is not a differentiable function of x, assuming
maxmin(v)=[max(v), min(v)]
xueqi
on 5 Mar 2013
I think you need to constrain your alphas and betas in some way
Yes, I will look into the alphas and betas.
Also, I now see that your earlier claim, that the objective function is smooth, is not true
Here maxmin is just a function name that I made up.
xueqi
on 5 Mar 2013
Ha... It is a long story. It finds the minimum of the objective function with respect to several variables, and then maximizes from that minimum with respect to several other variables.
I have plotted beta1 with respect to different variables. Except for the fact that sumllh is Inf away from the optimum, it seems quite smooth around the optimum.
Matt J
on 5 Mar 2013
it seems quite smooth around the optimum.
It needs to be smooth throughout the feasible region, not just at the optimum. Also, determining smoothness by inspecting plots can be tricky; for example, discontinuities can be hidden by undersampling. Also, FMINCON's algorithms generally require functions to be twice continuously differentiable, and smoothness of first and higher derivatives is generally invisible graphically.
xueqi
on 7 Mar 2013
Matt, your advice really helps. Should I accept this answer? But it is not really the answer to the question here...
xueqi
on 9 Mar 2013
Edited: xueqi
on 9 Mar 2013
Not everything, but at least the Inf problem: I no longer get Inf while the optimization routine is working. For some simulated data it works perfectly, but for some it fails... I guess it is because there are a lot of local minima, so I am trying MultiStart with 1000 start points. Any suggestions for that?
I still think the function is smooth, so I am still using fmincon... Though I tried patternsearch, it clearly only returns a local minimum, and MultiStart is not compatible with patternsearch.
P.S. How can I upload a graph? It might be easier for you to understand the objective function if I show you the plot...
Matt J
on 9 Mar 2013
No, the only thing that would help is to see maxmin itself. Also, how did you end up getting rid of the Infs?
xueqi
on 9 Mar 2013
Edited: xueqi
on 9 Mar 2013
No, the only thing that would help is to see maxmin itself
Well, maxmin is not my objective function (it is the model I want to fit); my objective function is the one calculating the likelihood. Do you mean that if maxmin is smooth then the likelihood function will also be smooth?
how did you end up getting rid of the Infs
I just assign a very small number (1e-10) if one of the likelihoods equals 0...
multi-start start points
Currently I am using a loop to choose a different start point for fmincon each time. Is there any way MATLAB can run the optimization for all the points at the same time? It would be much more efficient...
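A hedged sketch of exactly that, assuming Global Optimization Toolbox's MultiStart and the problem data from the earlier posts (the parallel option needs an open worker pool):
problem = createOptimProblem('fmincon','objective',@beta1,'x0',x0, ...
    'Aineq',A,'bineq',b,'lb',lb,'ub',ub,'options',options);
ms = MultiStart('UseParallel','always');   % logical true on newer releases
[xbest,fbest] = run(ms,problem,1000);      % 1000 random start points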
Matt J
on 9 Mar 2013
Edited: Matt J
on 10 Mar 2013
Do you mean that if maxmin is smooth then the likelihood function will also be smooth?
Yes. All other operations in your likelihood function look smooth. maxmin is the only question mark.
just assign a very small number (1e-10) if one of the likelihoods equals 0
I doubt that's really solving anything. Sure, it replaces the Infs with some large number, but it won't change the fact that the likelihood will be locally flat to within numerical precision for certain alpha, beta. I suspect that's why you're still getting stuck so often.
Matt J
on 10 Mar 2013
Edited: Matt J
on 10 Mar 2013
One idea would be, for large alpha and beta, to write the loglikelihood in terms of Stirling's approximation of Beta(alpha,beta):
llh(x,alpha,beta) = alpha*log(x) + beta*log(1-x) - 0.5*log(2*pi) + ...
    (alpha+beta-.5)*log(alpha+beta) - ...
    (alpha-.5)*log(alpha) - ...
    (beta-.5)*log(beta)
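A related hedged note: MATLAB's built-in betaln already computes log(beta(alpha,beta)) in a numerically stable way, so the exact beta log-density can also be written directly, without Stirling's formula:
% Exact log-density of the beta distribution, stable for large alpha, beta
logpdf = @(x,alpha,beta) (alpha-1).*log(x) + (beta-1).*log(1-x) - betaln(alpha,beta);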
xueqi
on 12 Mar 2013
There is something wrong with the maxmin function. Maybe it is the cause of all the problems...