How to add a constraint when using fsolve?

Hi, I'm currently trying to solve a system of five nonlinear equations using fsolve. However, I know that fsolve doesn't really allow you to add constraints. I need the fifth variable to be less than or equal to 24, but I don't know where to begin. I also need all my variables to be greater than or equal to zero; I've managed that so far by choosing starting points around 10 instead of zero. For greater context, these equations are all partial derivatives that are then set equal to zero.
And here are the solutions (originally posted as an image):
However, the last variable (the one equal to 95.8047) can't be greater than 24. Please help! I'd really appreciate it! Thank you!

4 Comments

Hi, if you want x(5) to be less than 24, there will be no exact real-valued solution.
@Sabrina,
Please post your code as text, rather than as images. That way, we can more conveniently copy/paste and run it.
x0 = [10,10,10,10,10];
fun0 = @(x)[-0.003*x(1)^2 + (x(1)+x(2)-14)^(-1);
    -0.003*x(2)^2 + (x(1)+x(2)-14)^(-1);
    0.95*(-0.003*(1-x(5)/96)*x(3)^2 + (x(3)+x(4)-14)^(-1));
    0.95*(-0.003*(1-x(5)/96)*x(4)^2 + (x(3)+x(4)-14)^(-1));
    -2 + 0.001/48*x(3)^3];
fsolve(fun0, x0)
Here's the code; sorry about posting it as an image the first time. Alex, how do you know that there will be no exact real solution? How could you tell right away? Does this mean I need to change the objective function that I'm taking partials of? Thank you!
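For reference, a bounded alternative to fsolve is lsqnonlin, which minimizes the sum of squared residuals and accepts lower and upper bounds directly. A minimal sketch using fun0 and x0 as defined above; as noted in the comments, there may be no exact root inside the bounds, in which case this returns the best bounded least-squares point rather than a true root:

```matlab
% Sketch: solve the same system in a least-squares sense, with bounds.
% lsqnonlin minimizes sum(fun0(x).^2) subject to lb <= x <= ub.
lb = zeros(1,5);                 % all variables >= 0
ub = [Inf Inf Inf Inf 24];       % x(5) <= 24
xbnd = lsqnonlin(fun0, x0, lb, ub);
```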


 Accepted Answer

I ended up installing the Global Optimization Toolbox, which allowed me to run a GlobalSearch. I did end up changing some parameters in my model to make it run better, which is why some of the numbers differ a little from the original question. Here is the code:
gs = GlobalSearch;
minmin = @(x)(-(-0.001*x(1)^3-0.001*x(2)^3+2*10*log(x(1)+x(2)-14)-x(5)^0.7-x(6)^0.7+0.95*(-0.001*(1-x(5)/40)*x(3)^3-0.001*(1-x(5)/40)*x(4)^3)+2*10*log(x(3)+x(4)-14)));
problem = createOptimProblem('fmincon','x0',[10,10,10,10,24,24],...
'objective',minmin,'lb',[0,0,0,0,0,0],'ub',[Inf,Inf,Inf,Inf,24,24]);
x = run(gs,problem)
And here were my results (posted as an image):
Thank you so much Matt and Walter for helping me with this!
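A possible cross-check of this answer is to rerun the same problem object from many start points with MultiStart (a sketch; 50 start points is an arbitrary choice, and 'problem' is the createOptimProblem structure built above):

```matlab
% Sketch: rerun the fmincon problem from 50 start points and keep the best.
ms = MultiStart;
[xms, fms] = run(ms, problem, 50);
```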

More Answers (2)

The fifth equation involves only x(3), so it can be solved immediately:
x3=(2*48/0.001)^(1/3)
x3 = 45.7886
The 3rd and 4th equations then imply that x4=x3:
x4=x3
x4 = 45.7886
and either of them can then be used to solve for x5:
x5=1-96/(0.95*0.003*(x3+x4-14)*x3^2)
x5 = 0.7929
which satisfies the constraint x5<=24.
Comparison of the first two equations implies that x1=x2=z, which reduces the problem to the single variable root finding problem
0.003*z^2 = (2*z-14)^(-1)
which can be solved with fzero() to give,
x1=x2=9.0396
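The whole chain above can be scripted in a few lines (a sketch; the fzero bracket [8 12] is an assumption chosen by inspecting the sign change of the reduced equation):

```matlab
% Sketch of the hand solution above.
x3 = (2*48/0.001)^(1/3);    % fifth equation: -2 + 0.001/48*x3^3 = 0
x4 = x3;                    % equations 3 and 4 force x4 = x3
z  = fzero(@(z) 0.003*z.^2 - 1./(2*z - 14), [8 12]);   % x1 = x2 = z
```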

3 Comments

Hi Matt,
Thanks so much for going through the calculations. I hadn't even realized that the fifth equation only had one term. The odd thing is that when I've run fsolve, x5 usually comes out around 24, so it's strange that it's so low in your solution, though I can see how your math works. Those figures don't make much sense within the context of my model, and I'm not certain the solution is symmetric with x3 equal to x4. Is this solution just one of many local minima?
If we subtract the 3rd equation from the 4th equation, we obtain,
0.95 * (-0.003) * (1-x(5)/96) * ( x(4)^2 -x(3)^2 ) = 0
In order for this equation to be satisfied, either x(5)=96, which you say is forbidden, or x(4)=x(3).
Thanks so much! That clears things up!


Define the sum of squared equation residuals:
residue = @(x)sum(fun0(x).^2)
Now use an optimizer to minimize residue, with lb all 0 and ub all Inf except 24 for the appropriate entry.
Caution: this approach does not guarantee that you will find a zero, and it may need to be tested with multiple initial conditions, for example by using MultiStart from the Global Optimization Toolbox.
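A minimal sketch of that recipe (assuming fun0 and the starting point from the question, with the upper bound of 24 on the fifth entry):

```matlab
% Sketch: minimize the sum-of-squares residual subject to the bounds.
residue = @(x) sum(fun0(x).^2);
lb = zeros(1,5);
ub = [Inf Inf Inf Inf 24];
xbest = fmincon(residue, [10 10 10 10 10], [],[],[],[], lb, ub);
```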

12 Comments

Hi Walter, thanks so much for your help! I copied your residue formula, put it into the Optimize task with upper and lower bounds, and the optimizer actually ran! Just to clarify: does the residue function just square fun0? And am I now just taking the minimum value of fun0?
I'm also very new to using the Optimizer, and in this tutorial that I watched, they have their objective function as:
f = 100*(y - x^2)^2 + (1 - x)^2;
Their objective function is just in one line. However, my fun0 is basically a series of equations, so how does that change things? By finding the minimum of fun0 instead of using the objective function that fun0 is based on, does that mean MATLAB is helping me find solutions where all my equations are equal to zero?
One last question: is using the Global Optimization Toolbox really necessary? I'm using the student license, and I think I'd have to pay extra for that, so if there's any way around it, I'd be really happy.
Sorry for all the questions, and I really appreciate your help!
does the residue function just square fun0
No, it takes the sum() of the squares. This is a "least-squares" solution to the system: one way of balancing the requirements of the various individual equations.
If you have two equations, A and B, that should ideally both be 0, then at any point x at which A(x) and B(x) are both exactly 0, A(x)^2 + B(x)^2 will also be 0, because the individual elements being squared are both 0. Now suppose you find a point x1 such that A(x1) is 2 and B(x1) is 0, and another point x2 such that A(x2) is 1 and B(x2) is 1. How do you decide whether x1 or x2 is the "better" point? In both cases the total of the individual distances is the same, 2+0 = 1+1 = 2, so does that mean both points are equally good?
Let us consider the case of 1000 equations and two candidate points: the first gives an error of 1000 for the first equation and 0 for all the others, while the second gives an error of 1 for every equation. It becomes obvious that the first candidate is Probably Not Good for the first equation even though the others are perfect, while for the second candidate whatever error there is is well distributed, not favoring one equation over the others. If the equations are equally important, the solution with error 1 for each of them (total 1 * 1000 = 1000) is almost certainly better than the one that is exact everywhere except for an error of 1000 in the first equation. But how to quantify that? Answer: take the sum of squares of the errors: 1000^2 + 999*0^2 = 1000000, versus 1000*1^2 = 1000. The case in which the error is well distributed between the equations wins easily when you use sum-of-squares.
Sum-of-squares effectively minimizes the total distance to the solution: the point that is least "distance" from perfect is the one that wins. Euclidean distance is the sqrt() of the sum of squares, so the location that has the least sum of squares also has the least distance.
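The two-equation example above, in numbers:

```matlab
e1 = [2 0];      % residuals at the first candidate point
e2 = [1 1];      % residuals at the second candidate point
sum(abs(e1))     % 2  (plain totals cannot distinguish the two points)
sum(abs(e2))     % 2
sum(e1.^2)       % 4  (sum of squares prefers the balanced point)
sum(e2.^2)       % 2
```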
The Global Optimization toolbox is not necessary; it is just convenient.
Be careful: the first answer you get back from fmincon() is not necessarily the best possible solution. fmincon() is a local optimizer: its algorithms try to find the best solution "near" the starting point, and it gets stuck in local minima.
For example, if you have two "wells" beside each other, then once fmincon happens to probe inside one of the wells, it is not likely ever to try looking inside the other. When you look at a graph of the two-well system, it might seem obvious to you that it "should" have a look inside the second well, but it is not obvious to fmincon. Consider a very large shallow bowl with a well at some point in it, and a second well dug into the side of the bowl starting at a much higher elevation: fmincon cannot somehow immediately know where that second well even is. Or imagine an extinct volcano inside a lake: the lake has a lowest point somewhere, which is the "minimum" outside the old volcano, but there just might happen to be an old lava tube opening inside the volcano that leads deeper... possibly much deeper.
No fixed algorithm like fmincon can reliably find the global minimum of a complicated system. Algorithms that use randomness might be able to find global minima in the limiting case (run them for centuries).
So, do not trust the first answer from fmincon... not unless it just happens to solve your equations perfectly!
Instead, you should invoke fmincon several times with different starting points... and take the best of the solutions.
Also, there are a number of objective shapes for which fminsearch() is better at getting into the right general area, even though it might not be the best at pinning down the precise local solution.
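A sketch of that multi-start loop (assumptions: residue and the bounds as defined earlier in the thread, 20 restarts, and starting ranges picked arbitrarily for illustration):

```matlab
% Sketch: restart fmincon from random points and keep the best result.
lb = zeros(1,5);
ub = [Inf Inf Inf Inf 24];
best = Inf;
for k = 1:20
    x0 = [100*rand(1,4), 24*rand];   % random feasible starting point
    [x, fval] = fmincon(residue, x0, [],[],[],[], lb, ub);
    if fval < best
        best = fval;
        xbest = x;                   % best point found so far
    end
end
```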
Hi Walter,
Thanks. I get what you're saying. I'm getting vastly different solutions depending on my starting points so I kind of see why the Global Optimization Toolbox is useful. I ended up downloading the free trial for it, and have been reading about the different solvers.
Do you have a suggestion for which solver to use? I'm guessing it depends on how my objective function is set up? Right now, I currently have two methods: using one long function with all my variables in it, and one function that is a series of partial derivatives (the one I originally posted on this forum)
Here are the two different methods using code if that helps:
Method 1: minimizing one long function
function f = objectiveFcn(optimInput)
a1x = optimInput(1);
a1y = optimInput(2);
a2x = optimInput(3);
a2y = optimInput(4);
ix = optimInput(5);
iy = optimInput(6);
f = -(-0.001*a1x^3-0.001*a1y^3+2*log(a1x+a1y-14)-ix-iy+0.95*(-0.001*(1-ix/48)*a2x^3-0.001*(1-ix/48)*a2y^3)+2*log(a2x+a2y-14))^2;
end
Method 2: using the partial derivatives, then minimizing the residue
fun0=@(x)[-0.003*(x(1))^2+2*(x(1)+x(2)-14)^(-1);
-0.003*(x(2))^2+2*(x(1)+x(2)-14)^(-1);
0.95*((-0.003)*(1-x(5)/48)*(x(3))^2+2*(x(3)+x(4)-14)^(-1));
0.95*((-0.003)*(1-x(5)/48)*(x(4))^2+2*(x(3)+x(4)-14)^(-1));
-1+0.001/48*(x(3))^3+0.001/48*(x(4))^3];
residue = @(x)sum(fun0(x).^2)
Will both methods work? Or is one intrinsically better than the other? With method 1, I hit all of my upper bounds when I use surrogateopt and patternsearch (those two seemed the most appropriate). With method 2, I get a solution that doesn't make sense when I use surrogateopt, and the optimizer won't run when I use patternsearch.
Thanks again, I am so grateful for your help!
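One sanity check worth applying to either method (a sketch; residue is as defined above, and x stands for whichever candidate solution an optimizer returned): an exact root of the system drives the residual to zero, so a clearly nonzero value means the point is only a bounded compromise, not a solution of the equations.

```matlab
% Sketch: check whether a candidate point x actually solves the equations.
r = residue(x);    % sum of squared equation residuals at x
if r > 1e-8
    disp('x is only a least-squares compromise, not an exact root')
end
```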
I do not have any experience with the surrogate search.
I did a bit of work with patternsearch a few years ago, but never became a convert. If I recall correctly, I ended up taking a copy of the patternsearch code and modifying it so that, after an fmincon run, the current position could be updated from the fmincon result. (patternsearch permits you to run a local minimizer like fmincon every few generations, but at least at the time I was testing it, it only used the result to inform the error bounds, not to move the current point.) I did find some kinds of problems where my modified version worked well, with the patternsearch component providing the logic about "general areas" to search and the fmincon updates zooming in towards solutions much more efficiently.
Hi Walter, thanks for letting me know, I'll just do some trial and error then!
I know that John D'Errico has different opinions on optimization than I do, based upon the kinds of functions that he has had to optimize, but for the functions that I have had to optimize, the tools I tend to use are:
  • fmincon and related fixed-algorithm minimizers; these are efficient at finding local minima, but they get stuck in local minima and cannot get out of them
  • fminsearch: this is significantly better at getting out of local minima than fmincon, but it is not the best at pinning down the precise local minimum. Also, it can fail when there is an asymptotic descent from a central region that actually contains the minimum: it can end up chasing the ever-decreasing outer slope, moving away from the actual solution. That is a problem for minimizers in general.
  • ga() or particleswarm(): these are heuristic algorithms that use randomness as part of the search and do not require that the functions be continuous or have continuous derivatives. They can escape some kinds of local minima that the others cannot, but they can be quite slow. For example, in one Travelling Salesman test I did, over 100,000 iterations it was unable to notice the right pair of values to exchange to get the best possible answer. Simulated annealing also fits in this class, but I rarely use it.
  • tools such as MultiStart are useful for automatically trying a lot of different starting points for other algorithms. In practice I tend not to use them, running for-loops instead.
  • I wrote my own "brute force" tool based on fmincon and fminsearch. It is not fast, but afterwards I am left feeling more confident that I did not make any "obvious" mistakes in the search
I ended up using GlobalSearch, and I'm about 70% sure that it's giving me the correct answer... The only thing is that the video tutorial I watched seemed to include both the global minimum and the local minima in the solutions, whereas I've been getting just one solution, which I'm assuming is the global minimum.
I tried to do a cross-check by running fmincon from different initial points on the sum of squares as well, but that method brought me to a different solution. I plugged the solutions from GlobalSearch and from fmincon back into my function, and GlobalSearch gave the lower value for the minimum, which is why I think going with GlobalSearch is the right way. However, it is weird that the solution found by fmincon, which I'm assuming is a local minimum, isn't included in the results from GlobalSearch.
Anyways, I'll include my code in the answer post, and hopefully this will help someone else who is just as clueless as I am about MATLAB. I'd appreciate it if you took a look at it, but I'm already so grateful for the help you've given me!
Matt J on 15 Dec 2020 (edited)
And I've been getting just one solution, which I'm assuming is the global minimum.
You shouldn't assume that unless the least squares objective value given by the solution is zero. We know the global minimum for the least squares objective is zero because the solution in my answer achieves that. In fact, the solution in my answer is provably the unique global minimum, so unless your x(i) and mine nearly coincide, the GlobalSearch output could not be a global minimum.
However, it is weird that the solution found my fmincon, which I'm assuming would be a local minima, isn't included in the results from GlobalSearch.
GlobalSearch is just a tool for running the optimization multiple times with many different starting points. There is never any guarantee that it will find the global minimum, nor is there any guarantee that all local minima will be found.
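Incidentally, run can also return the full set of local solutions GlobalSearch retained, which makes that easy to inspect (a sketch; gs and problem as constructed earlier in the thread):

```matlab
% Sketch: request every local solution found during the GlobalSearch run.
[x, fval, exitflag, output, solutions] = run(gs, problem);
% 'solutions' is a vector of GlobalOptimSolution objects, each with .X and .Fval
arrayfun(@(s) s.Fval, solutions)   % objective value at each distinct local solution
```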
Hi Matt, thanks for letting me know to also check the least-squares objective value. Unfortunately, the answer I got using GlobalSearch can't be compared to the one you calculated, because I had to make a few adjustments; most importantly, the fifth equation no longer involves just x(3), since x(5) is also part of it. I used the code Walter gave me to form the sum of squares and tried to minimize it with fmincon. However, the solution generated by GlobalSearch is still better, despite not matching the solution obtained by minimizing the sum of squares.
Basically, I'm trying to maximize the function:
f= @(x)(-(-0.001*x(1)^3-0.001*x(2)^3+2*10*log(x(1)+x(2)-14)-x(5)^0.7-x(6)^0.7+0.95*(-0.001*(1-x(5)/40)*x(3)^3-0.001*(1-x(5)/40)*x(4)^3)+2*10*log(x(3)+x(4)-14)))
Using GlobalSearch, I use the code:
rng default % For reproducibility
gs = GlobalSearch;
min3 = @(x)(-(-0.001*x(1)^3-0.001*x(2)^3+2*10*log(x(1)+x(2)-14)-x(5)^0.7-x(6)^0.7+0.95*(-0.001*(1-x(5)/40)*x(3)^3-0.001*(1-x(5)/40)*x(4)^3)+2*10*log(x(3)+x(4)-14)));
problem = createOptimProblem('fmincon','x0',[10,10,10,10,24,24],...
'objective',min3,'lb',[0,0,0,0,0,0],'ub',[Inf,Inf,Inf,Inf,24,24]);
x3 = run(gs,problem)
Since I'm trying to maximize the original function, I put a negative sign in front of the whole function, since GlobalSearch finds the global minimum. This got me:
x3 =
17.6726 17.6730 23.2407 23.2405 24.0000 0.0000
Since x(1) and x(2) are close in value, I just made them equal to each other when I put them back in my original equation. Same for x(3) and x(4). I subbed these values back into my original function:
2*(-0.001*17.6727^3)+20*log(2*17.6727-14)-(24)^0.7-(0)^0.7+0.95*(2*(-0.001*(1-24/40)*23.2407^3)+20*log(2*23.2407-14))
and got: 97.5198
Using the method of minimizing least squares, I took the partial derivatives of my original function and turned them into a series of eqns in the function fun3:
fun3=@(x)[-0.003*(x(1))^2+20*(x(1)+x(2)-14)^(-1);
-0.003*(x(2))^2+20*(x(1)+x(2)-14)^(-1);
0.95*((-0.003)*(1-x(5)/40)*(x(3))^2+20*(x(3)+x(4)-14)^(-1));
0.95*((-0.003)*(1-x(5)/40)*(x(4))^2+20*(x(3)+x(4)-14)^(-1));
-0.7*x(5)^(-0.3)+0.001/40*(x(3))^3+0.001/40*(x(4))^3;
-0.7*x(6)^(-0.3)];
x0=[15 15 15 15 24 24]
x1=[Inf Inf Inf Inf 24 24]
residue = @(x)sum(fun3(x).^2)
I took the residual of fun3 and then tried to minimize it using fmincon in the Optimize task. I used x0 as my initial points, 0 as the lower bounds, and x1 as the upper bounds.
This got me the solution:
disp(solution)
17.6727 17.6727 19.1403 19.1403 10.0215 24.0000
I subbed these values back into my original function:
2*(-0.001*17.6727^3)+20*log(2*17.6727-14)-(10.0215)^0.7-(24)^0.7+0.95*(2*(-0.001*(1-24/40)*19.1403^3)+20*log(2*19.1403-14))
and got: 91.1827
So, I understand mathematically why I should be looking for the solution that minimizes the sum of squares, but I don't understand why minimizing the residual gives me a different solution from GlobalSearch, and why the GlobalSearch solution ultimately gave a higher value of my original function. Am I doing something wrong with the setup? Sorry this was a rather long comment! I just really want to be able to trust my solutions so that I can do some analysis from them!
Matt J on 15 Dec 2020 (edited)
Setting partial derivatives to zero is only a valid optimality condition for unconstrained problems. It is no longer valid once you impose bounds. So, GlobalSearch is more successful than your fun3 formulation for two reasons. First, it is guiding its search with proper optimality conditions. Second, it has mechanisms which help it avoid local minima that fmincon alone does not.
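A tiny illustration of that point (a sketch; f is a made-up one-variable example, not from the thread): once a bound is active, the minimizer no longer sits where the derivative vanishes.

```matlab
f  = @(x) x.^2;                           % unconstrained minimum at x = 0, where f'(0) = 0
xb = fmincon(f, 2, [],[],[],[], 1, []);   % impose the bound x >= 1
% xb is approximately 1, yet f'(xb) = 2*xb is about 2 there,
% so the gradient is nonzero at the constrained optimum
```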
Okay, thank you so much for clearing that up! I feel a lot better about using GlobalSearch now!

