Nonlinear optimization / fmincon algorithm mathematics-level question

I am building an fmincon-style solver. It needs to be able to minimize symbolic objective expressions subject to nonlinear equality and inequality constraints.
My solver uses a fairly basic Newton-Raphson gradient-descent active-set algorithm to solve a set of KKT conditions.
In fmincon, nonlcon is the parameter used for the nonlinear constraints. So far I have only implemented support for linear and quadratic constraints. If I want to move on to nonlinear constraints, is it okay to assume that a gradient-descent algorithm can handle nonlinear expressions? In other words, can I just add nonlinear equality constraints to the equality-constraints vector that I already have for linear/quadratic constraints, and can I likewise add nonlinear inequality constraints to the existing inequality-constraints vector?
I hope to get some insights here. I have tried it on some examples and so far it seems to work, but I am not sure whether it holds up for all nonlinear expressions.
If anyone needs more information to help me, feel free to ask and I will try to answer as precisely as possible. Thank you.

Accepted Answer

Walter Roberson on 13 Feb 2021
You need different handling for nonlinear constraints.
For linear equality or inequality constraints, you can use the information from the numerically encoded constraints to decide whether a proposed position would be within range and, if not, use the linear information to "back off" to a point that would be in range or on the boundary. You do not need to evaluate the objective function at the new position or evaluate any function at the new position: you can calculate the boundaries fairly directly.
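As a rough illustration of that "back off" step, here is a minimal MATLAB sketch; the names x0, step, A, and b are hypothetical, x0 is assumed to be a feasible column vector, and the constraints are taken as A*x <= b:
% Shrink a proposed step so that x0 + t*step stays inside A*x <= b.
t = 1;                        % try the full step first
viol = A*(x0 + step) - b;     % positive entries mark violated rows
if any(viol > 0)
    denom = A*step;           % how quickly each constraint is approached
    idx = viol > 0 & denom > 0;
    % Row i is crossed at t = (b(i) - A(i,:)*x0) / (A(i,:)*step); take the smallest.
    t = min((b(idx) - A(idx,:)*x0) ./ denom(idx));
end
x1 = x0 + t*step;             % lands on or inside the nearest linear boundary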
For nonlinear constraints, the constraint function is a "black box": you pass it a proposed position and it replies "yes" or "no", but it does not give you enough information to predict where the boundary is. Provided that the nonlinear constraints were well coded, you might be able to use a gradient estimate from them to figure out an approximate direction to the boundary, but finding the boundary can be hard. That is especially difficult for nonlinear equality constraints: it can be difficult to find any points in the valid region.
  3 Comments
Walter Roberson on 14 Feb 2021
No, not exactly.
Suppose you are at point X0 and you have a proposal to use point X1, where X1 has been deduced from an analysis of the local slope and the current step size. You could go ahead and evaluate the objective function F(X1), but F might be expensive -- for example it might require connecting over the internet to a remote program for a 10-minute computation, and you might be charged money for that. So you do not want to invoke F unless you know that X1 is within the boundaries.
If you have linear inequalities A, b, you can calculate A*X1.' <= b, and with just a few calculations, without invoking F, you can validate that X1 is or is not within the linear bounds.
If it is not within bounds: if rank(A) == length(X1) then you can use A\b to find exactly where the bounds intersect (and that only has to be done once, since it does not change). If rank(A) < length(X1) then you can possibly, carefully, use null() to find linear combinations toward the boundaries. If A has more rows than length(X1) (more constraints than variables) then A\b gives only a least-squares answer, so you can find approximate boundaries but would need to verify any one proposed X1.
These techniques allow you to reduce the number of calls to F(X1).
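For example, a guard like the following sketch (F, A, b, and X1 are placeholders, with X1 a row vector as in the expression above) skips the expensive objective call entirely whenever the linear constraints already rule the point out:
% Only pay for the expensive objective F when the cheap linear test passes.
tol = 1e-10;                      % small slack for round-off
if all(A*X1.' <= b + tol)
    fval = F(X1);                 % expensive call, made only when needed
else
    fval = Inf;                   % mark the point as rejected without calling F
end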
If you have nonlinear constraints NON(x) then... you don't know how expensive NON(x) is. Maybe it is just simple mathematical calculation that is fairly fast. But maybe NON(x) has to call out to that same expensive external program. Maybe it only has to call out sometimes (so probing its execution speed with a few test points might not tell you the actual cost). Maybe it is even more expensive than F(x) itself. So you would prefer not to call NON(x) if you do not have to.
When you do call NON(X1) then it gives you a pass/fail. It is "legal" to code things like
c(1) = x(3).^2 < 7.281
which, under the c(x) <= 0 convention, oddly enough encodes the constraint x(3).^2 >= 7.281. You might only get 0 (false) and 1 (true) results, which give you no information about where it would be valid to look.
But perhaps the user instead coded
c(1) = x(3).^2 - 7.281
which would be non-positive if x(3).^2 <= 7.281 and positive if x(3).^2 > 7.281, and how positive or negative it is gives you a hint about which direction to head to reach the boundary. With some other trial points you can estimate the gradient. You might not bother to estimate the gradient when all(c(:) <= 0), only when any(c(:) > 0); but when you do, the c values help you figure out where the boundary is. In the situation where NON(x) does not invoke any expensive calculations, this can be preferable to invoking F(x). And remember that even if you do invoke F(x) and it returns a finite value, you will likely need to invoke NON(x) anyway to validate the position before accepting the value of F(x) as contributing to the gradient information.
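If you are willing to spend a few extra calls to NON, one way to turn those signed c values into a direction is a finite-difference estimate of the gradient of the most-violated constraint. The following is only a sketch; NON, x, and the step size h are placeholders, and x is assumed to be a column vector:
% Estimate the gradient of the most-violated nonlinear inequality at x.
c0 = NON(x);                       % column of constraint values, feasible when <= 0
if any(c0 > 0)
    [cmax, k] = max(c0);           % index of the most-violated constraint
    h = 1e-6;
    g = zeros(numel(x), 1);
    for j = 1:numel(x)
        xp = x;  xp(j) = xp(j) + h;
        cj = NON(xp);
        g(j) = (cj(k) - cmax) / h; % finite-difference estimate of d c_k / d x_j
    end
    if g.'*g > 0
        x_try = x - (cmax/(g.'*g))*g;  % Newton-like step back toward c_k(x) = 0
    end
end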
You do not know ahead of time how the user has coded NON(x): it might be fast and useful, it might be slow and not very useful, it might depend upon the exact location...
You would not bother to evaluate the nonlinear constraints for any point that fails the linear constraints: the linear constraints are much more likely to be fast and cheap than the nonlinear constraints are. It isn't that you can't, it is that there is unlikely to be any benefit to doing so.
If you do not have any linear constraints... then you just go ahead and call the nonlinear constraints. But the more your code can reason about the non-linear constraints without actually invoking the nonlinear constraint function, the less you have to worry about the possibility that the non-linear constraint function might be expensive (or not well coded.)
If you have nonlinear equalities then it becomes especially important to reason about them.
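For what it is worth, one common way to pull a nearby point back onto ceq(x) = 0 is a Gauss-Newton style restoration step. The sketch below is one iteration of that idea, with CEQ, x (a column vector), and h as placeholders and the Jacobian estimated by finite differences:
% One restoration step toward satisfying the nonlinear equalities CEQ(x) = 0.
r = CEQ(x);                        % residuals of the equality constraints
h = 1e-6;
J = zeros(numel(r), numel(x));     % finite-difference Jacobian of CEQ
for j = 1:numel(x)
    xp = x;  xp(j) = xp(j) + h;
    J(:, j) = (CEQ(xp) - r) / h;
end
x_new = x - pinv(J)*r;             % minimum-norm correction; repeat until norm(CEQ(x_new)) is small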
The flip side of all of this, since you are writing your own functions, is that you could potentially offer your own nonlinear constraints system that did not involve invoking a "black-box" function. For example it would be valid to support only polynomial (well, multinomial) constraints, as those could be reasoned about and you would not have to worry about expensive external software being called.
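Since the solver in question already works on symbolic expressions, one way to avoid the black-box problem entirely is to accept the constraints themselves as symbolic expressions and differentiate them exactly. This is only a sketch, assuming the Symbolic Math Toolbox is available; the particular constraint is just an example:
% Accept a symbolic constraint and derive its gradient exactly, so no
% black-box constraint function has to be probed numerically.
syms x1 x2
c = x1^2 + 2*x1*x2 - 3;                        % example multinomial constraint, c <= 0
gradc = gradient(c, [x1, x2]);                 % exact symbolic gradient
cFun = matlabFunction(c, 'Vars', {[x1, x2]});  % fast numeric handle for c
gFun = matlabFunction(gradc, 'Vars', {[x1, x2]});
% cFun([1, 0.5]) and gFun([1, 0.5]) now evaluate the constraint and its
% gradient cheaply, and both can feed the same active-set machinery.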
I hope this makes more clear that you cannot "just add nonlinear equality constraints to the equality constraints vector" -- they have very different representations and different handling is needed for them.
D zepp on 17 Feb 2021
Thank you. This greatly improved my understanding. I adjusted my algorithm and it seems to work more or less; mostly just some runtime issues remain.
