Genetic Algorithm convergence problem: big range of best-fit squared errors

Dear Matlab experts,
I have a model of ODEs: 15 equations, 20 parameters and about 100 experimental data points.
I wanted to fit my model to the experimental data to find the minimum squared error between the data and the model.
To find the global minimum, I am trying to use ga (the genetic algorithm), but I think there is some issue with convergence.
For example, if I run my fitting procedure many times with exactly the same settings, I get a distribution of final (best-fit) squared errors ranging from 0.001 up to 15.
I assume the algorithm didn't converge in the cases with large SE.
At the same time, I realised that when I change the GA settings the situation improves, but there is still a big spread in the obtained SEs.
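For concreteness, the repeated-run experiment I am doing looks roughly like this (sseObjective, nParams, lb and ub are placeholders for my actual ODE-fitting objective and its bounds):

```matlab
% Run ga repeatedly with identical settings and collect the best SSE each time
nRuns = 20;
bestSSE = zeros(nRuns, 1);
opts = optimoptions('ga', 'Display', 'off');   % same settings every run
for k = 1:nRuns
    [~, bestSSE(k)] = ga(sseObjective, nParams, [], [], [], [], lb, ub, [], opts);
end
% The spread of bestSSE is what worries me (roughly 0.001 up to 15)
fprintf('min SSE %.4g, max SSE %.4g\n', min(bestSSE), max(bestSSE));
```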
Therefore, I was wondering:
1. Is there an optimal set of GA optimisation settings (population size, function tolerance, etc.) that works for any optimisation problem? Or should the settings be fine-tuned every time for each particular task?
2. Is there any strategy to fine-tune the search settings in GA?
3. What are the main settings to look for?
Your suggestions would be greatly appreciated.
Many thanks,
Ildar

Answers (1)

John D'Errico
John D'Errico on 11 May 2020
Is there any optimal set? No. If there were some optimal set of algorithmic parameters, do you think there is any reason they would not have been set as the defaults? I can see them discussing things now - Let's let people guess! We will make them work for the right answer here, when we know the optimal way to drive this code. Be serious.
Is there any magical strategy to ensure you get better answers? Not really.
Essentially, you need to force the algorithm to search longer, search harder, search broader. This improves the chances it will stumble on the best solution. If you know of constraints or bounds that should be pertinent, THEN USE THEM! You need to reduce the search space as much as possible.
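As an illustration, here is one hedged way to push ga to search longer, harder and broader in MATLAB. The option names come from the Global Optimization Toolbox; the values are placeholders you would tune, and sseObjective, nParams, lb and ub stand in for your own problem:

```matlab
opts = optimoptions('ga', ...
    'PopulationSize',      200, ...   % broader: more individuals per generation
    'MaxGenerations',      2000, ...  % longer: allow many more generations
    'MaxStallGenerations', 100, ...   % harder: don't give up after a short stall
    'FunctionTolerance',   1e-8, ...
    'UseParallel',         true);     % if Parallel Computing Toolbox is available
% lb and ub encode the bounds you actually know -- tightening them shrinks
% the search space, which matters more than any single option above.
[x, fval] = ga(sseObjective, nParams, [], [], [], [], lb, ub, [], opts);
```

None of this guarantees the global minimum; it only improves the odds at the cost of more function evaluations.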
But GA is just a stochastic search, one based on a genetic metaphor. As such, it is one we hope might work well, because genetics has done such interesting and useful stuff over billions of years. But there is no absolute assurance here. In fact, you might notice that the original genetic algorithm has not yet converged. Some might even argue we are the consequence of an evolutionary mistake - a locally suboptimal solution that may one day be replaced by a better one. :-)
Regardless, 100 data points, with 20 parameters? That is probably pushing the limits, since that is a huge search space to sift through. You may have barely enough useful information content in those 100 points to push the algorithm into the solution you want to see. And that is probably why you are seeing poor convergence behavior.
  1 Comment
Alan Weiss
Alan Weiss on 12 May 2020
Of course, what John said is absolutely correct. There are other approaches besides ga that have the potential to save you a lot of time, or equivalently, to allow far more searches in the same time as ga. I would try a combination of MultiStart and lsqcurvefit or lsqnonlin, as explained in MultiStart Using lsqcurvefit or lsqnonlin. But with this many dimensions and so few data points, well, as John said, you have set yourself a difficult task.
Alan Weiss
MATLAB mathematical toolbox documentation
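A minimal sketch of the MultiStart + lsqnonlin approach, assuming residualFun returns the vector of residuals (model minus data, NOT their sum of squares) and x0, lb, ub are placeholders for the problem at hand:

```matlab
opts = optimoptions('lsqnonlin', 'Display', 'off');
problem = createOptimProblem('lsqnonlin', ...
    'objective', residualFun, ...
    'x0', x0, 'lb', lb, 'ub', ub, 'options', opts);
ms = MultiStart('UseParallel', true, 'Display', 'iter');
[xBest, sseBest] = run(ms, problem, 100);   % 100 random start points
```

Because lsqnonlin exploits the least-squares structure, each local solve is typically far cheaper than a ga run, so many more start points can be tried in the same wall-clock time.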

