Is there a limit to the amount of data MBC toolbox can handle?
I'm currently creating a model of an engine with 9 inputs using the Model-Based Calibration Toolbox. To get a good fit I've included 10,000 data points. The model has been showing "Building response model..." for several hours. Is there an upper limit to the amount of data the toolbox can handle, or will the fit eventually converge? I don't mind letting it run for days as long as I get a good fit out of it at the end!
Thank you
3 Comments
Ian Noell
on 30 Mar 2016
Hi Nicholas,
Which version of MATLAB and which MBC model type are you using? I would expect MBC to be able to handle a dataset of this size, but the fit time will depend on the model type, the fit options you are using, and the amount of memory you have.
In particular, there are some options for Gaussian Process Models (available from R2015b) for fitting larger datasets that could be used to improve the fitting time. The default GPM fit algorithm changes when the dataset has more than 2000 points. You can find details about this in the documentation for fitrgp. I can also make suggestions about fitting other model types such as RBFs if needed.
Let me know if you have any further questions,
Ian
Nicholas
on 31 Mar 2016
Ian Noell
on 31 Mar 2016
Hi Nicholas,
You need R2015b to use GPM. If you use RBFs, you can try the Advanced button on the Model Setup dialog. There is help for the advanced options at:
Some useful options include:
Maximum number of centers: min(nObs/3,1000)
Percentage of data to be used as centers: min(100,(2000/nObs)*100)
For a dataset of 10000 these defaults result in 2000 points selected at random being considered as centers from which 1000 centers will be chosen. You could try reducing the percentage to be equivalent to 1000 points: min(100,(1000/nObs)*100).
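To make the arithmetic concrete, here is a minimal Python sketch that evaluates the two default expressions quoted above. The function names are hypothetical and are not part of the toolbox. The sketch only reproduces the formulas as written in this comment, and it assumes the number of candidate centers is simply that percentage of nObs.

```python
def rbf_center_defaults(n_obs):
    """Evaluate the two default expressions quoted above for n_obs data points.

    Hypothetical helper for illustration only; not an MBC toolbox function.
    """
    # Maximum number of centers: min(nObs/3, 1000)
    max_centers = min(n_obs / 3, 1000)
    # Percentage of data to be used as candidate centers: min(100, (2000/nObs)*100)
    pct_candidates = min(100, (2000 / n_obs) * 100)
    # Assumption: the candidate count is this percentage of the data
    n_candidates = n_obs * pct_candidates / 100
    return max_centers, pct_candidates, n_candidates


def pct_for_candidates(n_obs, target):
    """Percentage setting that corresponds to roughly `target` candidate centers."""
    return min(100, (target / n_obs) * 100)


max_centers, pct, n_candidates = rbf_center_defaults(10000)
print(max_centers, pct, n_candidates)
print(pct_for_candidates(10000, 1000))
```

For nObs = 10000 this gives a maximum of 1000 centers chosen from about 2000 candidates (20% of the data). Limiting the candidates to about 1000 points corresponds to a setting of 10%.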
Other options that you could explore are reducing the number of trials, reducing the number of zooms, or changing the lambda algorithm directly.
Feel free to message me directly if you want more advice on this.
Ian
Answers (2)
Ian Noell
on 15 Apr 2016
2 votes
After discussing this question offline with Nicholas, we identified that fitting convex hull boundary models was taking a very long time with large data sets and a large number of inputs. A convex hull boundary model is fitted by default from R2014b onwards. You can uncheck the Fit boundary model option in the Fit Models dialog or wizard.
In R2015b we changed the default boundary model to be pairwise convex hulls when there are more than 10 inputs or more than 2000 data points. The R2015b release notes provide details.
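As a rough illustration of that rule, the following Python sketch encodes the thresholds stated above. The function name is hypothetical, and the fallback to a full convex hull below the thresholds is an assumption inferred from the R2014b default described earlier, not toolbox code.

```python
def default_boundary_model(n_inputs, n_points):
    """Sketch of the R2015b default boundary model rule as described in this answer.

    Hypothetical helper for illustration only; not an MBC toolbox function.
    """
    # Pairwise convex hulls when there are more than 10 inputs
    # or more than 2000 data points
    if n_inputs > 10 or n_points > 2000:
        return "pairwise convex hull"
    # Assumption: otherwise the earlier default (a full convex hull) still applies
    return "convex hull"


# The dataset from the question: 9 inputs, 10000 data points
print(default_boundary_model(9, 10000))
```

Under this rule, the dataset from the question (9 inputs, 10000 points) would get pairwise convex hulls by default in R2015b because it exceeds the 2000-point threshold.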