After about an hour and a half of waiting, the MATFILE command returned an object without any memory errors on my Linux machine. Now when I try to access individual elements in the MAT file, it appears that the whole file has to be loaded into memory again, even for small informational data fields in the file. I am going to need a different approach. Unless someone can suggest a workaround, I am going to have to abandon using the MATFILE command.
accessing large MAT file
I am trying to access data stored in a large MAT file. The file is 72 GB of Simulink simulation data.
Now, obviously I cannot use the LOAD command on my laptop with 16 GB of RAM. I thought the reason MathWorks provides the MATFILE command was to allow for accessing large MAT files without loading them.
But that doesn't seem to be the case.
When I attempt to access the file using the MATFILE command, Matlab behaves as if it were loading all that data into memory. My memory utilization goes to 98%, I get an out of memory error, and then Matlab silently crashes and exits.
So I went back to the big Linux machine that I used to run Simulink and create this file, and ran the MATFILE command there. And indeed it looks like Matlab is loading the whole file into RAM. I am hoping to divide the file up there into separate MAT files, but it is taking a really long time to load this data, and it is also using all available RAM.
Which leads to my questions: What is the MATFILE command doing? Is this expected behavior? Am I stuck rerunning my simulations and putting all results into separate MAT files? How are truly huge datasets stored and manipulated in Matlab? Evidently it is not with MAT files...
Thanks.
Accepted Answer
Sara Nadeau
on 11 Nov 2019
I believe you are having trouble with the matfile function because of the format of the logged data.
If you logged the data in Simulink using Dataset format (default format for several releases), you can create Simulink.SimulationData.DatasetRef objects that reference the data in the file without loading it into memory. To access and manipulate data for individual signals, you can create matlab.io.datastore.SimulationDatastore objects.
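As a rough sketch of that workflow (not tested against your file): the file name 'simResults.mat' and the variable name 'logsout' below are placeholders I made up, so substitute your own. It also assumes the file is in Version 7.3 format and that the first element is a plain (non-bus) signal, so that its Values property is a SimulationDatastore.

```matlab
% Placeholder names (assumptions): replace with your MAT file and the
% name of the Dataset variable stored in it.
matFileName = 'simResults.mat';
datasetVar  = 'logsout';

% Create a reference to the Dataset. The logged data stays on disk.
dsRef = Simulink.SimulationData.DatasetRef(matFileName, datasetVar);

% List the names of the elements (signals) in the Dataset.
disp(dsRef.getElementNames);

% Brace indexing returns one element. For a plain signal, its Values
% property is a matlab.io.datastore.SimulationDatastore object.
sig = dsRef{1};
ds  = sig.Values;

% Read the signal in chunks instead of all at once.
ds.ReadSize = 1000;            % number of samples per read
numSamples  = 0;
while hasdata(ds)
    chunk = read(ds);          % timetable holding up to ReadSize samples
    numSamples = numSamples + height(chunk);
    % ... process or save the chunk here ...
end
fprintf('Read %d samples from element 1.\n', numSamples);
```

The same loop could be used to write each signal out to its own smaller MAT file, which is what you were trying to do on the Linux machine.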
These additional topics may be helpful for guiding you through creating and using DatasetRef and SimulationDatastore objects:
- Load Big Data for Simulation (illustrates steps for creating objects)
- Analyze Big Data from a Simulation
I hope this helps!
More Answers (1)
Guillaume
on 11 Nov 2019
See the limitations section of matfile to see what it can and can't do. In particular, the granularity of matfile is typically at the variable level. I.e., you can select which variables to load; however, apart from numerical matrices, if you load a variable you load all of it.
It's unclear what's in your mat file but it sounds like it's objects, perhaps just one object, in which case you won't benefit much from matfile.
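To illustrate the difference in granularity, here is a small self-contained sketch. It writes a tiny temporary Version 7.3 file (the variable names A and S are made up for the example) and then reads it back through matfile.

```matlab
% Build a small demo file: one numeric matrix and one struct.
fname = [tempname '.mat'];
A = reshape(1:2000, 200, 10);
S = struct('info', 'run 1', 'values', rand(1, 5));
save(fname, 'A', 'S', '-v7.3');

% Inspect names, sizes and classes without loading any data.
varInfo = whos('-file', fname);
disp({varInfo.name});

m = matfile(fname);

% Numeric matrix: the size query and the indexed read touch only what
% is requested, not the whole variable.
szA   = size(m, 'A');
block = m.A(1:10, 1:5);

% Struct (the same applies to objects): the whole variable is loaded,
% even if you only want one small field.
s    = m.S;
note = s.info;

delete(fname);
```

With a 72 GB file that holds one big object, the last pattern is the one you end up in, which would explain the memory usage you are seeing.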
2 Comments
Guillaume
on 11 Nov 2019
Unfortunately, it's not designed for objects; it's designed for accessing large numerical matrices.
Since you have such a large mat file, I assume you're using the 7.3 format. This format is based on HDF5, which you can read using various functions. I've no idea if that would make reading the file easier, and you'd have to figure out the data structure yourself, as MathWorks does not document their format.
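If you want to try the HDF5 route, a minimal sketch with MATLAB's own h5info and h5read could look like this. It uses a tiny temporary file with a plain numeric matrix, and it assumes the matrix shows up as a dataset named '/A' at the root of the file. Since the layout is undocumented, treat that as an observation rather than a guarantee; objects and structs are stored in a more involved way that you would have to explore yourself.

```matlab
% Build a small Version 7.3 demo file with one numeric matrix.
fname = [tempname '.mat'];
A = reshape(1:2000, 200, 10);
save(fname, 'A', '-v7.3');

% Explore the HDF5 structure: top-level datasets and groups.
info = h5info(fname);
disp({info.Datasets.Name});

% Read only a 10-by-5 block, starting at element (1,1), from the
% dataset assumed to hold A.
block = h5read(fname, '/A', [1 1], [10 5]);

delete(fname);
```

On your real file, h5disp(filename) prints the whole hierarchy, which is probably the first thing to look at.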