Reading variables from a big MATLAB table

Hello there. We have about 45 big MATLAB tables (average size 12 GB each). Please note that these are .mat files. We will be running a script with a for loop that reads only a couple of rows from each of these tables and does some simple computation on the data obtained. Our problem is that MATLAB R2017a takes about 25 minutes on average to load each table into the workspace, which is a lot of time considering that we only want to extract a couple of rows from each table. As far as we understand, matfile does not work with tables (is this correct?), so we cannot use it to extract variables from the file without loading the whole table. What are our options here, if any? Thank you in advance.
  1 Comment
João Nunes on 28 Mar 2018
Hi there, I am also interested in knowing the answer to this question, if anybody can help. In my project I need to handle very large tables as well, but loading them takes too much time and I only need to access two or three variables of the entire table. I also tried matfile('file_name.mat'), but without success... Is there any other solution that I do not know about?

Accepted Answer

Guillaume on 28 Mar 2018
Yes, matfile won't help since it can't partially load a table.
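To illustrate the limitation, here is a minimal sketch using a tiny stand-in file (the file name and the variables A and T are made up for the example). It assumes, as stated above, that matfile can index into a plain numeric variable but not into a table; the try/catch only records what happens on your release instead of asserting it.

```matlab
% Tiny stand-in data; the real tables are far larger.
f = [tempname '.mat'];           % hypothetical temporary file
A = magic(6);                    % plain numeric array
T = array2table(A);              % same data stored as a table
save(f, 'A', 'T', '-v7.3');      % v7.3 is the format that allows partial loading

m = matfile(f);
rows = m.A(2:3, :);              % partial read of a numeric variable

% Indexing into a table through matfile is expected to fail,
% so record the outcome rather than assume it.
try
    part = m.T(2:3, :);
    tableIndexingWorked = true;
catch
    tableIndexingWorked = false;
end

delete(f);                       % clean up the temporary file
```

If tableIndexingWorked comes back false, the only way to get at T through this file is to load the whole variable.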
Unfortunately, as things stand, there is no workaround that I'm aware of. You could always contact MathWorks support directly to see if they have a solution.
If the data is to be accessed more than once, then what I'd do is convert all these tables to tall tables, then save them with the write function. As far as I know, there is no way to convert a regular mat file into one compatible with tall arrays, so you'd have to take the performance hit once, when you load the tables for the first time as regular tables. Subsequent loads would let you efficiently load only the data you require, since you'd be using tall arrays.
  2 Comments
Guillaume on 28 Mar 2018
Paramonte's comment, mistakenly posted as an answer, moved here:
Thanks for the replies. I have looked at tall tables, using the datastore command, but if I am not wrong datastore does not deal with mat files. Only csv, image, txt, etc. seem to be supported, not mat files. Is this correct? Regards
Guillaume on 28 Mar 2018
As said, you would have to:
  • load your tables once, as standard tables. As far as I know there's no way around that, so this is going to take a long time.
  • convert each standard table into a tall table with the tall function.
  • save the tall table with the write function. This saves the tall array in a different kind of mat file (not suitable for load).
At this point, you can reopen the new mat files as a TallDatastore, which will give you faster access to parts of the tables. But you'll have to go through these 3 initial steps first.
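The three steps, plus reopening the result, could look roughly like the sketch below. It is only an illustration on a small stand-in table: the file name, the output folder and the variable names id and value are assumptions, not taken from the original data.

```matlab
% Small stand-in for one of the large tables.
srcFile = [tempname '.mat'];     % hypothetical source mat file
outDir  = tempname;              % hypothetical output folder (does not exist yet)
T = table((1:30)', rand(30, 1), 'VariableNames', {'id', 'value'});
save(srcFile, 'T', '-v7.3');

% Step 1: load the whole table once (the slow step on the real files).
S = load(srcFile, 'T');

% Step 2: convert the in-memory table to a tall table.
tT = tall(S.T);

% Step 3: write it out in the tall-compatible format.
write(outDir, tT);

% Later sessions: reopen the folder as a TallDatastore and
% pull out only the rows of interest.
ds  = datastore(outDir, 'Type', 'tall');
tT2 = tall(ds);
sel = gather(tT2(tT2.id == 5, :));   % evaluated only when gather is called

delete(srcFile);                 % clean up the temporary source file
```

Only the call to gather brings data into memory, so the row filter is applied while reading rather than after loading the full table.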

More Answers (1)

Paramonte on 28 Mar 2018
Guillaume, thank you.
I have gone through all the steps you mentioned. Creating the tall data with t_tall=tall(T) was time consuming.
Our table T has 30 entries (rows), so after doing the write('c\mydir',t_tall) I ended up with 30 separate mat files named sequentially array_r1_0000x..., and all the table row names were lost. This is just too messy and could lead to chaos if you have about 300 tables to sort out.
It looks like MathWorks' concept of big data is processing Excel files with numerous entries and the like. They don't seem to support their very own file format: the *.mat file format!
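One possible way around the lost row names, not confirmed anywhere in this thread, is to copy them into an ordinary table variable before converting to tall, so they are written out as data. The sketch below uses a made-up three-row table, and the column name rowLabel is hypothetical.

```matlab
% Made-up table with row names, standing in for the real data.
T = table([1; 2; 3], [10; 20; 30], ...
    'VariableNames', {'a', 'b'}, ...
    'RowNames', {'r1', 'r2', 'r3'});

% Keep the row names as a regular column, then drop the property.
T.rowLabel = T.Properties.RowNames;   % hypothetical column name
T.Properties.RowNames = {};

outDir = tempname;                    % hypothetical output folder
write(outDir, tall(T));

% Read back and select a row by its former row name.
tT  = tall(datastore(outDir, 'Type', 'tall'));
row = gather(tT(strcmp(tT.rowLabel, 'r2'), :));
```

This does not reduce the number of files that write produces, but each row can still be found by its label after reading the folder back.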
