How to save .mat files on the disk after each iteration?

I am having difficulty saving my data as a struct after each loop iteration.
I have a huge image matrix and its corresponding label matrix. I want to save them to a folder after each iteration and then delete them from the workspace. Any suggestions?
Please note that I want to save the data to disk and clear it from the workspace after each iteration.

 Accepted Answer

The values will be over-written in each iteration, so you don't need to clear them.
for k = 1:number_of_iterations
    % Get images and labels for this iteration.
    % Save them as MAT files; the variable names must be given as strings or char vectors.
    % As you mentioned that each iteration produces a large amount of data,
    % use MAT-file version 7.3.
    save("images"+k+".mat", "images", "-v7.3")
    save("labels"+k+".mat", "labels", "-v7.3")
end

23 Comments

@Dyuman Joshi how to create a folder with a specific name to save them into it?
"how to create a folder with a specific name to save them into it? "
@Dyuman Joshi also I read that the Variables Should Not Be Named Dynamically, instead we have to use structure to save them, Do you have any idea about this matter please?
M on 14 Nov 2023
Edited: M on 14 Nov 2023
@Stephen23 I was reading your TUTORIAL: Why Variables Should Not Be Named Dynamically
Will the above solution be efficient in case of huge dataset?
I assume you want to make a new folder in the current directory -
%Make a new folder
mkdir FolderName
%Change the current directory to the new folder
cd FolderName
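As an alternative to changing the current directory, the folder can be passed to SAVE through FULLFILE. The following is only a sketch under assumptions: the folder name `iteration_data`, its location in `tempdir`, and the small random arrays are hypothetical stand-ins for the real data.

```matlab
% Sketch: write each iteration's data into a dedicated folder without using cd.
outDir = fullfile(tempdir, 'iteration_data');   % hypothetical output folder
if ~isfolder(outDir)
    mkdir(outDir);
end
for k = 1:3
    images = rand(4,4,1,5);    % small stand-in for the real image array
    labels = randi(3,1,5);     % small stand-in for the real labels
    % Zero-padded iteration number keeps the files in order when listed.
    save(fullfile(outDir, sprintf('images_%02d.mat', k)), 'images');
    save(fullfile(outDir, sprintf('labels_%02d.mat', k)), 'labels');
end
```

The variable names stay the same in every file; only the file name carries the iteration number.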
"also I read that the Variables Should Not Be Named Dynamically, instead we have to use structure to save them"
Yes, it's best to not store them dynamically.
"Do you have any idea about this matter please?"
What do you want to do with the saved data?
Do you want to do some Image Processing with the image data?
"I was reading your TUTORIAL: Why Variables Should Not Be Named Dynamically"
It is unclear how that relates to your question. Are you trying to SAVE some MAT files each with unique variable names? That would be ... bad data design.
Much better: simply use the same variable name (and SAVE any meta-data together in the same file).
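One way to read this advice is sketched below: every file holds variables with identical names, plus some meta-data. The folder name `run_data`, the meta-data fields `iteration` and `created`, and the small arrays are assumptions made for illustration only.

```matlab
% Sketch: same variable names in every MAT file, meta-data saved alongside.
outDir = fullfile(tempdir, 'run_data');          % hypothetical output folder
if ~isfolder(outDir)
    mkdir(outDir);
end
for k = 1:3
    images    = rand(4,4,1,5);     % stand-in data
    labels    = randi(3,1,5);
    iteration = k;                 % hypothetical meta-data
    created   = datetime('now');   % hypothetical meta-data
    % Add '-v7.3' here if any single variable exceeds 2 GB.
    save(fullfile(outDir, sprintf('data_%02d.mat', k)), ...
        'images', 'labels', 'iteration', 'created');
end
S = load(fullfile(outDir, 'data_02.mat'));       % fields have fixed, known names
disp(S.iteration)
```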
"Will the above solution be efficient in case of huge dataset?"
Some users asking questions on this forum think that their 1000x1000 matrices are "huge". Some people think that their 20 Terabyte datasets are "huge". You can read here what different MAT file versions support:
Note v7.3 uses data compression which of course takes more time than no data compression.
"...and then delete them after each iteration from workspace, any suggestion? "
In most cases CLEARing (and other attempts by users to micro-manage MATLAB's memory management) is counter-productive and inefficient. Dyuman Joshi is most likely correct to advise that you do not need to clear variables. But ... it really does depend on what you are doing and how much memory you have.
M on 14 Nov 2023
Edited: M on 14 Nov 2023
@Dyuman Joshi yes I want to use them in some machine learning techniques after that,
After saving the data I want to use ImageDataStore and tall arrays to deal with data in CNN and some other techniques
If you want to use imageDatastore, why store the data as .mat files?
You can write the image data as images and use the images accordingly.
M on 14 Nov 2023
Edited: M on 14 Nov 2023
@Stephen23 By huge I mean that in each iteration I generate images of size around 120*120*1*7000 (double) and labels of size 1*7000. I have 30 iterations, and I want to repeat that for 15 runs.
I still can't find a good solution for dealing with this data. Some suggest imageDatastore, which is why I want to save the data to disk first and then try imds and tall arrays.
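For scale, the sizes quoted above can be turned into a rough storage estimate. This is only back-of-the-envelope arithmetic for uncompressed double data; it ignores the labels (negligible) and any MAT-file compression.

```matlab
% Rough size estimate from the figures quoted above (uncompressed doubles).
bytesPerElement      = 8;                    % one double
elementsPerIteration = 120*120*1*7000;       % one image array
gbPerIteration = elementsPerIteration*bytesPerElement/1e9;
gbPerRun       = 30*gbPerIteration;          % 30 iterations per run
gbTotal        = 15*gbPerRun;                % 15 runs
fprintf('Per iteration: %.2f GB, per run: %.2f GB, all runs: %.2f GB\n', ...
    gbPerIteration, gbPerRun, gbTotal);
```

That is roughly 0.81 GB per iteration, about 24 GB per run, and about 363 GB over all runs, so each iteration fits in memory on a typical machine but the full set does not.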
It is unclear how that relates to your question. Are you trying to SAVE some MAT files each with unique variable names? That would be ... bad data design.
Is not that what @Dyuman Joshi suggest?
Much better: simply use the same variable name (and SAVE any meta-data together in the same file).
How to use the same variable name, this will overwrite the data ? and finally we will get the final iteration only?
M on 14 Nov 2023
Edited: M on 14 Nov 2023
@Dyuman Joshi I don't want them as images; I need them as .mat files to do some computations.
You appear to be overthinking this. What you are doing boils down to this:
A = rand(120,120,1,7000);
save test.mat A
I just ran it now on this thread. It works. Repeat for each array. Name the MAT files as you wish. Of course importing and exporting (with the corresponding data compression and decompression) takes a finite amount of time.
Then use a DATASTORE.
Note that MAT files do not seem to be a supported image file format of IMAGEDATASTORE.
M on 14 Nov 2023
Edited: M on 14 Nov 2023
save test.mat A
by this I will save only the final iteration? because in each iteration A will overwrite the previous A?
How to use it in for loop? Any suggestion please?
Note that MAT files do not seem to be a supported image file format of IMAGEDATASTORE.
in another thread it was written that
The image data store has a ReadFcn property that lets you define your own file read function, so any format that you can read should be supportable.
I will try to use DATASTORE instead, but after finding good solution how to save the data!
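The ReadFcn idea quoted above might look roughly like the sketch below. It assumes every MAT file contains a variable named `images`; the folder name `mat_images` and the small arrays are hypothetical. Whether this is a sensible design for the real data is a separate question.

```matlab
% Sketch: imageDatastore over MAT files using a custom read function.
dataDir = fullfile(tempdir, 'mat_images');       % hypothetical folder
if ~isfolder(dataDir)
    mkdir(dataDir);
end
for k = 1:3
    images = rand(4,4,1,5);                      % stand-in data
    save(fullfile(dataDir, sprintf('images_%02d.mat', k)), 'images');
end
% Read function: load only the variable 'images' and return its value.
readMat = @(f) getfield(load(f, 'images'), 'images');
imds = imageDatastore(dataDir, ...
    'FileExtensions', '.mat', ...
    'ReadFcn', readMat);
batch = read(imds);                              % content of the first file
disp(size(batch))
```

Each call to `read` returns the whole array stored in one file, so only one file's data needs to be in memory at a time.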
"by this I will save only the final iteration?"
Yes, which is exactly why I wrote "Name the MAT files as you wish."
"How to use it in for loop? Any suggestion please?"
See Dyuman Joshi's answer from over one hour ago:
M on 14 Nov 2023
Edited: M on 14 Nov 2023
@Dyuman Joshi your method worked, thanks, but now I am facing a problem, When I want to load some matrices and combine them in a matrix ,all of their names are the same, how can I overcome this issue? thanks
"When I want to load some matrices and combine them in a matrix ,all of their names are the same, how can I overcome this issue?"
Having all of the names the same is good: it means that you can write simple, efficient, robust code.
Remember to always LOAD into an output variable (which is a scalar structure):
S = load(..);
and then access its fieldnames if required. Then simply use indexing to store the imported data (either the structure S or any of its fields) in an array, e.g. a numeric array, a cell array, a structure array:
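A minimal sketch of that approach, assuming each file holds variables named `images` and `labels` (the folder name `load_demo` and the small arrays are made up for the demonstration):

```matlab
% Sketch: LOAD each file into a structure, collect the fields, then concatenate.
dataDir = fullfile(tempdir, 'load_demo');        % hypothetical folder
if ~isfolder(dataDir)
    mkdir(dataDir);
end
for k = 1:3                                      % create small test files
    images = rand(4,4,1,5);
    labels = k*ones(1,5);
    save(fullfile(dataDir, sprintf('data_%02d.mat', k)), 'images', 'labels');
end
files = dir(fullfile(dataDir, 'data_*.mat'));
C = cell(1, numel(files));                       % image arrays
L = cell(1, numel(files));                       % label vectors
for k = 1:numel(files)
    S = load(fullfile(files(k).folder, files(k).name));
    C{k} = S.images;                             % same field name in every file
    L{k} = S.labels;
end
allImages = cat(4, C{:});                        % concatenate along dimension 4
allLabels = [L{:}];
```

Because the field names are identical in every file, the loop body needs no name lookups at all. Concatenation is only sensible if the combined array fits in memory.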
M on 15 Nov 2023
Edited: M on 15 Nov 2023
@Stephen23 actually I didn't get how to apply this. I have multiple folders, and the .mat files inside have different names. Each folder contains two variables, each with the same name (Images_A, Labels_A) but with different .mat names
The other folders have the same concept, (Images_B, Labels_B) but with different .mat names.. and so on ..
"I have multiple folders, the .mat files inside are with different names, each folder contains two variables, each with the same name (Images_A, Labels_A) but with different .mat names"
That sounds like something the DIR could handle quite easily. What have you tried so far?
"The other folders have the same concept, (Images_B, Labels_B) but with different .mat names..."
So you have "multiple folders" and also "the other folders". What exactly are "the other folders" , how are they distinguished from the "multiple folders" ?
You do not provide sufficient information to give clear and complete advice (or testable, working code).
How to LOAD these MAT files is probably better discussed in one of your other recent threads, e.g.:
As discussed in that last thread, your data will (probably) not all fit into memory at once.
M on 15 Nov 2023
Edited: M on 15 Nov 2023
@Stephen23 I am not finding clear answers in the previous threads,
I hope you can give me some good advice: I have spent a lot of time searching for a proper solution, and your and @Dyuman Joshi's solutions are starting to work for me.
what I meant is that I have 16 folders, each folder contains 60 files, 30 Images 30 Labels, the .mat files have different names but the variables inside have the same name.
For example folder1
30 variables with Images_A name, and 30 variables with labels_A name but their .mat files name are all different
folder2
30 variables with Images_B name, and 30 variables with labels_B name but their .mat files name are all different
and so on
I cant find clear method to deal with this data properly , I want to use this data in SVD and NN but before that I want to do some computations as I mentioned in the previous thread
"what I meant is that I have 12 folders, each folder contains 60 files, 30 Images 30 Labels, the .mat files have different names but the variables inside have the same name."
That sounds like something DIR would easily handle.
"For example folder1 .. 30 variables with Images_A name, and 30 variables with labels_A name but their .mat files name are all different .. folder2 .. 30 variables with Images_B name, and 30 variables with labels_B name but their .mat files name are all different"
Folders contain files and other folders. Folders do not contain variables (these only exist in MATLAB memory).
It would probably be simpler and faster if you:
  1. posted a screenshot of your File Explorer clearly showing the file structure,
  2. use WHOS() to show us a list of the content of several of the MAT files.
M on 15 Nov 2023
Edited: M on 15 Nov 2023
@Stephen23 I want to pick up from each folder 15 files of images (4D double) and their respective files of labels and combine the images and combine the labels
I cant do that since the name inside each .mat file is the same
P = 'absolute or relative path to the parent directory';
S = dir(fullfile(P,'*Data','*.mat'));
If you want the folder and filenames sorted into alphanumeric order then either 1) add sufficient leading zeros to the folder/filenames or 2) sort them yourself, e.g. by downloading NATSORTFILES:
S = natsortfiles(S); % DOWNLOAD https://www.mathworks.com/matlabcentral/fileexchange/47434-natural-order-filename-sort
S contains all filenames together with their folders and other file meta-data. You could loop over all files:
for k = 1:numel(S)
    F = fullfile(S(k).folder,S(k).name);
    D = load(F);
    % ... do whatever with D (except concatenate them into a 64 GB array)
end
Of course you could also filter S for specific file/folder names. Or modify the DIR match expression to only match "labels" or "TRO" at the start of the filenames... whatever.
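Both filtering options can be sketched as follows. The folder names `A_Data` and `B_Data`, the file-name prefixes `images` and `labels`, and the placeholder variable `x` are assumptions chosen only to make the example self-contained.

```matlab
% Sketch: filter a DIR listing by file name, or narrow the DIR pattern itself.
P = fullfile(tempdir, 'filter_demo');            % hypothetical parent folder
subs = {'A_Data', 'B_Data'};                     % hypothetical subfolders
for j = 1:numel(subs)
    if ~isfolder(fullfile(P, subs{j}))
        mkdir(fullfile(P, subs{j}));
    end
    x = j;                                       % placeholder content
    save(fullfile(P, subs{j}, 'images_01.mat'), 'x');
    save(fullfile(P, subs{j}, 'labels_01.mat'), 'x');
end
% Option 1: list everything, then filter the structure array.
S = dir(fullfile(P, '*Data', '*.mat'));
isLabel    = startsWith({S.name}, 'labels');
labelFiles = S(isLabel);
imageFiles = S(~isLabel);
% Option 2: let the DIR match expression do the filtering.
S2 = dir(fullfile(P, '*Data', 'labels*.mat'));
```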
"I cant find clear method to deal with this data properly , I want to use this data in SVD and NN but before that I want to do some computations as I mentioned in the previous thread"
If this is the same data discussed in your other threads then you (probably) cannot fit all of that data into memory. So there is likely no point in trying to concatenate all of those arrays together. But you could supply those filenames to DATASTORE (of some kind) and use tall arrays to process it.... if you have sufficient time to spare (perhaps days... or weeks... or more... who knows).
Before you decide to attempt SVD on a 64GB array you should probably do an internet search on SVD of large arrays.
I suspect that you do not really appreciate that computers do not have infinite memory and infinite speed.
"I cant do that since the name inside each .mat file is the same "
How exactly do the variable names prevent you from processing your data?
Having exactly the same variable names in every MAT file makes your data easier to work with. It also makes your code simpler, more efficient, and more robust:
What exactly is stopping you from LOADing into an output variable and accessing its fields?
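If the variable names do differ between folders (e.g. Images_A in one folder and Images_B in another), the fields of the loaded structure can still be found without hard-coding the suffix. A sketch, assuming exactly one field starts with "Images" and exactly one with "Labels" (the file name `fields_demo.mat` is hypothetical):

```matlab
% Sketch: access loaded fields whose names vary only in their suffix.
f = fullfile(tempdir, 'fields_demo.mat');        % hypothetical test file
Images_A = rand(4,4,1,5);
Labels_A = ones(1,5);
save(f, 'Images_A', 'Labels_A');

S = load(f);                                     % scalar structure
names = fieldnames(S);
% Assumes exactly one match each; case-insensitive to cover "labels_A" too.
imgName = names{startsWith(names, 'Images', 'IgnoreCase', true)};
lblName = names{startsWith(names, 'Labels', 'IgnoreCase', true)};
images = S.(imgName);                            % dynamic field name
labels = S.(lblName);
```

After these lines the rest of the code can use `images` and `labels` regardless of which folder the file came from.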
Show us the exact code that you tried.
"I want to pick up from each folder 15 files of images (4D double) and their respective files of labels and combine the images and combine the labels"
In this thread you wanted to combine around 400 "images" into a 64GB array:
Now you only want to combine "15 files of images".
Which is correct? When your explanations keep changing then it makes it hard to understand what you want.


More Answers (1)

Often, you would put all images with the same label in the same folder -
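Folders = {'Label1','Label2','Label3'};
for j = 1:numel(Folders)
    mkdir(Folders{j});
    for i = 1:10
        img = rand(10); % named img to avoid shadowing the built-in IMAGE function
        % Zero-padded file name built as a char vector, e.g. Image_001.mat
        save(fullfile(Folders{j}, sprintf('Image_%03d.mat', i)), 'img')
    end
end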

2 Comments

M on 14 Nov 2023
Edited: M on 14 Nov 2023
@Catalytic I don't want to put all the images of the same label together for now
I just want to save the two .mat files as they are, with the iteration number in their names, in the same directory as the code, so that nothing is overwritten.
As well I got the following error after trying your code:
Error using save
Argument must be a text scalar.
Thanks
We are not here to refine code for you down to the last detail. The example clearly shows how you can create file names in a loop to use with the save() command. Adapt it as you see fit.


Asked: M on 14 Nov 2023

Edited: 15 Nov 2023
