PROBLEM: How can I update only the changed variables in my saved workspace?

After 30 minutes of running, my workspace reaches 3 GB (I have quite a lot of data and the code is quite complicated).
I saved it, which also takes some time, so that I can continue my work without losing 30 minutes again the next time I launch MATLAB.
When I run only part of the code, some variables change and some new ones appear.
I am looking for a method to update my saved workspace by adding/changing some variables without re-saving the others.
Thanks in advance!

2 Comments

Are you talking about your base workspace or a function workspace? It sounds like you're running scripts that change the variables stored in your base workspace, and this can get very messy. This is why it's good practice to work with functions rather than scripts: each function has its own workspace, which eliminates the possibility of that workspace being contaminated with foreign variables, or of internal variables changing value due to unrelated scripts.
If you're talking about your function workspace, it should be clear which variables potentially change within sections of your code.
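To illustrate the point about separate workspaces, here is a minimal sketch. The function name scaleData and the variables are made up for the example, and it assumes a release that allows local functions at the end of a script file (R2016b or later).

```matlab
x = 10;                 % variable in the calling (script) workspace
y = scaleData(5);       % the function only sees what is passed in

function out = scaleData(in)
    x = 2;              % local x, separate from the caller's x
    out = x * in;       % only the returned value reaches the caller
end
```

After this runs, x in the calling workspace still holds its original value. The assignment inside scaleData never touches it.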
I'm talking about my base workspace. I saved it in a .mat file so that I can skip some steps when I run my code.
When only a few variables have changed (say 10), I would like to save them to the same .mat file by updating it. Is that possible?


Answers (2)

(continuing from the comments under the question)
To save a specific list of variables to an existing mat file, list the variable names as strings and include the '-append' flag.
save('myfile.mat','var1','var2', 'var3','-append')
That being said, I want to reiterate a cautionary note about working with variables in the base workspace. This method is prone to error, allows variables to be contaminated, and often makes it difficult to trace the source of an error. The better alternative is to control computations within functions.
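A minimal sketch of how '-append' behaves, using a hypothetical file name and toy variables (none of these names come from the question). Variables named in the call are added to the file or replace the stored copy of the same name. Everything else in the file is left as it was.

```matlab
% Hypothetical file name and variables, for illustration only.
matFile = fullfile(tempdir, 'append_demo.mat');
a = 1;
b = [1 2 3];
save(matFile, 'a', 'b');              % initial full save

b = [4 5 6];                          % an existing variable changes
c = 'new';                            % a new variable appears
save(matFile, 'b', 'c', '-append');   % write only b and c

info = whos('-file', matFile);        % list what the file now contains
disp({info.name});
S = load(matFile);                    % a keeps its old value, b and c are current
```

Checking the file with whos('-file', ...) after appending is a cheap way to confirm that nothing was forgotten.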
This would be much easier if you used functions and separate workspaces. Running a bunch of scripts and collecting data in the base workspace is extremely prone to errors, e.g. by overwriting variables unintentionally. Even if you save the data from time to time, it is hard to reproduce the results.
Accessing data in the base workspace is evil and a shot in your own knee!
But if you really want to do it, you can keep a list of hashes of the stored variables and save only the variables whose hash has changed:
% UNTESTED !!! %
function SaveChangedWorkspace(File)
% File is the core of the file name: C:\Temp\YourData
List = dir([File, '*.mat']);  % dir accepts a single pattern argument
% Copy from all MAT files the field 'Hash' and collect it:
oldHash = struct();  % scalar struct, so dynamic fields can be assigned
for iFile = 1:numel(List)
    Data = load(fullfile(List(iFile).folder, List(iFile).name), 'Hash');
    oldHash = JoinStruct(oldHash, Data.Hash);
end
% Now get the hash values of the variables in the workspace:
Vars = evalin('base', 'who');  % EVIL! I'd never use this in production code
ToSave = struct();
Changed = true(size(Vars));
Hash = struct();
for iVar = 1:numel(Vars)
    Name = Vars{iVar};
    Data = evalin('base', Name);
    thisHash = DataHash(Data);  % Or: GetMD5(Data, 'Array')
    if isfield(oldHash, Name) && isequal(oldHash.(Name), thisHash)
        Changed(iVar) = false;
    else
        ToSave.(Name) = Data;
        Hash.(Name) = thisHash;
    end
end
% Save changed variables (only if at least one variable is new or modified):
if any(Changed)
    ToSave.Hash = Hash;
    ToSave.Date = datestr(now, 0);
    save(sprintf('%s%04d.mat', File, numel(List)+1), '-struct', 'ToSave')
end
end

function S = JoinStruct(S, T)
% Copy all fields of T into S, overwriting fields with the same name
F = fieldnames(T);
for iT = 1:numel(F)
    S.(F{iT}) = T.(F{iT});
end
end
You need this from the File Exchange: DataHash
This saves the modified variables in a new file each time, using the names "File0001.mat", "File0002.mat", and so on. You can load them all with e.g.:
function [Data, Hash] = LoadCollected(File)
% File is the core of the file name: 'C:\Temp\YourData'
% Uses JoinStruct from above.
List = dir([File, '*.mat']);  % dir accepts a single pattern argument
% Read all MAT files in order, so later files overwrite earlier values:
Data = struct();  % scalar struct, so dynamic fields can be assigned
Hash = struct();
for iFile = 1:numel(List)
    aData = load(fullfile(List(iFile).folder, List(iFile).name));
    Name = fieldnames(aData);
    for iName = 1:numel(Name)
        if strcmp(Name{iName}, 'Hash')
            Hash = JoinStruct(Hash, aData.Hash);
        else
            Data.(Name{iName}) = aData.(Name{iName});
        end
    end
end
end

7 Comments

Just curious, why isn't it better to just use save(..., '-append')? With that method, you must explicitly list the variables to be updated in the mat file, which should be straightforward if the scripts are running correctly.
As a lot of variables change, I wanted something that identifies the changed variables and saves them automatically. If I use '-append', I might forget to save one changed variable, which would cause problems later when I load the .mat file.
I don't master well enough the usage of functions and their workspaces yet.
"I don't master well enough the usage of functions and their workspaces yet. "
You need to use functions. Scripts make complex projects very difficult to write, difficult to understand, and difficult to debug. That is exactly what you are experiencing now.
You would be much better off learning how to use functions.
@Habib, I hear ya. But I can guarantee you that the time spent learning how to use functions will be a much better investment than the time you'll spend troubleshooting the potential errors that happen when you work solely from the base workspace. It sounds like you have many variables changing and no way to predict which ones, so you have little chance of detecting errors and very little chance of tracing them to their source.
@Adam: "isn't it better to just use save(..., '-append')"
As far as I understand, it is time consuming to save all the data, so storing only the modified or newly created data saves some time. save(..., '-append') will overwrite existing variables of the same name in the file.
To be honest: I do not like the code I've posted. I would not use it for production work, because it introduces complexity instead of supporting clean programming techniques. This is a standard step towards creating an unusable program.
@Habib: Adam is right. My suggestion might be able to save the changed variables only. But a clean and structured programming technique is much more powerful. Collecting a lot of variables in the base workspace is too fragile to be used for scientific work.
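For what it's worth, here is a rough sketch of how the two ideas could be combined without touching the base workspace. The data is kept in one struct, and a helper appends only the fields that differ from an earlier snapshot. This is not the code from the answer: the helper appendChangedFields, the file name and the field names are all hypothetical, and the comparison uses isequaln instead of hashes. It assumes local functions in scripts are available (R2016b or later).

```matlab
matFile = fullfile(tempdir, 'results_demo.mat');   % hypothetical location
if exist(matFile, 'file') == 2
    delete(matFile);                               % start the demo from scratch
end

results = struct('raw', magic(4), 'scale', 2);
firstSave = appendChangedFields(matFile, struct(), results);   % everything is new

previous = results;                    % snapshot; MATLAB copies data lazily
results.scale = 3;                     % one field changes
results.total = sum(results.raw(:));   % one field is added
secondSave = appendChangedFields(matFile, previous, results);

function changed = appendChangedFields(matFile, before, after)
    % Return the names of fields in 'after' that are new or differ from
    % 'before', and write only those fields to matFile.
    names = fieldnames(after);
    isChanged = false(size(names));
    for k = 1:numel(names)
        n = names{k};
        isChanged(k) = ~isfield(before, n) || ~isequaln(before.(n), after.(n));
    end
    changed = names(isChanged);
    if isempty(changed)
        return
    end
    if exist(matFile, 'file') == 2
        save(matFile, '-struct', 'after', changed{:}, '-append');
    else
        save(matFile, '-struct', 'after', changed{:});
    end
end
```

Comparing large arrays with isequaln is not free, but it is typically much cheaper than writing them to disk again. Because the set of "variables" is just the fields of one struct, nothing can be forgotten or overwritten by an unrelated script.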
Thank you all for your advice!
Do you have any advice on how to learn to use functions effectively, such as websites, courses or books?



Asked: on 12 Mar 2019
Commented: on 13 Mar 2019
