Efficient ways to store a large number of variables
I would like some suggestions on efficient ways of dealing with many variables at once. I have a simulation algorithm that, every time it is run, produces multiple outputs of different types, including arrays and scalars. Hence, if I run the algorithm many times with many different parameter choices, the number of outputs to deal with explodes.
I have tried the following ways of storing the variables so that they are easy to keep track of, with these advantages and disadvantages:
1. Storing each variable independently and numbering them: This is relatively fast but gets messy pretty quickly. An answer to one of my previous posts also suggested this was not a good way of dealing with this.
2. Storing the outputs of each trial in a cell array: This makes things much neater to deal with but slows computation down considerably and takes up much more memory. (For context, the MAT-file needed to store the data using the first method caps at a few MB, while it could go up to a few GB with the second method.)
Are there other ways to store a large number of variables in MATLAB apart from the methods above? Any help would be appreciated.
4 Comments
Stephen23
on 31 Aug 2023
Edited: Stephen23
on 31 Aug 2023
I doubt that creating/accessing a cell array is going to use much time compared to your black-box "simulation algorithm". What is more likely is inefficient array expansion or something of that ilk (possibly in some code that you have not shown us):
Note that instead of preallocating a scalar cell array like this:
results = cell(1,1);
and then expanding it seven times (inefficient), you should preallocate the array to the correct size:
results = cell(1,8);
Or even skip the preallocation entirely by simply creating the cell array using curly braces:
results = {avgpop, avgprog, exttime, maxtime, runtime, maxpop, maxprog, catrep};
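The same preallocation idea extends to collecting the outputs of many runs. The following is only a minimal sketch: fakeRun is a hypothetical stand-in for one simulation run (it is not the original code), and the number of runs and outputs are assumed values.

```matlab
% Hypothetical stand-in for one simulation run: returns a 1x3 cell
% holding a scalar, a 1x5 row vector, and another scalar.
fakeRun = @(p) {p^2, p*(1:5), sqrt(p)};

nRuns  = 4;                        % assumed number of parameter choices
params = linspace(1, 4, nRuns);    % assumed parameter values 1, 2, 3, 4

% Preallocate once to the final size instead of growing inside the loop.
allResults = cell(nRuns, 3);
for k = 1:nRuns
    allResults(k,:) = fakeRun(params(k));   % fill one row per run
end

% Outputs of equal size can be pulled into plain numeric arrays afterwards.
firstScalar = cell2mat(allResults(:,1));    % nRuns-by-1 double
vectors     = cell2mat(allResults(:,2));    % nRuns-by-5 double
```

Each row of allResults holds one run, so a single output across all runs is just a column of the cell array.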
"This makes things much neater to deal with but slows computational time down considerably and takes up much more memory"
Given that you did not preallocate correctly, it is possible that your code has more basic inefficiencies in how you are creating or accessing that data. It is possible that simple numeric arrays may be more efficient. But until you show us the actual code, we cannot help you much more.
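One way to check whether the container itself explains the memory difference is to compare the sizes reported by whos. This is a rough sketch with an arbitrary number of scalars, not a measurement of the original data: it stores the same values once as a numeric array and once with one cell element per scalar.

```matlab
n = 1e5;                             % number of stored scalars (arbitrary choice)

asNumeric = zeros(1, n);             % one contiguous block of doubles
asCell    = num2cell(asNumeric);     % one cell element per scalar

% whos returns a struct whose bytes field gives the memory used.
infoNumeric = whos('asNumeric');
infoCell    = whos('asCell');

fprintf('numeric: %d bytes, cell: %d bytes\n', infoNumeric.bytes, infoCell.bytes);
```

The per-element overhead only matters when a cell array holds very many small elements. A cell array with a handful of large arrays per trial should cost about the same as the arrays themselves.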
Answers (1)
Dyuman Joshi
on 1 Sep 2023
I have made some corrections to your code, with comments/explanations marked by double percent signs (%%).
Please check it and let me know how it performs, whether there are any errors (I cannot run the code as I don't have the inputs), and whether the results are good.
I assume birthfunc and deathfunc are functions; why not include them as local functions rather than passing them as inputs?
function results = tpbdsimn(z_init,x_init,iter, birthfunc,deathfunc,range,n,catrate,p,mode)
tstart = tic;
distime = linspace(0,range,n);
time = distime;
catrep = 0;
%%As these variables are defined via a function, and they are not growing
%%w.r.t the for loop, you can skip preallocating them
%meanzv = zeros(1,n);
%meanxv = zeros(1,n);
%%Corrected preallocation
exttime = zeros(1,iter);
maxtime = zeros(1,iter);
%%Added preallocation for maxpop
maxpop = zeros(1,iter);
xrec = zeros(iter,n);
zrec = zeros(iter,n);
iterrecord = zeros(1, iter);
for j = 1:iter
    %%These brackets are superfluous, remove them
    xvec = x_init;
    zvec = z_init;
    tvec = 0;
    t = 0; % Restart the clock for each trial so that it matches tvec
    i = 1;
    %%Why are these commands here when you have pre-allocated
    %%the variables above
    %exttime = [];
    %maxtime = [];
    %%These variables have not been used anywhere in the code
    %%remove them
    %ctimev = [];
    %csizev = [];
    tj = tic();
    % While loop for simulation of population dynamics
    while (t < range && zvec(i) > 0) % Run while the time horizon is not reached and the population is > 0
        z = zvec(i); % Current population size
        x = xvec(i); % Current total progeny
        birthrate = z*birthfunc(x); % Compute the current total birth and death rates
        deathrate = z*deathfunc(x);
        % Calculate the probabilities of a birth and of a death given the current rates
        birthprob = birthrate/(deathrate + birthrate + catrate);
        deathprob = deathrate/(birthrate + deathrate + catrate);
        prob = rand(1); % Uniform draw that selects the event type
        % Exponentially distributed waiting time until the next event
        deltat = -(log(1-rand(1))/(birthrate + deathrate + catrate));
        t = t + deltat;
        tvec(i+1) = t;
        % Make corresponding changes in the total progeny and pop.size
        % depending on simulation results
        if prob <= birthprob
            z = z + 1;
            x = x + 1;
            xvec(i+1) = x;
            zvec(i+1) = z;
        elseif (birthprob < prob) && (prob < (birthprob+deathprob))
            z = z - 1;
            zvec(i+1) = z;
            xvec(i+1) = x;
            if z == 0
                exttime(j) = t; % There was a bug here
            end
        else
            % Catastrophe size: uniform on 1..z or Binomial(z,p), selected by mode
            k = randi([1 z],1)*(mode == "Uniform") + sum(rand(1,z)<p)*(mode == "Binomial");
            z = z - k;
            zvec(i+1) = z;
            xvec(i+1) = x;
            if z <= 0
                exttime(j) = t; % Record the extinction time, not the population size
                catrep = catrep + 1;
            end
            %ccount = ccount + 1;
        end
        i = i + 1;
    end
    % Find the time taken to reach maximum population
    [m,i] = max(zvec);
    maxtime(j) = tvec(i);
    maxpop(j) = m;
    % Create mean vectors for interpolation
    meanxv = interp1(tvec,xvec,distime,"previous");
    meanzv = interp1(tvec,zvec,distime,"previous");
    meanxv(isnan(meanxv)) = xvec(end);
    meanzv(isnan(meanzv)) = 0;
    xrec(j,:) = meanxv;
    zrec(j,:) = meanzv; % Store the population path as well, otherwise avgpop is all zeros
end
catrep = catrep/iter;
%%Why define a variable and assign it to another variable to store
%%when you can directly assign to the name you want to store it as
avgprog = mean(xrec,1); % Column-wise mean across trials (also correct when iter == 1)
avgpop = mean(zrec,1);
maxprog = xrec(:,n);
% Storing results
runtime = toc(tstart);
results = {avgpop,avgprog,exttime,maxtime,runtime,maxpop,maxprog,catrep};
end
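If keeping track of positions in the returned cell array becomes awkward across many parameter choices, a struct array with named fields is one possible alternative. The sketch below is an assumption on my part, not part of the function above: the field names mirror four of the returned variables, and the loop body uses placeholder values where a call to tpbdsimn would go.

```matlab
nRuns = 3;   % assumed number of parameter choices

% Template with the same names the function uses; values are placeholders.
template = struct('avgpop', [], 'exttime', [], 'runtime', NaN, 'catrep', NaN);
runs = repmat(template, nRuns, 1);   % preallocate the whole struct array

for k = 1:nRuns
    % Placeholder data standing in for the outputs of one simulation call
    runs(k).avgpop  = k * ones(1, 5);
    runs(k).exttime = rand(1, 10);
    runs(k).runtime = 0.1 * k;
    runs(k).catrep  = k / 10;
end

% Gather one field across all runs without numbered variables.
catrepAll = [runs.catrep];           % 1-by-nRuns numeric vector
avgpopAll = vertcat(runs.avgpop);    % nRuns-by-5 matrix
```

Access by name (runs(2).exttime) avoids having to remember which cell index holds which output, and the whole array can be saved to a single MAT-file variable.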