# Pre-allocation of large data structure seems to slow down simulation.

Robbin van Hoek on 11 May 2015
Commented: Robbin van Hoek on 12 May 2015
Consider the following example code:
%% Simulation example
tend = 6*60*60;            % total simulated time [s] (6 hours)
dt = 0.1;                  % time step [s]
t = dt:dt:tend;
% Options (structure with the simulation modes) is assumed to exist already.
Car = InitializeCar;       % function to initialize Car structure
Data = InitializeData(t);  % function to initialize Data structure
for ii = 1:length(t)
    Car = Position(Car, Options);
    Car = Speed(Car, Options);
    Car = FindInterAction(Car, Options);
    Car = SetNewValues(Car, Options);
    Data = Datalogger(Data, Car, Options, ii);
end
% Datalogger is of the form:
function Data = Datalogger(Data, Car, Options, ii)
% Copy the current state of every car into element ii of its log arrays.
for kk = 1:length(Car)
    Data(kk).speed(ii) = Car(kk).speed;
    Data(kk).position(ii) = Car(kk).position;
end
end
The problem I encounter arises in a fairly lengthy simulation, from which I store only a specified set of variables each step. Car is a structure array of length 52 whose fields are updated (overwritten) every time step. It represents 52 cars driving around and interacting with each other, with their current states stored in fields. These fields also include large data structures that define properties of the car.
The simulation itself is done with a number of individual functions, which all use a second structure 'Options', in which simulation modes are specified. These functions update the current states of each of the cars.
At the end of each time step a selected number of fields is stored in a Data structure. What I used to do is initialize the Data structure by pre-allocation: I defined each field (in the example only position and speed; in my actual simulation I need to store 18 doubles) as an array nan(length(t),1).
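The post does not show InitializeData, so the following is only a minimal sketch of the pre-allocation described above. The second input numCars is an assumption added for illustration (the original function takes only t), and only the two fields from the example are included.

```matlab
function Data = InitializeData(t, numCars)
% Sketch of a pre-allocating initializer (not the original function).
% numCars is a hypothetical extra input; the original takes only t.
numSteps = length(t);
% One template element holding NaN-filled column vectors, one slot per step.
template = struct('speed', nan(numSteps, 1), ...
                  'position', nan(numSteps, 1));
% Replicate the template so that every car gets its own log arrays.
Data = repmat(template, numCars, 1);
end
```

Called as Data = InitializeData(t, 52), this would give a 52-by-1 structure array in which every field already has its final size.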
For smaller tend (tend < 3600) the simulation runs faster than real time, that is, a time step of 0.1 seconds takes less than 0.1 seconds to compute. However, if I increase the total simulation time to 6 hours (which is required for the study I am performing), the time required per time step also grows. I noticed that this was due to the Datalogger function, since after omitting this function the speed returned to that of the shorter simulations.
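For scale, the numbers in the post can be multiplied out. This is a rough sketch, assuming all 18 logged quantities are doubles stored for every car at every step; the variable names are illustrative.

```matlab
tend = 6*60*60;               % total simulated time [s], from the example
dt = 0.1;                     % time step [s]
numSteps = round(tend/dt);    % 216000 steps
numCars = 52;
numFields = 18;               % assumed: 18 doubles logged per car
bytesPerDouble = 8;
% Size of one per-car log array and of the complete log.
bytesPerArray = numSteps * bytesPerDouble;
totalBytes = bytesPerArray * numCars * numFields;
fprintf('%d steps, %.2f MB per array, %.2f GB in total\n', ...
        numSteps, bytesPerArray/1e6, totalBytes/1e9);
% prints "216000 steps, 1.73 MB per array, 1.62 GB in total"
```

For comparison, a one-hour run (tend = 3600) gives 36000 steps, or about 0.29 MB per array.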
To my surprise, I found that omitting the pre-allocation of the Data structure actually speeds up my simulation by roughly a factor of 3. This does not make sense to me: without pre-allocation the fields of the Data structure have to grow every time step, whereas with pre-allocation only single values are updated.
I suspect that, without pre-allocation, the Data struct is still small at the start, which for some reason lets the program handle it faster, but that it becomes slower as the end of the simulation approaches.
My question is actually twofold:
1. Why does my code become slower when I increase the length of the simulation? I thought that, because only 52 doubles in the Data structure are updated every time step, this should not be influenced by the total size of the arrays in the fields of the Data structure.
2. How can I make my code fast regardless of the maximum size of t? Basically this boils down to: how can I efficiently store the required values in the Data structure?
So far I have tried the following:
• Defining the Data structure as a class instead of a structure. This, however, had no effect (or maybe a very small effect) on the simulation time.
• Defining multidimensional fields within the Data structure. Instead of storing data as Data(kk).speed(ii), it is now stored as Data.speed(kk,ii). This improves the performance slightly, but it is not the speed-up that I'm looking for.
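The second attempt can be sketched as follows. This is only an illustration under assumptions: the Car array is filled with random stand-in values, the run is shortened to 1000 steps, and the per-car loop is replaced by a vectorized column write, which the post does not mention.

```matlab
numCars = 52;
numSteps = 1000;                       % shortened run for illustration
% Stand-in for the real Car structure array (hypothetical values).
Car = struct('speed', num2cell(rand(numCars, 1)), ...
             'position', num2cell(rand(numCars, 1)));
% One matrix per logged quantity: rows are cars, columns are time steps.
Data.speed = nan(numCars, numSteps);
Data.position = nan(numCars, numSteps);
for ii = 1:numSteps
    % ... the simulation functions would update Car here ...
    % [Car.speed] gathers the field from all cars into a row vector;
    % the transpose turns it into the column written at step ii.
    Data.speed(:, ii) = [Car.speed].';
    Data.position(:, ii) = [Car.position].';
end
```

Because MATLAB stores matrices column-major, the 52 values written at step ii are contiguous in memory with this orientation.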
Robbin van Hoek on 12 May 2015
I've actually figured out what the problem was.
When writing the class definition I didn't declare it as a handle class. By including the "< handle" statement, I got the desired speed-up of a factor of two in my simulation.
classdef class_Car < handle
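The post only shows the class_Car declaration. As a hedged illustration of the same idea applied to the logged data, here is a minimal sketch of a handle-class logger. The class name DataLog, its properties, and the record method are all hypothetical, and the file would have to be saved as DataLog.m.

```matlab
classdef DataLog < handle
    % Hypothetical logger class (not from the original post).
    properties
        speed       % numCars-by-numSteps matrix of logged speeds
        position    % numCars-by-numSteps matrix of logged positions
    end
    methods
        function obj = DataLog(numCars, numSteps)
            % Pre-allocate both log matrices once.
            obj.speed = nan(numCars, numSteps);
            obj.position = nan(numCars, numSteps);
        end
        function record(obj, Car, ii)
            % No output argument: obj is a handle, so these assignments
            % change the existing object rather than a returned copy.
            obj.speed(:, ii) = [Car.speed].';
            obj.position(:, ii) = [Car.position].';
        end
    end
end
```

With this sketch, logger = DataLog(numel(Car), length(t)) would be created once before the loop and logger.record(Car, ii) called at the end of each time step. Objects of a handle class are passed by reference, so the method can update the stored arrays without the Data = Datalogger(Data, ...) round trip of the original code.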