Quicker way of summing data across multiple parent fields
I have a structure composed of 6 levels.
Level 6 contains an (i,j,k) 3D numerical array holding the data.
The third level [DataPointID in the code] is made up of 1000 fields. For each y parameter of the 3D array, I want to sum the [X x Z] matrix across all 1000 fields.
I'm currently using nested for loops to index the matrices within the structure and an accumulating temporary variable to sum across fields.
The problem arises when I scale up the number of fields for DataPointID, as my project requires: the program takes too long to run. Are there any techniques I could use, or any other way to speed up this process, that don't use looping?
The nested loop I am using to index and sum is shown below:
% Create new table to store summed [X x Z] matrices for each variable and row
NewTable=table('Size',[10 3],'VariableTypes',{'cell' 'cell' 'cell'});
NewTable.Properties.VariableNames={'Var1' 'Var2' 'Var3'};
% Y parameter in dataset identified by table rows
SummingMatrix=zeros(13,5);
DataPointID=fieldnames(Database.Datalevel1);
for p = 1:3 % Table var sweep
    for j=1:10 % Table row sweep
        for idx=1:numel(DataPointID) % For all data points
            % matrices summed in x and z co-ordinate across
            % DataPointID for each row using equation below
            SummingMatrix=SummingMatrix+...
                squeeze(Database.Datalevel1.(DataPointID{idx}).(NewTable.Properties.VariableNames{p}).Datalevel4.Datalevel5(j,:,:));
        end
        % Assign fully accumulated matrix to external table
        % containing summed x and z values for each y parameter
        NewTable(j,p)={SummingMatrix};
        SummingMatrix=zeros(13,5);
    end
end
Any help would be highly appreciated. Thank you in advance.
1 Comment
Peter Perkins
on 17 Jul 2023
Ammar, your description of what you have is badly lacking in detail, and likely none of the suggestions you have been given are what you want to do. Most likely, you want to avoid a very complicated organization of your data. But with so little to go on ...
Provide a SMALL CONCRETE example of what you have and what you want.
Answers (1)
Shubham
on 2 Jun 2023
Hi Ammar,
There are a few things you can do to speed up your code.
- Use vectorized operations instead of nested loops. MATLAB's built-in array functions work on whole arrays at once, which is usually faster than looping over rows or slices. For example, instead of using a nested loop to accumulate a sum, you can add whole arrays or call the sum() function along a dimension.
- Use a flatter data structure. The current data structure is a deeply nested structure, and indexing through every level with dynamic field names on each iteration can be slow. Storing the arrays for all data points in a single numeric array, with the data point as an extra dimension, lets you sum them in one call.
- Use parallel processing. If you have a multi-core computer and the Parallel Computing Toolbox, you can use parfor to divide the work into independent tasks and run them on multiple cores at the same time.
Here is an example of how you can use vectorized operations to speed up your code:
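% Original code
SummingMatrix = zeros(13,5);
for p = 1:3
    for j = 1:10
        for idx = 1:numel(DataPointID)
            SummingMatrix = SummingMatrix + ...
                squeeze(Database.Datalevel1.(DataPointID{idx}).(NewTable.Properties.VariableNames{p}).Datalevel4.Datalevel5(j,:,:));
        end
    end
end
% Vectorized code: add whole 3D arrays so the row loop disappears and the
% nested structure is indexed once per data point instead of once per row
% (assumes Datalevel5 is [10 x 13 x 5], as the loop bounds above imply)
for p = 1:3
    varName = NewTable.Properties.VariableNames{p};
    Total = zeros(10,13,5);
    for idx = 1:numel(DataPointID)
        Total = Total + Database.Datalevel1.(DataPointID{idx}).(varName).Datalevel4.Datalevel5;
    end
    for j = 1:10
        NewTable.(varName){j} = squeeze(Total(j,:,:)); % [13 x 5] sum for row j
    end
end
The vectorized code indexes the nested structure ten times less often, so it should run noticeably faster, although a loop over the data points remains.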
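A loop-free way to sum over the data points is to collect the arrays with struct2cell and cellfun, stack them along a fourth dimension and call sum() once. The sketch below is self-contained and assumes every Datalevel5 array has the same size. The synthetic Database, its sizes and the DataPoint names are made up for illustration; only the field path follows the question. cellfun still iterates internally, so treat this as a tidier formulation and time it on your real data before relying on it:

```matlab
% Build a small synthetic Database (hypothetical: 4 data points, 3 variables)
rng(0);
varNames = {'Var1','Var2','Var3'};
nPoints = 4;
Database = struct();
for idx = 1:nPoints
    for p = 1:numel(varNames)
        Database.Datalevel1.(sprintf('DataPoint%d',idx)).(varNames{p}).Datalevel4.Datalevel5 = rand(10,13,5);
    end
end

% One cell per data point, each holding that data point's sub-structure
points = struct2cell(Database.Datalevel1);

p = 1; % variable to sum
% Pull the 3D array out of every data point ...
arrays = cellfun(@(s) s.(varNames{p}).Datalevel4.Datalevel5, points, ...
    'UniformOutput', false);
% ... stack them along a 4th dimension and sum across it
Total = sum(cat(4, arrays{:}), 4);   % [10 x 13 x 5]

% Summed [13 x 5] matrix for row j
j = 3;
rowSum = squeeze(Total(j,:,:));
```

cat(4, ...) makes a temporary copy of all the data, so with many data points the accumulating loop may use less memory.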
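Here is an example of how you can use a more efficient data structure to speed up your code:
% Original data structure: one nested struct per data point, where
% Database.Datalevel1.(DataPointID{idx}).(varName).Datalevel4.Datalevel5
% holds a [10 x 13 x 5] array (size assumed from the loop bounds)
% New data structure: one 4-D numeric array per variable, built once
DataPointID = fieldnames(Database.Datalevel1);
varNames = NewTable.Properties.VariableNames;
Stacked = struct();
for p = 1:numel(varNames)
    Stacked.(varNames{p}) = zeros(10,13,5,numel(DataPointID));
    for idx = 1:numel(DataPointID)
        Stacked.(varNames{p})(:,:,:,idx) = ...
            Database.Datalevel1.(DataPointID{idx}).(varNames{p}).Datalevel4.Datalevel5;
    end
end
% Summing across data points is then a single call per variable
Total = sum(Stacked.Var1, 4); % [10 x 13 x 5]
This keeps each variable's data in one contiguous array, so sum() can work on it directly. Building the array still takes one pass over the structure, so it pays off mainly if you sum more than once or can store the data this way from the start.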
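Once all data points sit in one numeric array, both the sum and the per-row table from the question can be produced without indexing a nested structure again. The sketch below assumes a hypothetical [y, x, z, data point, variable] layout filled with random data; the variable names Var1 to Var3 are taken from the question:

```matlab
rng(1);
nPoints = 6;                         % hypothetical; the question has 1000
varNames = {'Var1','Var2','Var3'};
Stacked = rand(10,13,5,nPoints,3);   % assumed layout: [y x z point variable]

% Sum across data points (dimension 4), then drop the singleton dimension
Total = reshape(sum(Stacked, 4), 10, 13, 5, 3);

% Fill a table with one summed [13 x 5] matrix per row and variable
NewTable = table('Size',[10 3], ...
    'VariableTypes',{'cell','cell','cell'}, ...
    'VariableNames',varNames);
for p = 1:3
    % num2cell with dims [2 3] yields a 10x1 cell of [1 x 13 x 5] slices
    slices = num2cell(Total(:,:,:,p), [2 3]);
    NewTable.(varNames{p}) = cellfun(@squeeze, slices, 'UniformOutput', false);
end
```

NewTable.Var2{3} then holds the [13 x 5] sum for row 3 of Var2.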
Here is an example of how you can use parallel processing to speed up your code:
% Original code
SummingMatrix = zeros(13,5);
for p = 1:3
    for j = 1:10
        for idx = 1:numel(DataPointID)
            SummingMatrix = SummingMatrix + ...
                squeeze(Database.Datalevel1.(DataPointID{idx}).(NewTable.Properties.VariableNames{p}).Datalevel4.Datalevel5(j,:,:));
        end
    end
end
% Parallel code (parfor needs the Parallel Computing Toolbox)
% Note: Database is sent to every worker, which adds overhead for large data
varNames = NewTable.Properties.VariableNames;
Totals = cell(1,3);
parfor p = 1:3 % one variable per iteration
    Totals{p} = parfor_fun(varNames{p}, Database, DataPointID);
end
% Totals{p}(j,:,:) is the summed [13 x 5] matrix for row j of variable p
function Total = parfor_fun(varName, Database, DataPointID)
    % Sum the [10 x 13 x 5] arrays of one variable across all data points
    Total = zeros(10,13,5);
    for idx = 1:numel(DataPointID)
        Total = Total + Database.Datalevel1.(DataPointID{idx}).(varName).Datalevel4.Datalevel5;
    end
end
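Another way to split the work is to parallelize over the data points rather than over the three variables, using a parfor reduction variable. This sketch assumes the arrays have already been pulled out of the structure into a cell array (filled with random data here), so each worker receives only its own slices instead of the whole Database. As far as I know, parfor runs as an ordinary serial loop if the Parallel Computing Toolbox is not installed:

```matlab
rng(2);                            % reproducible synthetic data
nPoints = 8;                       % hypothetical; the question has 1000
arrays = cell(1, nPoints);
for idx = 1:nPoints
    arrays{idx} = rand(10,13,5);   % stands in for one Datalevel5 array
end

Total = zeros(10,13,5);
parfor idx = 1:nPoints
    % Total is a reduction variable: parfor combines the partial sums
    Total = Total + arrays{idx};
end

rowSum = squeeze(Total(3,:,:));    % summed [13 x 5] matrix for row 3
```

The order of the additions is not fixed in a parfor reduction, so the result can differ from the serial sum by floating-point rounding.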
I hope this helps!