How can i calculate very large distance matrix?

2 views (last 30 days)
I am working on 2D simulation of particles moving in a plane. The particles interact with other particles in some interaction radius, r. Thus i have to calculate interparticle distances, which gets messy as number of particles gets large. (I need large number of particles for better statistics and less boundary effects.)
I am currently doing this:
N=1024*1024;
x=rand(1,N); y=rand(1,N); velocity=2*pi*(rand(1,N)-0.5); %unit magnitude
for t=1:1000
D = pdist([x' y'],'euclidean'); M=sqaureform(D);
[I,J]=find(0<M & M<r);
%then using loop to add the interaction:
for i = 1:N
interacting_particles = I(J==i);
if ~isempty(interacting_particles)
average_velocity(i) = atan2(mean(sin(velocity(interacting_particles))),mean(cos(velocity(interacting_particles))));
else
average_velocity(i) = velocity(i);
end
end
velocity=average_velocity;
%modify x and y
x=x+cos(velocity);
y=y+sin(velocity);
relevant_quantities(t)=stat_mech(L,t,x,y,theta);
end
relevant_quantities is a well defined function.
I hope my problem is clear. Also, I have access to an HPC server, but I really wish to do something cheaper than calculating D array of size 4096 GB.
  2 Comments
Matt J
Matt J on 24 Dec 2020
Edited: Matt J on 24 Dec 2020
What is N? Is it the number of particles, length(x)? Why is there a loop over t, but nothing in the loop depends on t? What is "list"? Is it the same as "interacting_particles"?
Ashutosh Shukla
Ashutosh Shukla on 24 Dec 2020
Thank you for pointing these out these errors in question. N is the number of particles, i corrected that in the code. In the loop over t, x, y and theta are changed according to the rules on each iteration. I usually calculate quantities that are dependent on x, y, theta for each t, which i thought will be irrelavant to mention, but if it helps to understand i will put in a function.

Sign in to comment.

Answers (1)

Raunak Gupta
Raunak Gupta on 30 Dec 2020
Hi Ashutosh,
I understand you want to calculate the number of particles and its locations from a particular particle. Since the bottleneck here is to store the whole distance matrix into the workspace, I think in that case rangesearch can be helpful which can return the points within a distance r. I think the inner for loop running from 1 to N can also be used for calculating these distances so no overhead with this approach. Also, I don’t see an explicit need of D matrix anywhere except for finding the interacting particles.
Hope this helps!
  1 Comment
Ashutosh Shukla
Ashutosh Shukla on 1 Jan 2021
Edited: Ashutosh Shukla on 2 Jan 2021
Yeah, thanks for the reply. But the problem i checked is more severe with rangesearch. I am implementing as follows:
x=8*rand(1,4096);y=8*ran(1,4096);theta=2*pi*rand(1,4096);
P=[x',y'];
for i=1:4096
Idx=rangesearch(P,P(i,:),r);
if ~isempty(Idx{1})
average_velocity(i) = atan2(mean(sin(theta(Idx{1}))),mean(cos(theta(Idx{1}))));
else
average_velocity(i) = theta(i);
end
end
The pdist version runs much faster than rangesearch. At higher values of N, the speed is much slower. For 8192 partcies the pdist version of this averaging is 2 seconds, while the suggested averaging takes 2 minutes.

Sign in to comment.

Categories

Find more on Sequence and Numeric Feature Data Workflows in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!