@the cyclist: This question is a follow up question to an earlier question created this afternoon.

3 views (last 30 days)

Show older comments

I'm working on removing a for loop in my Matlab code to improve performance. My original code has one for loop (from j=1:Nx) that is harmful to performance (in my production code, this for loop is processed over 20 million times if I test large simulations). I am curious if I can remove this for loop through vectorization, repmat, or a similar technique. My original Matlab implementation is given below.

clc; clear all;

% Test Data

% I'm trying to remove the for loop for j in the code below

N = 10;

M = 10;

Nx = 32; % Ny=Nx=Nz

Nz = 32;

Ny = 32;

Fnmhat = rand(Nx,Nz+1);

Jnmhat = rand(Nx,1);

xi_n_m_hat = rand(Nx,N+1,M+1);

Uhat = zeros(Nx,Nz+1);

Uhat_2 = zeros(Nx,Nz+1);

identy = eye(Ny+1,Ny+1);

p = rand(Nx,1);

gammap = rand(Nx,1);

D = rand(Nx+1,Ny+1);

D2 = rand(Nx+1,Ny+1);

D_start = D(1,:);

D_end = D(end,:);

gamma = 1.5;

alpha = 0; % this could be non-zero

ntests = 100;

% Original Code (Partially vectorized)

tic

for n=0:N

for m=0:M

b = Fnmhat.';

alphaalpha = 1.0;

betabeta = 0.0; % this could be non-zero

gammagamma = gamma*gamma - p.^2 - 2*alpha.*p; % size (Nx,1)

d_min = 1.0;

n_min = 0.0; % this could be non-zero

r_min = xi_n_m_hat(:,n+1,m+1);

d_max = -1i.*gammap;

n_max = 1.0;

r_max = Jnmhat;

A = alphaalpha*D2 + betabeta*D + permute(gammagamma,[3,2,1]).*identy;

A(end,:,:) = repmat(n_min*D_end,[1,1,Nx]);

b(end,:) = r_min;

A(end,end,:) = A(end,end,:) + d_min;

A(1,:,:) = repmat(n_max*D_start,[1,1,Nx]);

A(1,1,:) = A(1,1,:) + permute(d_max,[2,3,1]);

b(1,:) = r_max;

% Non-vectorized code - can this part be vectorized?

for j=1:Nx

utilde = linsolve(A(:,:,j),b(:,j)); % A\b

Uhat(j,:) = utilde.';

end

end

end

toc

Here is my attempt at vectorizing the code (and removing the for loop for j).

% Same test data as original code

% New Code (completely vectorized but incorrect)

tic

for n=0:N

for m=0:M

b = Fnmhat.';

alphaalpha = 1.0;

betabeta = 0.0; % this could be non-zero

gammagamma = gamma*gamma - p.^2 - 2*alpha.*p; % size (Nx,1)

d_min = 1.0;

n_min = 0.0; % this could be non-zero

r_min = xi_n_m_hat(:,n+1,m+1);

d_max = -1i.*gammap;

n_max = 1.0;

r_max = Jnmhat;

A2 = alphaalpha*D2 + betabeta*D + permute(gammagamma,[3,2,1]).*identy;

A2(end,:,:) = repmat(n_min*D_end,[1,1,Nx]);

b(end,:) = r_min;

A2(end,end,:) = A2(end,end,:) + d_min;

A2(1,:,:) = repmat(n_max*D_start,[1,1,Nx]);

A2(1,1,:) = A2(1,1,:) + permute(d_max,[2,3,1]);

b(1,:) = r_max;

% Non-vectorized code - can this part be vectorized?

%for j=1:Nx

% utilde_2 = linsolve(A2(:,:,j),b(:,j)); % A2\b

% Uhat_2(j,:) = utilde_2.';

%end

% My attempt - this doesn't work since I don't loop through the index j

% in repmat

utilde_2 = squeeze(repmat(linsolve(A2(:,:,Nx),b(:,Nx)),[1,1,Nx]));

utilde_2 = utilde_2(:,1);

Uhat_2 = squeeze(repmat(utilde_2',[1,1,Nx]));

Uhat_2 = Uhat_2';

end

end

toc

diff = norm(Uhat - Uhat_2,inf); % is 0 if correct

I'm curious if repmat (or a different builtin Matlab function) can speed up this part of the code:

for j=1:Nx

utilde = linsolve(A(:,:,j),b(:,j)); % A\b

Uhat(j,:) = utilde.';

end

Is the for loop for j absolutely necessary or can it be removed?

Bruno Luong
on 29 Jul 2021

If you have C compilers the fatest methods are perhaps mmx and MultipleQR avaikable on FEX

Matt J
on 29 Jul 2021

Edited: Matt J
on 29 Jul 2021

Another idea.

clc; clear all;

% Test Data

% I'm trying to remove the for loop for j in the code below

N = 10;

M = 10;

Nx = 32; % Ny=Nx=Nz

Nz = 32;

Ny = 32;

AA=kron(speye(Nx),ones(Nx+1));

map=logical(AA);

% Original Code (Partially vectorized)

tic

for n=0:N

for m=0:M

....

%Vectorized code

AA(map)=A(:);

Uhat=reshape(AA\b(:),Nx+1,Nx).';

end

end

toc

Matt J
on 29 Jul 2021

Yeah, I didn't see that A was complex-valued. So,

AA(map)=rehape(A,[],1);

Uhat=reshape(AA\b(:),Nx+1,Nx).';

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!