1 view (last 30 days)

Can someone suggest me a trick to make this code faster.

"all_DOY" = 1000*15000*23 ; %(i,j,k) such that k dimension is the day of the year whic is random, while (i,j) are my pixel location starting from (1,1) to (1000,15000). The values of day of the year (DOY) in the kth dimension rranges range from -1 to 366

"all_variable" = 1000*15000*23; This is my variable value for (i,j,k) for the (i,j) position and k dimension i.e., day of the year in the "all_DOY" matrix. The values in the kth dimension range from -3000 to 15000

all_variable_new = 1000*15000*366

I'm trying to update the "all_variable_new" such that, my k dimension is from 1:366 (days of the year). for each day of the year obtained from "all_DOY" matrix, the variable value is updated in the "all_variable_new" for that respective (i,J,k) from "all_variable"

all_variable_new=zeros(1000,15000,366,'int16');

all_variable = load('all_variable.mat');

all_DOY = load('all_DOY.mat');

tic

for ii =1: size(all_DOY,1) % for each ii

for jj = 1:size(all_DOY,2) % for each jj

idx = squeeze(all_DOY(ii,jj,:)); % sequeez k'th dimension. idk is the DOY which is also the position value in all_variable_new

variable = squeeze(all_variable(ii,jj,:)); % obtain the variable values

variable(idx==-1)=[]; % if -1, remove

idx(idx==-1)=[]; % if -1 then remove the values

Ind=idx;

if isempty(Ind)

all_variable_new(ii,jj,:) = NaN; % if there are no idx values then replace with NAN

else

for mm = Ind

all_variable_new(ii,jj,mm) = variable; % for the day of the year values obtained, replace those positions with variable values.

end

end

clear idx Ind mm

end

end

toc

Guillaume
on 10 Mar 2020

Edited: Guillaume
on 10 Mar 2020

If I understood correctly:

pixels_dayofyear = load('all_doy.mat');

pixels_temperature = load('all_variable.mat');

assert(isequal(size(pixels_dayofyear), size(pixels_temperature)), 'Size of matrices doesn''t match!');

%sort each pixel (row, column) temperature (pages) according to the corresponding day of year.

%temperature for invalid days of year (-1) are all replaced by 0 at the end of the page.

pixels_dayofyear(pixels_dayofyear == -1) = NaN; %replace by a value that is sorted last. Could be Inf instead of NaN

[~, order] = sort(pixels_dayofyear, 3); %sort the day of year of each pixel. Get the new order

[rows, cols, ~] = ndgrid(1:size(pixels_dayofyear, 1), 1:size(pixels_dayofyear, 2), 1:size(pixels_dayofyear, 3)); %create row and column indices replicated along the correct dimensions for sub2ind

newpixels_temperature = pixels_temperature(sub2ind(size(pixels_temperature), rows, cols, order)); %reorder temperature according to the sorting of the days

newpixels_temperature(isnan(pixels_dayofyear)) = 0; %replace invalid values by 0

Guillaume
on 12 Mar 2020

"is there a possibility to store both values intead of assuming a mean" No, a matrix element can only have one value. What would be the point anyway? Note that the above doesn't average (but my accumarray solution posted previously can), you just get the last of the two.

"A 10 day moving average, after which a linear interpolation is enough" Wouldn't you do the interpolation beforehand? Otherwise, you're averaging with a lot of zeros.

Either way, I'm not sure how you're going to perform the interpolation efficiently. With much smaller datasets, you'd use a scatteredInterpolant (which would also avoid having to construct that initial matrix full of zeros), but this requires 3 matrices the same size as your final matrix to specify the query points.

However, since you said you're not going to use neighbouring points for calculation the way to get eleminate the memory issue is to divide you satellite image into smaller areas. Split each image into 10 along the rows and columns and you need 100 times less memory.

Guillaume
on 13 Mar 2020

"While interpolating the data, it is required to have all the data even if it is same DOY, instead of mean." I'm not saying it doesn't exist, but I'm not aware of an interpolation method that can cope with two different values at the same point.

"dividing it into 10 rows, makes it hard to keep count of data blocks" I don't see why it would be hard. It's certainly an easy way to get rid of memory issue. At the same time, it also makes it possible to parallelise the processing (but beware of memory usage!) since the processing of the blocks is completely independent of each other. On a supercomputer, I assume that would be a better use of the resources. So, yes, Instead of splitting by year, I'd split by satellite coordinates and process blocks of size MxNx(10*23) -> MxNx366 where M and N are such that the data fits comfortably in memory.

Addendum: Assuming than M and N are reasonable so that the below fits in memory, then this is how I'd implement the transformation from the MxNx(10*23) to MxNx366 matrix. This directly interpolates the MxNx(10*23) data without ever creating the matrix of 0s:

%inputs:

%pixels_dayofyear, a M x N x day matrix indicating the day of measurement of the respective location in pixels_temperature

%pixels_temperature, a M x N x day matrix indicating the temperature on the respective day in pixels_dayofyear

[M, N, days] = size(pixels_dayofyear);

[inrows, incols, ~] = ndgrid(1:M, 1:N, 1:days); %create matrices of locations for each temperature

isvalid = pixels_dayofyear > -1

temperature_interpolant = scatteredinterpolant(M(isvalid), N(isvalid), pixels_dayofyear(isvalid), pixels_temperature(isvalid)); %create interpolant using locations and days as coordinates

%the line below creates 3 double matrices of size M x N x 366. They will use a lot of memory for large M and N!

[qrows, qcols, qday] = ndgrid(1:M, 1:N, 1:366); %matrices of all query locations and days

interpolated_temperature = reshape(temperature_interpolant(qrows, qcols, qday), M, N, 366); %interpolate at all query points and reshape into 3D matrix

Beware that qrows, qcols and qday will require M*N*366*8 bytes of memory. ScatteredInterpolant only accepts doubles so there's no way to save memory there by using a smaller type. However, these 3 arrays can be used unchanged for all the blocks (assuming they're all the same size).

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!
## 4 Comments

## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808172

⋮## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808172

## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808200

⋮## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808200

## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808223

⋮## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808223

## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808230

⋮## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/510156-can-someone-suggest-to-make-this-code-faster#comment_808230

Sign in to comment.