Finding neighbors in pages of very large 3D matrix

1 view (last 30 days)
Hello,
Given a 3D tensor, the pages of which are 2D matrices composed of 1s and 0s, I'm looking to find an effiicient method for computing how many neighboring 1s there are in each page of the tensor, and specifically to be able to prune that 3D tensor for only the pages which have below a certain number of neighboring 1s.
An example of this 3D tensor would be created for instance by:
N = [6 3]; % Here we consider 2D matrices which are for instance 6x3
N_total = prod(N);
num_allowable_neighors = 1;
binary_1D = de2bi((1:2^N_total)-1); % We generate all possible binary vectors with this dimension
binary_dim = size(binary_1D,1);
binary_2D = reshape(binary_1D,[binary_dim N]); % We reshape the binary vectors into binary matrices
I have a working code here which is efficiently vectorized:
% We pad the entire 3D tensor to the left and above with 0s
binary_2D = cat(2,zeros(binary_dim,1, N(2)),binary_2D);
binary_2D = cat(3,zeros(binary_dim,N(1)+1,1), binary_2D);
% Now we circshift the entire 3D array along the 2D directions, and look
% for when their is overlap with the original matrix
% (using the == operation), conditioned on there being a 1 in the first place
% (using the .* operation)
% This then gets us the number of neighbors in each direction
y_neighbors = sum(sum((circshift(binary_2D,1,3) == binary_2D).*binary_2D,2),3);
x_neighbors = sum(sum((circshift(binary_2D,1,2) == binary_2D).*binary_2D,2),3);
total_neighbors = x_neighbors+y_neighbors;
good_indices = find(total_neighbors <= num_allowable_neighbors);
The above method is quite efficent for small N. However, when N_total~30 it starts getting very slow, and is practically impossible for N_total~>33 because it is impossible to even construct the binary_1D matrix (too large in memory).
I'm wondering if there is a more efficient way to do this operation which never requires explicitly constructing binary_1D, but instead builds up the result iteratively. I have such a result in 1D assuming num_allowable_neighbors is 0:
N = 20;
% first bit can be free, either 0 or 1
good_indices = [0; 1];
% fill up to Nth bit by adding only no-neighbor indices
for k = 2:N
temp = good_indices;
temp = temp(bitget(temp,k-1)==0);
good_indices = cat(1, good_indices, temp + 2^(k-1));
end
However, I am not sure how to extend this to 2D, and how to add back the functionality of allowing a specified finite number of neighbors, which is critical.
Any help on this matter would be greatly appreciated.
  6 Comments
Adam Shaw
Adam Shaw on 10 Feb 2022
Apologies, I've edited it dimH===binary_dim, I just forgot to rename it when posting the question. So it reshapes into a [2^(N(1)*N(2) N(1) N(2)] tensor.
And yes, the resultant product will still be exponentially scaling, but with a smaller exponent than 2 - and so to iteratively build it up we effectively gain some headroom in memory allowing for larger prod(N) (though of course there will be a limit). For instance, in the bit-level code I posted for the 1D case, it is much more memory-efficient/faster, but the extrapolation to 2D is not entirely clear.
Matt J
Matt J on 10 Feb 2022
Edited: Matt J on 10 Feb 2022
And yes, the resultant product will still be exponentially scaling, but with a smaller exponent than 2
I don't know you can know that the saving will be significant. The O(2^(M/2)) count that I gave was a lower bound, not an upper bound. It's not mathematically obvious (to me) that you will significantly raise your current ceiling on M.

Sign in to comment.

Answers (2)

Walter Roberson
Walter Roberson on 10 Feb 2022
rangesearch() https://www.mathworks.com/help/stats/rangesearch.html with 'kdtree', and possibly 'cityblock'
  1 Comment
Adam Shaw
Adam Shaw on 10 Feb 2022
Could you explain a little bit further? The main limitation is even constructing the very large 3D tensor (approximately something like 2^33x6x6), which then we're trying to prune down to only the pages with below a given number of neighbors. It seems this rangesearch approach would still require constructing that original tensor, whereas the ideal method would allow us to construct the desired "good_indices" without having to actually initialize the whole 3D tensor in the first place.

Sign in to comment.


Matt J
Matt J on 10 Feb 2022
Edited: Matt J on 10 Feb 2022
kernels={[1 1], [1;1], eye(2), fliplr(eye(2))};
counts=0;
for i=1:4
map = convn(binary_3D,kernels{i},'valid')==2;
counts=counts+sum(map,[1,2]);
end
binary_3D(:,:, counts>num_allowable_neighors )=[];
  3 Comments
Matt J
Matt J on 10 Feb 2022
Edited: Matt J on 10 Feb 2022
binary_3D is the name I've chosen for the given 3D array mentioned in your post. The pages are binary_3D(:,:,i), i=1,...,N which you should find are in fewer number after running the code.
If you cannot fit binary_3D in RAM, perhaps you can load it in chunks, applying the above code to each chunk. Either way, you need to clarify, for me at least, in what form the input is provided, if it is not just sitting there for us in the Matlab workspace.
Adam Shaw
Adam Shaw on 10 Feb 2022
I've edited my post to better show off for instance how this binary_3D tensor is created - I worry that for prod(N) larger than say 33 even if it is broken into chunks, the exponential size scaling will make the problem intractable to compute. That's why I was hoping to find a solution using bit-level operations to iteratively "build-up" the result, as shown for 1D in the final code-snippet of original post - that way the entire exponentially sized binary_3D tensor never needs to be created.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!