Map a function over specific dimensions of matrix

10 views (last 30 days)
Consider an N dimensional matrix M. Consider a function f that eats k-dimensional matrices.
I want to build a function F such that the following code :
B=F(M,@f,k);
Returns a matrix B having N-k dimensions. The coefficient B(i_1,...,i_{N-k}) is equal to f(squeeze(M(i_1,...,i_{N-k},:))
To do so I created the following function :
function B = mapping(M, f, k)
% MAPPING map a function f that must have a k dimension matrix as input to
% the last k dimensions of a matrix M
% B = mapping(M,@f,k) returns a matrix M of N-k dimensions where N is the
% dimension of M such that B(i1,..,i_{N-k})=f(M(i1...i_{N-k},:))
size_M=size(M);
% We start by creating an intermediate matrix A that will be required in
% the function arrayfun. This matrix has the same size as the returned B
% matrix but verifies A(k)=k where k is the 1 dimensional indice
if(numel(size_M)-k>1)
% general case where A (and B) will have at least 2 dimensions
A=[1:prod(size_M(1:end-k))];
A=reshape(A, size_M(1:end-k));
elseif(numel(size_M)-k==1)
A=[1:size_M(1)];
else
error("The function to map maps all the dimensions of the matrix or more.");
end
function fct = F(Ai)
% We first take the subscript associated to the number Ai
b_dim=numel(size_M(1:end-k));
subscripts=cell(b_dim,1);
% We take the indices corresponding to Ai
[subscripts{:}]=ind2sub(size_M(1:end-k),Ai);
subscripts=subscripts.';
% We then apply the function f to the good element of matrix M
fct=f(squeeze(M(subscripts{:},:)));
end
B=arrayfun(@F, A);
end
My question are :
Above all : does it actually already exist a matlab function doing the purpose I want ? Or am I forced to create it ?
It it doesn't already exist such a function I would like to know if there are tricks to speed up what I want, because here it is too slow for my purpose.
If you want to make a try :
M=rand(8,10,2)
% The matrix B will have 2 dimensions (B is a 8x10 matrix) and B(i,j)=norm(squeeze(M(i,j,:)))
B=mapping(M,@norm,1);
  2 Comments
Guillaume
Guillaume on 1 Nov 2018
I'm not sure I completely understand what you're trying to do. It does look like it's more complicated than it should be.
One thing that is not consistent with your explanation and code: you say that f works on matrices with k dimensions, yet squeeze(M(i_1,...,i_{N-k},:) implemented in your code as squeeze(M(subscripts{:},:)| is always going to be a column vector.
M(subscripts{:},:) is going to be 1 x 1 x 1 ... x prod(size_M(end-k+1:end), that is a vector in the (N-k)th dimension where dimension N-k+1 to N are all concatenated in dimension N-k.
It may be a minor issue since you get the result you want, but the input to f is rarely going to have k dimensions.
StarBucK
StarBucK on 1 Nov 2018
Actually you are right i just figured out by posting the message that it can only work with a function working with 1D matrix (vectors).
So two questions actually, how optimize the code (so for f working with vectors) and how can i modify to let f working on more than 1d matrices ? I need to replace M(subscripts{:},:) by another expression but i dont know how to make it.

Sign in to comment.

Accepted Answer

Guillaume
Guillaume on 2 Nov 2018
You would have to modify your function F like this to achieve what I think you want:
function fct = F(Ai)
% We first take the subscript associated to the number Ai
b_dim=numel(size_M(1:end-k));
subscripts=cell(1, b_dim);
% We take the indices corresponding to Ai
[subscripts{:}]=ind2sub(size_M(1:end-k),Ai);
% replicate individual indices to the size of the k-dimensional submatrices
size_k = size_M(end-k+1:end);
subscripts = cellfun(@(idx) repmat(idx, size_k), subscripts, 'UniformOutput', false);
% emulate colon for the remaining dimension by explicitly listing all indices
k_subscripts = arrayfun(@(sz) 1:sz, size_k, 'UniformOutput', false);
[k_subscripts{:}] = ndgrid(k_subscripts{:});
% convert subscripts to linear indices and index
M_k = M(sub2ind(size(M), subscripts{:}, k_subscripts{:}));
% We then apply the function f to the good element of matrix M
fct=f(M_k);
end
That is you have to translate your subscripts back to linear indices and explicitly list all the indices for the remaining dimensions instead of an unknown number of colons.
However, yes the whole thing is way overly complicated. You only need to do:
function B = mapping(M, f, k)
B = cellfun(@(M_k) f(squeeze(M_k)), num2cell(M, k+1:ndims(M)));
end
Alternatively, you could permute the dimensions which would avoid the squeeze in the cellfun (and hence the cost of the anonymous function) but require a squeeze at the end. As permute is expensive, you may not gain anything:
function B = mapping(M, f, k)
B = squeeze(cellfun(f, num2cell(permute(M, [k+1:ndims(M), 1:k]), 1:k)));
end
  4 Comments
StarBucK
StarBucK on 2 Nov 2018
Edited: StarBucK on 2 Nov 2018
For the first part of your message : so actually the only thing I have to decide is to take a convention, the job is already done. Like if my function f returns me a matrix of p dimension the only thing I have to consider is how to organize my new dimensions right ?
For your edit : I am not very aware of those considerations. Imagine I have a matrix A, the distance between A(k) and A(p) is something like Abs(k-p) I assume. But if I do A(k)+A(p), matlab doesn't know if they are either far or close right ? (it would be the same time of execution in both case) So just to understand, why is it important that the data are close in memory in this example ?
Guillaume
Guillaume on 2 Nov 2018
First part: yes. The simplest would be to concatenate the p-dimensioned matrices in the p+1 dimension:
B = cellfun(@(M_k) f(squeeze(M_k)), num2cell(M, k)), 'UniformOutput', false);
B = cat(ndims(B{1})+1, b{:});
2nd part: In theory it will have an impact on num2cell although I haven't investigated how much. Let's take the example
>>A = cat(3, [1 3 5; 2 4 6], [7 9 11; 8 10 12])
A(:,:,1) =
1 3 5
2 4 6
A(:,:,2) =
7 9 11
8 10 12
It is stored in memory as
[1 2 3 4 5 6 7 8 9 10 11 12]
If you issue num2cell(A, 1) matlab just has to split the matrix along contiguous blocks of memory:
{[1;2], [3;4], [5;6], etc..}
If you issue |num2cell(A, [1 2]), same:
{[[1;2] [3;4] [5;6]], [[7;8] [9;10] [11;12]]}
Copying continuous blocks of memory is fast, particularly with modern processors design.
However, if you use num2cell(A, 2) matlab has to skip memory blocks to fetch the elements of each matrix:
{[1 3 5], [2 4 6], [7 9 11], [8 10 12]}
More importantly, if the submatrices to pass to f are always in the first k dimensions, you don't even need the num2cell. A simple loop would work:
function B = mapping(M, f, k)
%k: integer < ndims(B) specifying which first k dimensions to pass to f
size_M = size(M);
size_k = size_M(1:k);
B = zeros(size_M(k+1:end));
stride = prod(size_k);
startidx = 1;
for idx = 1:numel(B);
B(idx) = f(reshape(M(startidx:startidx+stride-1), size_k));
startidx = startidx + stride;
end
end
Again, you're iterating over continuous blocks of memory (of length stride so it will be fast.

Sign in to comment.

More Answers (0)

Categories

Find more on Performance and Memory in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!