Compute only a few entries of a big matrix product

2 views (last 30 days)
Suppose I have a tall matrix x (say 10000 by 10), and a collection of indices {N1, N2, ..... Nk} where each Ni is a subset of {1, 2, ......, 10000}, and has small size (say |Ni| = 4), and {Ni} can overlap. Suppose I want to compute X:=xx', but am only interested in X(Ni, Ni) for all i, and I can set other entries of X to 0. Is there a way to quickly form such an X without doing the entire matrix product xx' and without using forloop?
  6 Comments
Matt J
Matt J on 20 Nov 2023
Edited: Matt J on 20 Nov 2023
each Ni is a subset of {1, 2, ......, 10000}, and has small size (say |Ni| = 4)
Is |Ni| fixed, i.e., independent of i?

Sign in to comment.

Accepted Answer

Bruno Luong
Bruno Luong on 20 Nov 2023
Edited: Bruno Luong on 20 Nov 2023
It will not store the result in 2D matrix but 3D array. I hope you can switch to this format of storage.
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
X=x*x';
toc
Elapsed time is 0.325735 seconds.
tic
[p,q] = size(x);
x(end+1,:)=NaN;
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
toc
Elapsed time is 0.126183 seconds.
% Check first and last results
X(N{1},N{1})
ans = 3×3
4.5222 2.2467 3.3031 2.2467 1.5639 1.8717 3.3031 1.8717 3.3260
Y(1:l(1),1:l(1),1)
ans = 3×3
4.5222 2.2467 3.3031 2.2467 1.5639 1.8717 3.3031 1.8717 3.3260
X(N{k},N{k})
ans = 3×3
3.2240 1.9555 3.6699 1.9555 1.8531 2.7131 3.6699 2.7131 5.1233
Y(1:l(k),1:l(k),k)
ans = 3×3
3.2240 1.9555 3.6699 1.9555 1.8531 2.7131 3.6699 2.7131 5.1233
  2 Comments
Bruno Luong
Bruno Luong on 21 Nov 2023
Edited: Bruno Luong on 21 Nov 2023
This version putback the 3D array result to (sparse) matrix
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
X=x*x';
toc
Elapsed time is 0.307213 seconds.
tic
[p,q] = size(x);
x(end+1,:)=0; % NOTE: x is modified here, not difficult to fix it if not wanted
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
I=repmat(NA,m,1);
J=repelem(NA,m,1);
[IJ,loc] = unique([I(:) J(:)],'rows');
keep = all(IJ<=p,2);
XX=sparse(IJ(keep,1),IJ(keep,2),Y(loc(keep)),p,p);
toc
Elapsed time is 0.162196 seconds.
% Check first and last results
X(N{1},N{1})
ans = 3×3
3.9532 2.9068 2.4660 2.9068 3.2922 2.5177 2.4660 2.5177 2.8154
full(XX(N{1},N{1}))
ans = 3×3
3.9532 2.9068 2.4660 2.9068 3.2922 2.5177 2.4660 2.5177 2.8154
X(N{k},N{k})
ans = 4×4
4.5085 3.3695 3.3036 2.9847 3.3695 3.4139 2.2194 1.9003 3.3036 2.2194 3.6579 2.7151 2.9847 1.9003 2.7151 3.0315
full(XX(N{k},N{k}))
ans = 4×4
4.5085 3.3695 3.3036 2.9847 3.3695 3.4139 2.2194 1.9003 3.3036 2.2194 3.6579 2.7151 2.9847 1.9003 2.7151 3.0315
Bruno Luong
Bruno Luong on 22 Nov 2023
A variation, avoid filter with subindexing, since the elements of the last row and column are 0
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
[p,q] = size(x);
x(end+1,:) = 0; % NOTE: x is modified here, not difficult to fix it if not wanted
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
I = repmat(NA,m,1);
J = repelem(NA,m,1);
[IJ,loc] = unique([I(:) J(:)],'rows');
IJ(IJ>p) = 1;
XX = sparse(IJ(:,1),IJ(:,2),Y(loc), p, p);
toc
Elapsed time is 0.108971 seconds.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!