Least squares problem with large matrix
7 views (last 30 days)
Show older comments
Hi, I have a simple matrix problem which I believe can be solved in two steps,
Ultimately I am looking for a least squares solution for
Ax = b
So I might say that x = inverse(A)*b. Of course these are large matrices (~1000x1000) so I would never compute an inverse, but I downloaded a Factorize package which factorizes A and never directly computes the inverse. But here is my first problem, I do not know what A is, I have a function which computes Ax for a given x, so my idea was as follows: I make up an x, and find the matrix form for A via A = b(:)*inverse(x(:)), so now A is a matrix of size (~1,000,000 x 1500), because x is size (5x300). Now if I could do this and had A, then to find x it would be a simple inverse problem, but (again utilizing the matrix factorization) but MATLAB runs out of memory. Thus I was wondering if anyone had a an alternative solution to my problem or or any help. Thanks very much!
4 Comments
Daniel Shub
on 28 Sep 2011
I am pretty sure 1000x1000 is not really that large. Unless I am missing something inv(randn(1000, 1000)); goes pretty quick.
Answers (2)
Teja Muppirala
on 1 Oct 2011
Ok. I have an idea.
So you basically have some linear function A(x) -> b that takes the vector space of [5x300] matrices and maps them to the vector space of [1000x1000] matrices.
Since any linear function on a matrix can be expressed as matrix multiplication if you write out the elements into a vector, you are expressing the problem as
A*x(:) = b(:) where A is [1000000x1500], x(:) = [1500x1], and b(:) is [1000000x1].
You want to find the least squares solution to this problem (what x best gets mapped to b), but you don't have the memory to store the matrix A.
In this case, your solution is x = (A'*A)\(A'*b), and to calculate this, you do not need to store the whole A matrix.
All you need is 1500x1500 storage for the symmetric matrix A'*A.
You have a function (I'll give it the name "calculateAx") that gives you Ax for a given x. You can use that function to generate the 1500x1500 matrix because you can evaluate all the individual columns of A one by one.
First column of A = calculateAx([1;0;0;0;0;...])
Second column of A = calculateAx([0;1;0;0;0;...])
Third column of A = calculateAx([0;0;1;0;0;...])
The MATLAB code would look something like this:
AprimeA = zeros(1500,1500);
Aprimeb = zeros(1500,1);
for m = 1:1500
Aprimeb(m) = calculateA([. . 1 . . .])'*b; %There is a 1 in the m-th position
for n = m:1500
AprimeA(m,n) = calculateA([. . . 1. . .])' * calculateA([. . . 1 . . .])
AprimeA(n,m) = AprimeA(m,n);
end
end
x =AprimeA\Aprimeb is now a 1500x1500 sized linear system.
This is a straightforward, absolutely certain way to solve your problem.
That said, calculation time might be a problem. You have to evaluate over a million dot products of 1 million elements in addition to calling your calculateAx function over a million times, and that might take along time. I see 3 solutions to that problem.
First: Use parallel processing if you have the Parallel Computing Toolbox functionality available. You can change one of those loops to a parfor loop and obtain a quite generous speedup.
Second: If you know somehow that the columns of A are more or less independent, and you suspect that AprimeA will be strongly diagonally dominant, then instead of caculating the whole covariance matrix (A'*A is also known as the covariance matrix), just calculate the diagonal entries only, and you will still have a very good estimate for x.
Third: If the x and b in question are not just some random data, but have some sort of structure (just a guess, but from their sizes, are x and b possibly image data?), then maybe you can use a change of basis to approximate and reduce the size of the problem. For example you could use a Fourier basis and express x = [some sines and cosines]*y, and then solve for a reduced orded y instead. You would be throwing out a lot of high frequency components, but if you never needed them in the first place, it's a very reasonable thing to do.
See Also
Categories
Find more on Linear Algebra in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!