This submission is a very simple code routine that I have used for many years for finding a maximal subset of linearly independent columns of a matrix. It is based on an old conversation with Bruno Luong, which has recently resumed here,
and where he gives some mathematical explanation behind the method. I post this here for ease of reference, as it seems to be a frequently sought tool by Matlab Community members.
Extract a linearly independent set of columns of a given matrix X
X: The given input matrix
tol: A rank estimation tolerance. Default=1e-10
Xsub: The extracted columns of X
idx: The indices (into X) of the extracted columns
>> A=eye(3); A(:,3)=A(:,2)
1 0 0
0 1 1
0 0 0
Matt J (2020). Extract linearly independent subset of matrix columns (https://www.mathworks.com/matlabcentral/fileexchange/77437-extract-linearly-independent-subset-of-matrix-columns), MATLAB Central File Exchange. Retrieved .
Basically, the QR decomposition is used to obtain a decomposition of the rank-r matrix A into the block form A*E=Q*[T,d; 0 0] where E is a column permutation matrix and T is an r-by-r upper triangular sub-matrix with non-zero decreasing diagonals. Because of the structure of the right hand side, we see that the sub-matrix Q*[T;0] has full rank r. Moreover, Q*[T;0] coincides with the first r columns of A*E and therefore these first r columns are also of full rank r. In other words, these columns are linearly independent and span the column space of A*E and therefore also of A. This is what we were looking for. Therefore, to obtain a linearly independent spanning subset of the original matrix A, we need only permute its columns to obtain A*E and extract the first r columns of A*E.
I would love to ask how would you chose independent columns from raw matrix while we only have transformed matrices? Any explanation or reference I can read would be appreciated?
@HN, It just means the code has always worked for me, but I don't know why, mathematically. It relies on the empirical observation that in the QR decomposition, the R matrix is diagonally dominant, but I haven't seen a source to say if or why that's always true. If I find out why, though, I will amend the description. Thanks for your feedback!
By the way, does the statement "but to this day, I have not been able to find a rigorous mathematical explanation for why it works" implies there is no theoretical possibility to obtain linearly independent columns for the given matrix?