The most efficient way to add a new column or new row to a sparse matrix

33 views (last 30 days)
Hi, suppose I have a large scale sparse matrix A, i.e. size(A)=[100k,10k]. I am wondering, what is the most efficient way to append a new row/column at the end of matrix A?

Accepted Answer

James Tursa
James Tursa on 14 Jul 2017
In general, every time you significantly change the size of a sparse matrix (so that you exceed the current nzmax of the matrix) all of the current data must be copied to a new memory block. If you do this repeatedly in your program, this can seriously drag your performance. So typically the most efficient way to change the size of the sparse matrix is to gather up all of the changes you wish to make and then apply them to the sparse matrix all at once. That way this copying of the current data only happens once.
If you have to add more data to the sparse matrix in a piecemeal fashion, then the best course of action would be to allocate enough extra memory up front to hold that additional data, and then add that data in as appended column data at the end. That way the existing sparse data would not need to be copied to a new memory block every time you added the new data. E.g., a simplistic example:
I pre-allocate space for three elements, but only have one non-zero element to begin with. As new non-zero elements are appended, the memory block stays the same (i.e., the pr data pointer points to the same memory block). So the current data is undisturbed and the only thing that happens is the new data gets appended to the end of the current data. All well and good. But then with the fourth element I exceed the three element nzmax limit. This causes the entire data set to be copied over into a new memory block (i.e. the pr pointer changes).
>> format debug
>> S = sparse(1,1,1,10,10,3)
S =
Structure address = 649a7b8
m = 1
n = 1
pr = 1eda8d80
pi = 0
(1,1) 1
>> S(2,1) = 2
S =
Structure address = 649a7b8
m = 2
n = 1
pr = 1eda8d80 <-- Same memory block
pi = 0
(1,1) 1
(2,1) 2
>> S(3,1) = 3
S =
Structure address = 649a7b8
m = 3
n = 1
pr = 1eda8d80 <-- Same memory block
pi = 0
(1,1) 1
(2,1) 2
(3,1) 3
>> S(4,1) = 4
S =
Structure address = 649a7b8
m = 4
n = 1
pr = 1edf1530 <-- New memory block
pi = 0
(1,1) 1
(2,1) 2
(3,1) 3
(4,1) 4
At this point you could keep appending data until the new nzmax(S) is exceeded (whatever that happens to be), at which point the entire data must be copied again into a newly allocated memory block.
Bottom line is that this incremental process of appending data can get very expensive to your runtime if you don't have that extra space pre-allocated ahead of time. But if you create your sparse matrix from the get-go with the extra memory and you append your new data as columns at the end, then the current data will not need to be recopied every time you append new data and things will go much faster.

More Answers (1)

Andrei Bobrov
Andrei Bobrov on 13 Jul 2017
% Adding a new row to the end of the matrix A:
A(end+1,[190 200]) = [7 89];
% Adding a new column to the end of the matrix A:
A([190 200],end+1) = [7 89];

Categories

Find more on Sparse Matrices in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!