Create sparse matrices with integer rows and columns

Hi,
I am creating a fairly large sparse matrix from row-column-value triplets. Row and column numbers are obviously integers, so I tried changing the types of the row and column arrays from 'double' to 'uint32' to reduce the memory used. I get the following error message: "Undefined function 'sparse' for input arguments of type 'uint32'."
Is there a way to format the row and column arrays so that fewer than 8 bytes per entry are used?
Thanks!

 Accepted Answer

5 Comments

Thanks. I understand that the MATLAB developers didn't want to implement sparse matrices for every single type that MATLAB offers. But I am surprised that the types for rows and columns are not more optimised. Row and column indices are never non-integer doubles, whatever the values in the matrix are. This matters in my program because I have rather limited RAM.
But doubles can encode integers too! In fact, they do so very well.
Making the row and column indices uint32 would not save all that much space. Worse, before long, people would want to create sparse matrices whose row and column indices overflow a uint32. So if TMW allowed this, it would soon have been wasted effort, making the software more complex and more prone to bugs.
(Another reminder for me to finish up my mex-based int8 and uint8 sparse class. Uses a sparse logical for the fundamental storage and the mex code is used to interpret the 1-byte data as int8 or uint8 for a limited set of operations)
Regarding the indexing storage for sparse matrices, MATLAB uses the macro mwIndex in their mex API, which is a typedef for an integer class. On 32-bit MATLAB versions it is an int (signed 32-bit integer). On 64-bit MATLAB versions it is a size_t (unsigned 64-bit integer). Having your input arrays to the sparse command as a certain type (assuming the type is even supported) will have absolutely no effect on how those indexes are actually stored inside the MATLAB variable.
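A minimal sketch of the practical consequence, assuming the triplets are kept as uint32 arrays while they are being assembled. The variable names and the tiny 3 x 3 data set are hypothetical, chosen only for illustration. The indices are cast to double only at the call to sparse, which is what the error message in the question calls for in that MATLAB version:

```matlab
% Hypothetical triplet data: indices held compactly as uint32, values as double.
rows = uint32([1; 3; 2; 3]);
cols = uint32([1; 1; 2; 3]);
vals = [10; 20; 30; 40];

% Cast the indices to double only at the call site. The class of the
% inputs does not change how the indices are stored inside S.
S = sparse(double(rows), double(cols), vals, 3, 3);

% The triplet arrays can be released once S exists.
clear rows cols vals
```

Note that double(rows) and double(cols) still create temporary double arrays during the call, so this reduces memory while the triplets are being collected, not necessarily at the peak.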
@John: Yes, but, in principle, if one knows one's matrix is not larger than 2^32 rows and columns, coding them as uint32 would save 33% of space when storing the triplets (8 (val) +4+4 bytes per entry instead of 8+8+8), which seems fairly serious to me. Anyway, I understand this may not happen so often.
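The 33% figure above refers to the triplet arrays themselves, not to the finished sparse matrix. As a quick check of that arithmetic, with a hypothetical entry count:

```matlab
nEntries = 1e6;                              % hypothetical number of triplets

bytesAllDouble = nEntries * (8 + 8 + 8);     % value, row, column all double
bytesUint32Idx = nEntries * (8 + 4 + 4);     % double value, uint32 row and column

savedFraction = 1 - bytesUint32Idx / bytesAllDouble;   % equals 1/3
fprintf('Triplets: %d vs %d bytes, saving %.1f%%\n', ...
        bytesAllDouble, bytesUint32Idx, 100 * savedFraction);
```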
See this related link (some of which I repeat below for convenience):
The minimum data storage requirement formula for a double m x n sparse matrix with nnz non-zero elements, including the index data, is as follows on a 32-bit system:
bytes = max(nnz,1) * (4 + 8) + (n+1)*4
Which breaks down as follows:
nnz * 4 = Storing the row index of the non-zero elements
nnz * 8 = Storing the non-zero double element values themselves
(n+1)*4 = Storing the cumulative number of non-zero elements through each column
Each non-zero value and its row index get stored, plus the cumulative per-column counts.
For 64-bit systems using 8-byte integers for the indexing you can replace the 4's above with 8's. So, in theory on a 64-bit system, moving from 8-byte index integers to 4-byte index integers would be more like a 25% savings (the 8+8 would become 4+8) assuming the nnz part is dominating the calculation.
These are just the minimum requirements. A sparse matrix can have excess memory allocated beyond the minimum if desired.
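The formula above can be written as a small helper to compare the two index widths. This is only a sketch: minSparseBytes, idxBytes, and the example sizes are hypothetical names and values, not part of any MATLAB API.

```matlab
% Minimum storage in bytes for a double sparse matrix with n columns,
% following the formula above. idxBytes is 4 or 8 (bytes per index integer).
minSparseBytes = @(nnzCount, n, idxBytes) ...
    max(nnzCount, 1) * (idxBytes + 8) + (n + 1) * idxBytes;

nnzCount = 1e6;    % hypothetical number of non-zero elements
n        = 1e4;    % hypothetical number of columns

bytes4 = minSparseBytes(nnzCount, n, 4);   % 4-byte index integers
bytes8 = minSparseBytes(nnzCount, n, 8);   % 8-byte index integers

fprintf('4-byte indices: %d bytes\n', bytes4);
fprintf('8-byte indices: %d bytes\n', bytes8);
fprintf('Saving: %.1f%%\n', 100 * (1 - bytes4 / bytes8));
```

When nnz dominates, the saving comes out close to the 25% mentioned above. For an existing matrix S, whos S reports the bytes actually used, and comparing nzmax(S) with nnz(S) shows whether extra space has been allocated.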


Asked:

on 25 Jul 2016

Edited:

on 26 Jul 2016
