# Single precision matrix multiplication

17 views (last 30 days)
Donald Liu on 12 Dec 2019
Edited: Matt J on 12 Dec 2019
Script:
M = 10000; N = 100;
A = round(10000 * complex(randn(M, N), randn(M, N)));
B = single(A') * single(A);
C = B' - B;
max(abs(C(:))) %// ==> answer is non-zero
A = single(A);
B = A' * A;
C = B' - B;
max(abs(C(:))) %// ==> answer is zero
Not sure what caused the difference? Thanks!

Matt J on 12 Dec 2019
Edited: Matt J on 12 Dec 2019
The Matlab interpreter runs a special dedicated "ctranspose-multiply" function to evaluate expressions of the form,
P'*Q
It does not do any explicit transposition of P, thus saving time and memory. This special function likely does a pre-check for the special case when P and Q are the same variable. In this case, it knows that it only has to compute the lower (or upper) triangular part of the resulting matrix, and can just copy the conjugate of that to the upper(lower) triangle. Thus, the output of A'*A will always be perfectly Hermitian.
However, for the expression,
single(P')*single(Q)
no special functions are used. Every operation in this expression (single,ctranpose,mtimes) is done separately and explicitly, and can generate floating point errors that may not be Hermitian across the final matrix.

James Tursa on 12 Dec 2019
Yes, but "A'*A will always be perfectly symmetric" should instead read "A'*A will always be perfectly Hermitian". And the triangular copy is a conjugate copy, not a straight copy.
Matt J on 12 Dec 2019
Right, you are! Fixed.

James Tursa on 12 Dec 2019
Edited: James Tursa on 12 Dec 2019
To illustrate what Matt is saying, a simple timing test:
>> format longg
>> S = round(10000*single(rand(5000)+rand(5000)*1i));
>> St = S';
>> tic;SaS=S'*S;toc
Elapsed time is 1.675010 seconds.
>> tic;StS=St*S;toc
Elapsed time is 2.942037 seconds.
>> isequal(SaS,StS)
ans =
logical
0
>> isequal(SaS',SaS)
ans =
logical
1
>> isequal(StS',StS)
ans =
logical
0
The difference in timing is because the S'*S method only does about half the multiplication work with some added conjugate copying. And since the off-diagonal elements are the result of a copy, you get an exact Hermitian result for this case. Not so for the generic multiply case ... more on that below.
Another thing to note is that even though the individual elements are made of up integers, the intermediate sums overflow the precision of a single precision variable. Hence the difference in the results depending on the order of calculations even though the result is also made up of exact integers. E.g.,
>> SaS(1,1)
ans =
single
3.353673e+11
>> StS(1,1)
ans =
single
3.353673e+11 + 3408i
>> dot(S(:,1),S(:,1))
ans =
single
3.353671e+11
>> S(:,1)'*S(:,1)
ans =
single
3.353672e+11
Four different answers for the (1,1) spot depending on how we do the calculation.
Now lower the integer values so the intermediate sums do not overflow the precision of a single precision variable:
>> S = round(10*single(rand(5000)+rand(5000)*1i));
>> St = S';
>> tic;SaS=S'*S;toc
Elapsed time is 1.807634 seconds.
>> tic;StS=St*S;toc
Elapsed time is 2.977281 seconds.
>> isequal(SaS,StS)
ans =
logical
1
>> SaS(1,1)
ans =
single
329958
>> StS(1,1)
ans =
single
329958
>> dot(S(:,1),S(:,1))
ans =
single
329958
>> S(:,1)'*S(:,1)
ans =
single
329958
Here we get exactly the same results for the four different methods because the sums didn't overflow the precision of single precision.

Donald Liu on 12 Dec 2019
In single(A), we're converting element a + j * b to single precision a1 + j * b1.
In single(A'), the same element becomes a + j * (-b) and it is converted to single precision yielding a2 + j * (-b2).
Due to rounding differences of positive and negative numbers, b1 and b2 may not be equal. Thus single(A') * single(A) is not strictly Hermitian.
However, even single(A)' * single(A) is not strictly Hermitian, which is still puzzling.
Thanks!

Show 1 older comment
Donald Liu on 12 Dec 2019
Since each element has integral real and imaginary parts, there should be no rounding errors?
James Tursa on 12 Dec 2019
You do the rounding before you do the multiply, so all the downstream calculations start with exactly the same values. The rounding you did has no effect on this. If everything starts with integral values, it won't matter what order you do the calculations AS LONG AS you don't overflow the precision of the variable type you are using. But in your case you did overflow this, hence the differences.
Matt J on 12 Dec 2019
However, even single(A)' * single(A) is not strictly Hermitian, which is still puzzling.
Note that the first single(A) in the expression will not occupy the same memory address as the second single(A). Therefore, Matlab's ctranpose-multiply routine might not be recognizing them as the same matrix. Example:
>> format debug; A=rand(3); B=single(A), C=single(A),
B =
3×3 single matrix
m = 3
n = 3
pr = 12ebdd7e0
0.7265 0.2352 0.7033
0.6667 0.6863 0.5338
0.0327 0.2811 0.3319
C =
3×3 single matrix
m = 3
n = 3
pr = 12eb5dae0
0.7265 0.2352 0.7033
0.6667 0.6863 0.5338
0.0327 0.2811 0.3319
By contrast, in the following, Q and A do occupy the same memory address
M = 10000; N = 100;
A = round(10000 * complex(randn(M, N), randn(M, N)));
A = single(A);
Q=A;
B = Q' * A;
C = B' - B;
max(abs(C(:)))
ans =
single
0