Semi-solved:

One X2 value was 0 on the nose, exploding the denominator for that instance of the equation.

This still doesn't tell me why nanmedian returned a real number when nanmean returned (-Inf).

Vector A is a 7627x1 vector containing 150 numeric values, 10 of which are negative; the remaining elements are NaN.

Calling nanmedian(A) returns a real number (2.7462). Calling nanmean(A) returns -Inf. Why does nanmedian return a real value when nanmean does not? How can I get a real value from nanmean?

Of interest may be how I am arriving at vector A:

A=(X1./Y1)./(X2./Y2);

I also tried a for loop unsuccessfully. It returned the same results as the non-for loop version above:

for i=1:length(X1)

A(i)=(X1(i)./Y1(i))./(X2(i)./Y2(i));

end

X1 and X2 are 7627x1 vectors of mostly negative real numbers. Y1 and Y2 are 7627x1 vectors of positive real numbers.

Walter Roberson
on 5 Sep 2017

nanmean for a vector x is the same as

t = x(~isnan(x));

result = mean(t)

So if there is a +/-Inf in the data, it is not affected by the removal of the NaNs. mean() of data that includes +/-Inf is +/-Inf if all of the Inf values have the same sign, and NaN otherwise. A single +/-Inf is enough to make the sum, and therefore the mean, infinite.
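A minimal sketch of this behaviour, using made-up values (the vectors x and y are illustrative, not the poster's data):

```matlab
% NaNs are removed before averaging, but infinities are kept
x  = [NaN 3 -Inf 5 NaN];
t  = x(~isnan(x));           % [3 -Inf 5]
m1 = mean(t);                % -Inf: one infinite term makes the sum infinite

% Infinities of opposite sign give NaN, because -Inf + Inf is undefined
y  = [NaN 3 -Inf Inf];
m2 = mean(y(~isnan(y)));     % NaN
```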

nanmedian of the vector x is the same as

t1 = x(~isnan(x));

t2 = sort(t1);

if mod(length(t2), 2) == 0

result = 1/2 * (t2(end/2)+t2(end/2+1))

else

result = t2( ceil(end/2) );

end

This will produce +/-Inf only if at least half of the non-NaN values are infinite with the same sign, because only then do the middle elements after sorting end up infinite. In the case of a vector with the same number of +Inf and -Inf and no other values, the result could be NaN, due to the attempt to average the -Inf and +Inf that would then be the two central elements.
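The three cases can be sketched with small made-up vectors, using the same sort-and-index steps as above (the variable names are illustrative only):

```matlab
% One -Inf among five non-NaN values: the middle element is still finite
x = [NaN -12 -1 -Inf -5 NaN -8];
t2 = sort(x(~isnan(x)));                  % [-Inf -12 -8 -5 -1]
medFinite = t2(ceil(end/2));              % -8

% Three +Inf among five values: the middle element is infinite
v = sort([1 Inf 2 Inf Inf]);              % [1 2 Inf Inf Inf]
medInf = v(ceil(end/2));                  % Inf

% Only -Inf and +Inf: averaging the two central elements gives NaN
u = sort([-Inf Inf]);
medNaN = 1/2 * (u(end/2) + u(end/2+1));   % NaN
```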

John BG
on 5 Sep 2017

Edited: John BG
on 6 Sep 2017

Hi Balsip

1.

When A contains only finite values and NaNs:

A=[NaN NaN -12 NaN NaN NaN NaN -1 -5 -8 NaN NaN NaN NaN -20 NaN NaN -3];

mean(A)

median(A)

nanmean(A)

nanmedian(A)

ans =

NaN

ans =

NaN

ans =

-8.1667

ans =

-6.5000

As expected, mean and median return NaN, while nanmean and nanmedian ignore the NaNs and match mean and median of the remaining values:

mean([-12 -1 -5 -8 -20 -3])

=

-8.1667

median([-12 -1 -5 -8 -20 -3])

=

-6.5000

.

2.

However, when A contains 1/0 (which evaluates to Inf):

A=[NaN NaN -12 NaN NaN NaN NaN -1 -5 -8 NaN NaN 1/0 NaN -20 NaN NaN -3];

mean(A)

median(A)

nanmean(A)

nanmedian(A)

ans =

NaN

ans =

NaN

ans =

Inf

ans =

-5

or when Inf itself appears in vector A, the results are the same:

A=[NaN NaN -12 NaN NaN NaN NaN -1 -5 -8 NaN NaN Inf NaN -20 NaN NaN -3];

mean(A)

median(A)

nanmean(A)

nanmedian(A)

ans =

NaN

ans =

NaN

ans =

Inf

ans =

-5

.

3.

nanmean is affected by every Inf value, whereas nanmedian stays finite unless about half of the non-NaN values are infinite. The Inf values appear because Y1 and/or X2 contain one or more zeros, which introduces Inf elements in A:

A=(X1./Y1)./(X2./Y2);
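A small sketch of how a single zero in X2 produces -Inf, and how to locate it. The three-element vectors are invented for illustration and only mimic the signs described in the question:

```matlab
% Negative numerators, positive Y values, one exact zero in X2
X1 = [-4; -6; -2];   Y1 = [2; 3; 1];
X2 = [-1;  0; -2];   Y2 = [1; 1; 4];

A = (X1./Y1)./(X2./Y2);      % second element is (-2)/0 = -Inf

badA  = find(isinf(A));      % positions of infinite results in A
zeroX = find(X2 == 0);       % positions of zeros in X2
```

Here badA and zeroX point at the same element, which is the kind of check that identifies the offending input.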

.

4.

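To avoid this, either replace the Infs in A with NaNs directly (isinf catches both +Inf and -Inf, whereas A==Inf misses -Inf)

A(isinf(A))=NaN

or replace the zeros in Y1 and X2 with a small tolerance

tol=.000001;

Y1(Y1==0)=tol;

X2(X2==0)=tol;

Choose tol small enough that it can be ignored relative to your data. Note that the affected elements of A then become very large rather than infinite, so they can still dominate nanmean.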
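As a hedged sketch of the first option, here are two ways to average only the finite entries. The vector A is made up, and the 'omitnan' flag of mean is assumed to be available (it was introduced in R2015a):

```matlab
A = [NaN; 2; -Inf; 4; NaN];

% Option 1: keep only finite entries (drops NaN, +Inf and -Inf in one step)
finiteMean = mean(A(isfinite(A)));     % 3

% Option 2: turn infinities into NaN, then skip NaNs
B = A;
B(isinf(B)) = NaN;
nanSkippedMean = mean(B, 'omitnan');   % 3
```

Option 1 leaves A untouched, which may be preferable if the positions of the infinite values are still needed later.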

5.

It could also be that elements of Y2 and/or X1 take Inf values, but I assumed from the question that this is not the case, so only one or more elements of Y1 and X2 are zero.

6.

Balsip, please note that although

A=(X1./Y1)./(X2./Y2);

and

for i=1:length(X1)

A(i)=(X1(i)./Y1(i))./(X2(i)./Y2(i));

end

are mathematically the same, the compact expression runs about six times faster than the for loop in the timing below

L=1e7;

X1=randi([1 1e4],1,L);Y1=randi([1 1e4],1,L);

X2=randi([1 1e4],1,L);Y2=randi([1 1e4],1,L);

tic

A=(X1./Y1)./(X2./Y2);

toc

Elapsed time is 0.038204 seconds.

tic

for i=1:length(X1)

A(i)=(X1(i)./Y1(i))./(X2(i)./Y2(i));

end

toc

Elapsed time is 0.236449 seconds.

.

The vectorised ./ operator is optimised and outperforms the for loop you tried as a possible solution.
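A quick way to confirm that the two forms give identical results is sketched below. The sizes and ranges are arbitrary, X2 is allowed to contain zeros so that Inf values occur, and the loop output is preallocated:

```matlab
L  = 1e5;
X1 = randi([1 1e4],1,L);  Y1 = randi([1 1e4],1,L);
X2 = randi([0 1e4],1,L);  Y2 = randi([1 1e4],1,L);   % zeros allowed in X2

Avec = (X1./Y1)./(X2./Y2);            % vectorised form

Aloop = zeros(1,L);                   % preallocate the loop result
for i = 1:L
    Aloop(i) = (X1(i)/Y1(i))/(X2(i)/Y2(i));
end

same = isequaln(Avec, Aloop);         % true; isequaln also treats NaNs as equal
```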

.

Balsip

if you find this answer useful would you please be so kind to consider marking my answer as Accepted Answer?

To any other reader, if you find this answer useful please consider clicking on the thumbs-up vote link

thanks in advance

John BG

John BG
on 6 Sep 2017

Edited: John BG
on 6 Sep 2017

happy to help.

Again, consider using the following

A(X2==0)=NaN;

instead of the for loop you have built involving X2 and A, because logical indexing is faster, as the following timings show:

L=5e7;

X1=randi([1 1e4],1,L);Y1=randi([1 1e4],1,L);

X2=randi([1 1e4],1,L);Y2=randi([1 1e4],1,L);

A=(X1./Y1)./(X2./Y2);

tic

for i=1:length(A)

if X2(i)==0

A(i)=NaN;

end

end

toc

Elapsed time is 0.277053 seconds.

L=5e7;

X1=randi([1 1e4],1,L);Y1=randi([1 1e4],1,L);

X2=randi([1 1e4],1,L);Y2=randi([1 1e4],1,L);

A=(X1./Y1)./(X2./Y2);

tic

A(X2==0)=NaN;

toc

Elapsed time is 0.072252 seconds.
