
# Inconsistent Accuracy, Precision, Recall, and F1 Score


Hi there,

I'm attempting to calculate precision, recall, and F1 score, but the results come out as NaN. Could you please help me find the bug in the computation?

Thank you

```matlab
act = [0; 0.0269; 0.0354; 0; 51.9578; 34.4936; 10.0596; 4.2331; 3.4373; 2.0611; ...
       2.9576; 3.1177; 1.9092; 1.8423; 5.5713; 4.7685; 3.8489; 3.8223; 52.2738; 26.4217];

pred = -0 + (100+0)*rand(size(act,1),1); % between a = 0 and b = 100

[m,order] = confusionmat(act,pred);
cmtx = m';
diagonal = diag(cmtx); % Bug here
SumOfrows = sum(cmtx,2);
precision = diagonal ./ SumOfrows;
OverallPrecision = mean(precision);
SumOfCol = sum(cmtx,1);
recall = diagonal ./ SumOfCol';
OverallRecall = mean(recall);
f1Score = 2* ((OverallPrecision*OverallRecall)/ OverallPrecision+OverallRecall);
```

```
precision = 39×1
    NaN
    NaN
    NaN
    NaN
    NaN
    NaN
    NaN
    NaN
    NaN
    NaN
     ⋮

OverallPrecision = NaN

recall = 39×1
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     ⋮

OverallRecall = NaN

f1Score = NaN
```
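As an illustrative aside (in Python rather than MATLAB, and with made-up stand-in values, not the thread's full data): `confusionmat` treats every distinct value in `act` and `pred` as its own class, so with continuous actuals and random predictions the diagonal stays all-zero, and any class that is never predicted produces a 0/0 — which is exactly where the NaNs come from.

```python
# Minimal sketch (stand-in values, an assumption for illustration):
# every distinct value becomes its own class, the diagonal stays all-zero,
# and classes that are never predicted produce 0/0 -> NaN precision.
import math

act  = [0.0, 0.0269, 0.0354, 51.9578]   # a few of the actual values
pred = [12.34, 87.2, 3.1415, 60.0]      # stand-in "random" predictions; none match

labels = sorted(set(act) | set(pred))   # one class per distinct value
n = len(labels)
idx = {v: i for i, v in enumerate(labels)}

cm = [[0] * n for _ in range(n)]        # cm[i][j]: actual class i predicted as j
for a, p in zip(act, pred):
    cm[idx[a]][idx[p]] += 1

# per-class precision = diagonal / column sum; 0/0 where a class is never predicted
precision = []
for j in range(n):
    col = sum(cm[i][j] for i in range(n))
    precision.append(cm[j][j] / col if col else math.nan)

print(precision)                        # a mix of 0.0 and nan entries
```

Every diagonal entry is zero because no random prediction ever exactly equals an actual value, so even the non-NaN precisions are 0 — and the mean over a vector containing NaN is NaN.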

##### 20 Comments

dpb
on 20 Jun 2023

No idea why you would consider it a "bug"; your input vector of predicted values is totally bogus relative to the actual data vector you used -- the odds are about one in the number of stars in the galaxy that any value at all will match and thereby register a hit on the diagonal.

Secondarily, when confusionmat builds the class list, the 20-element input gets expanded to 39 classes, and quite a number of rows/columns then contain no hits at all; when those are summed they're still zero, and dividing by zero creates NaN.

You could get rid of the NaN by removing all zero rows/columns from m first, but the result is still nonsensical given the input.
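dpb's trimming suggestion, sketched in Python with a tiny made-up confusion matrix (an assumption for illustration): drop any class whose row and column are both all-zero before dividing, and the 0/0 NaNs disappear.

```python
# Tiny made-up confusion matrix: class 1 never occurs as actual or predicted,
# so its row and column are all zero and would yield 0/0 in the metrics.
cm = [
    [1, 0, 0],
    [0, 0, 0],   # empty class -> source of the NaN
    [2, 0, 1],
]

# keep only classes that appear somewhere as an actual or a predicted value
keep = [i for i in range(len(cm))
        if any(cm[i]) or any(row[i] for row in cm)]
trimmed = [[cm[i][j] for j in keep] for i in keep]

print(trimmed)   # the empty class is gone; no all-zero rows/columns remain
```

As dpb says, this only removes the NaN symptom; it does not make the metrics meaningful for continuous data.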

the cyclist
on 20 Jun 2023

I am sorry if this sounds harsh, but there is so much wrong here that it is difficult to know how to advise you.

Here is the biggest, overarching problem. You are using a method (precision, recall, F1 score based on a confusion matrix) that is meant for the prediction of categories of data, but you are applying it to continuous measurements. (You have also used random predictions of those continuous values, but I assume you built a bad model on purpose?)

@dpb has explained technically why you are seeing nonsense output. But why conceptually are you trying to do it this way?

Life is Wonderful
on 21 Jun 2023

Edited: Life is Wonderful
on 21 Jun 2023

Thank you - since the input prediction was incorrect, the subsequent calculation was meaningless and I have discarded it. Could you perhaps recommend a strategy that improves the prediction while remaining robust?

Please bear in mind that I do not have DL/NN toolbox support for creating predictions on actual data sets that use pixel motion vectors.

@the cyclist, thank you - I understand the constraint, so please assist with your suggestion.

Thank you very much!!

the cyclist
on 21 Jun 2023

In a typical prediction application, you would have a set of explanatory variables that are used to predict a response variable. One tries to find a mathematical model that gives the best prediction.

You have only shown us some actual values, and a very strange random guess at the prediction. You don't mention having any other variables that could be used to predict the response.

Life is Wonderful
on 21 Jun 2023

Edited: Life is Wonderful
on 21 Jun 2023

The problem I am seeing is: which method suits the prediction of these values?

When I feed the fitlm output to the predict function, the F1 score is always NaN. Is the range the issue, or is there something else I'm missing?

In the examples below, I replace the actual values with random numbers and treat the prediction as random as well.

Take the examples here (good case 1 vs. bad case 2):

```matlab
% case 1, the OK case; I am puzzled by the f1 score exceeding 1 when the
% random numbers are smaller
t_mse = randi([0 1e2],1e3,1); % max value was originally 2, now 1e2
pred  = randi([0 1e2],1e3,1); % max value was originally 2, now 1e2

[m,order] = confusionmat(t_mse,pred);
cm = m;
cmtx = cm';
diagonal = diag(cmtx);
SumOfrows = sum(cmtx,2);
precision = diagonal ./ SumOfrows;
OverallPrecision = mean(precision);
SumOfCol = sum(cmtx,1);
recall = diagonal ./ SumOfCol';
OverallRecall = mean(recall);
f1Score = 2* ((OverallPrecision*OverallRecall)/ OverallPrecision+OverallRecall);

fprintf('%20s|%20s|%20s|\n--------------------+--------------------+--------------------+\n', ...
    'OverallPrecision','OverallRecall','f1Score');
fprintf('%20.8f|%20.8f|%20.8f|\n',OverallPrecision,OverallRecall,f1Score);
```

```
    OverallPrecision|       OverallRecall|             f1Score|
--------------------+--------------------+--------------------+
          0.00446723|          0.00351890|          0.01407562|
```
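One concrete bug worth flagging here (my observation, sketched in Python): the f1Score line is missing parentheses around the denominator, so `2*((P*R)/P + R)` algebraically reduces to `4*R` — which is exactly why the printed 0.01407562 is four times the OverallRecall of 0.00351890, and why this "F1" can exceed 1.

```python
# Reproduce the parenthesization bug and the corrected harmonic mean.
P, R = 0.00446723, 0.00351890          # the values printed in case 1

buggy = 2 * ((P * R) / P + R)           # (P*R)/P == R, so this is 4*R

def f1(p, r):
    # correct F1: harmonic mean of precision and recall, never above 1
    return 2 * p * r / (p + r) if p + r else 0.0

print(buggy)        # ~4 * R; unbounded, can exceed 1 for large P, R
print(f1(P, R))     # the correct (still tiny) F1 for these values
```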

```matlab
% case 2, NOT OK
mdl = fitlm(1:size(t_mse,1),t_mse,'RobustOpts','bisquare');
ypred = predict(mdl,t_mse);

[m,order] = confusionmat(t_mse,ypred);
cm = m;
cmtx = cm';
diagonal = diag(cmtx);
SumOfrows = sum(cmtx,2);
precision = diagonal ./ SumOfrows;
OverallPrecision = mean(precision);
SumOfCol = sum(cmtx,1);
recall = diagonal ./ SumOfCol';
OverallRecall = mean(recall);
f1Score = 2* ((OverallPrecision*OverallRecall)/ OverallPrecision+OverallRecall);

fprintf('%20.8f|%20.8f|%20.8f|\n',OverallPrecision,OverallRecall,f1Score);
```

```
                 NaN|                 NaN|                 NaN|
```

dpb
on 21 Jun 2023

As @the cyclist so appropriately pointed out (and I neglected to mention in simply showing you why the calculations resulted in NaN), the whole concept of what you're doing is simply a wrongheaded approach to whatever the underlying problem is -- applying a categorical method to continuous data is simply incorrect, and it doesn't matter whether the actual numbers returned are finite or not; they're meaningless either way.

The only hope here is to go back to the basics of what the actual problem is (that is, the research question or hypothesis to be tested, NOT just the attempt to apply a confusion matrix to continuous data to somehow create a finite number) and develop a clear statement of the problem. THEN one can begin to assess what would be appropriate analysis techniques, but it's impossible to address the fundamental problem here with no understanding of what the data are, how they were obtained, and what it is that is to be inferred from them. As noted, you have shown no corollary variable(s) from which to even begin to build some sort of predictive model; predicting a correlation from random variables is a nonstarter from the beginning; ain't agonna' happen.

the cyclist
on 21 Jun 2023

Again, I don't mean to sound harsh, but you seem to be picking MATLAB functions without any understanding of what they actually do. It is also extremely confusing why you keep creating random values for comparisons. Why are you not using your actual data? The predictions should never be random.

The first, most important thing you do not seem to understand is the difference between trying to predict values that are in categories (e.g. "red", "green") versus continuous values (e.g. 1.23, 3.45, 7.77). In technical terms: are you trying to solve a classification problem, or a regression problem? This matters both for the method used to make the prediction and for the method used to assess the quality of that prediction.

Both of the example cases you show are problematic, because they confuse all of the above. They mix up methods intended for binary classification (even though your data have more than 2 categories) with regression methods such as fitlm.

I think it would be helpful for you to read through the two documentation pages I posted. This forum is probably not the best for teaching you all you do not know.
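To make the classification-vs-regression point concrete: for continuous targets like these, regression error metrics are the counterpart of precision/recall. A minimal sketch (Python, with toy numbers that are assumptions for illustration) of the usual candidates:

```python
# Standard regression metrics: mean squared error, mean absolute error, R^2.
def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def mae(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def r2(y, yhat):
    # coefficient of determination: 1 - residual SS / total SS
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

y    = [1.0, 2.0, 3.0, 4.0]      # toy actuals
yhat = [1.1, 1.9, 3.2, 3.8]      # toy predictions

print(mse(y, yhat), mae(y, yhat), r2(y, yhat))
```

These are well-defined for any continuous prediction, with no confusion matrix involved; MATLAB's fitlm reports comparable quantities (e.g. RMSE and R-squared) on the fitted model.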

dpb
on 21 Jun 2023

Life is Wonderful
on 21 Jun 2023

OK - let me admit that I am not very strong at statistical modelling and would like to know how to use the data below.

I'm making available the original image data (o_mse) and the temporal motion vector data (t_mse). These are the original logs; please bear with me and assist with the accuracy, precision, recall, and F1 score.

Thank you

```matlab
o_mse = [0, 53.0140943319746, 53.0174255213059, 53.0182121794476, 53.0191662962238, ...
         53.0207961599339, 53.0231994839782, 53.0248982971504, 53.0276025441255, 53.0296999351091, ...
         53.0322192093296, 53.033221321686, 53.0371926386068, 53.0374168724086, 53.0374168724086, ...
         53.0361925310774, 53.0365178615195, 53.0381386025013, 53.0450089298561, 53.0458412996819, ...
         53.0472699859074, 53.0081868158045, 53.0187603691157, 53.0292366503645, 53.0298888447649, ...
         53.0335182171649, 53.0301261767046, 53.0306945862575, 53.0359725648643, 53.0373254170595, ...
         53.0379566788641, 53.0369836250416, 53.0413604595606, 53.0413604595606, 53.0413604595606, ...
         53.0416131955181, 53.0416131955181, 53.0420048739545, 53.0423344173259, 53.0426604412793, ...
         53.0432157468736, 53.0326738044732, 53.0326352021884, 53.0367541855715, 53.0367541855715, ...
         53.0367541855715, 53.0367541855715, 53.0377051227673, 53.0380840079377, 53.0381971988997, ...
         53.0383938413165, 53.0586314770791, 53.0615714628554, 53.0615714628554, 53.0615714628554, ...
         53.0615714628554, 53.0615714628554, 53.0615714628554, 53.0615714628554, 53.0613712659631, ...
         53.0615714628554];

t_mse = [0, 0.734883686015063, 0.192475991125898, 0.0327014299129151, 0.0195078600810337, ...
         0.0304594622834278, 0.0520938386410102, 0.0211458705022888, 0.0631774038012625, 0.0516649444109554, ...
         0.0582856533974002, 78.7300796990242, 0.181192266114234, 0.00577350803774446, 0, ...
         0.0249932929088307, 0.0203441872971613, 0.0181028555606772, 0.279849527878201, 0.020207244529619, ...
         0.0426865766599146, 78.701662450448, 0.418040866334219, 0.522407636613862, 0.0479860367387668, ...
         0.0905073281583677, 0.429369239613629, 0.0187073024105004, 0.019364943427583, 0.071410618170089, ...
         0.0399294236524782, 78.7449961282335, 0.292824699746113, 0, 0, ...
         0, 0, 0.0215379979322234, 0.00781729938984511, 0.0330821879169216, ...
         0.0263516330559444, 78.7306102961175, 0.11237252592514, 0.272035293886666, 0, ...
         0, 0, 0.0406184916866731, 0, 0.0243201213384746, ...
         0, 78.7573365501741, 0.219304252229747, 0, 0, ...
         0, 0, 0, 0, 0.00912834159920203, ...
         0];

figure;
plot(1:length(o_mse),o_mse,1:length(o_mse),t_mse,'g-x');
legend('o-mse','t-mse','Location','best');
xlabel('AbsNum'); ylabel('Magnitude'); title('Pixel-motion-vector');
```

the cyclist
on 21 Jun 2023

Thank you for sharing your actual data. That is helpful.

Accuracy, precision, recall, and f1 score are not sensible metrics for these data, in any way that I can imagine. Why do you believe those metrics are meaningful for your problem?

Forget those metrics (for now). Forget prediction, models, etc.

What question are you trying to answer about these data?

Life is Wonderful
on 21 Jun 2023

Edited: Life is Wonderful
on 21 Jun 2023

I feel so because statistical modelling is useful when analysing large datasets and can help determine how well an algorithm works when tested on a large amount of data.

What I'm attempting to answer from the video data is whether I can be confident that no mistakes are made when analysing the video frames.

the cyclist
on 21 Jun 2023

You are correct that statistical modeling might be able to help with your problem. But those particular metrics you are trying to use are not useful.

Are o-mse and t-mse the only data you have about the video? Looking at those data, how would a human being know if there are any mistakes?

dpb
on 21 Jun 2023

Edited: dpb
on 21 Jun 2023

What are "o-mse" and "t-mse", really? It's extremely difficult to conceptualize solutions without a clear understanding of just what the problem is -- and we don't know what those variables really represent.

It would seem highly unlikely to be able to predict anything about whatever an error would have been in "analysing" the video frames from the o-mse data; it's essentially a short ramp to a plateau; there's no info in that.

It's not clear what the t-mse data represent as noted; what does that have to do with the video and some error/mistake in analyzing it?

The thing that is lacking here is any definition of what was/was not a mistake for each frame; without that there's really nothing that can be said about prediction cuz it's unknown what the result was; ergo, how's one going to predict?

Back to @the cyclist's point about particular test statistics, to make this into a dataset for which a categorization metric could be used would be to have some metric that you recorded be an "Error/No Error" binary value for each measurement -- that would be the reference. Then you have the problem of what is available that has any correlation to that metric to use as one or more predictors.
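dpb's last point can be sketched as follows (Python, with a hypothetical spike threshold of 10.0 and a made-up predictor — both are assumptions, not something from the thread): once each frame carries a binary Error/No Error label, precision, recall, and F1 become well-defined.

```python
# Turn the continuous t_mse trace into binary per-frame labels, then score
# a (hypothetical) predictor with the standard binary-classification metrics.
t_mse = [0.03, 0.02, 78.73, 0.18, 0.05, 78.70, 0.42, 0.02]  # stand-in trace

actual_error    = [v > 10.0 for v in t_mse]   # reference labels via assumed threshold
predicted_error = [False, False, True, True, False, True, False, False]  # hypothetical

tp = sum(a and p for a, p in zip(actual_error, predicted_error))       # true positives
fp = sum((not a) and p for a, p in zip(actual_error, predicted_error)) # false positives
fn = sum(a and (not p) for a, p in zip(actual_error, predicted_error)) # false negatives

precision = tp / (tp + fp) if tp + fp else 0.0
recall    = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(precision, recall, f1)
```

The hard part, as dpb notes, is not this arithmetic but obtaining a trustworthy Error/No Error reference label per frame and a predictor variable that actually correlates with it.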

Life is Wonderful
on 22 Jun 2023

Edited: Life is Wonderful
on 22 Jun 2023

o-mse stands for original mean square error, and t-mse stands for temporal (motion-vector) mean square error.

o-mse comes from the original frame - the input variable - while t-mse captures the flaw generated relative to o-mse, i.e. artefacts - the output variable.

So in my figure I'm comparing the original and the output data. What I want to learn is this: if the input information is trained and we have a predicted value, which I then compare with my algorithm's output (the t_mse value), I will know in which direction the motion vector is likely to move versus the actual movement observed, which will give me error probability statistics.

As you can see from the plots, a periodic spike in green indicates a quick change in the information.

From the aforementioned analytics data, I believe no additional information is required from a data standpoint; nonetheless, please let me know your expectations and help me determine what type of statistical model you would recommend.

Thank you very much!!

the cyclist
on 22 Jun 2023

```matlab
load("mse.mat","o_mse","t_mse")

figure
hold on
plot(o_mse)
plot(t_mse)
set(gca,"YLim",[52.95 53.10]);
legend(["o\_mse","t\_mse"])
```

Here I have zoomed way in to observe what o_mse does at the spikes in t_mse. I still don't really understand what you are trying to predict.

Life is Wonderful
on 22 Jun 2023

dpb
on 22 Jun 2023

Nor I. At that resolution, there is at least some variability in the o variable, and there is some variation in its magnitude that does correlate with the spikes, sorta', but it's certainly not consistent in nature; some have a local minimum, others don't. Some have a leading change in slope that could be detected; at least one shows almost nothing in that vein, either...

But:

Step 1. No idea, because it has still never been explained what the output is.

Step 2. See Step 1.

Life is Wonderful
on 23 Jun 2023

Edited: Life is Wonderful
on 23 Jun 2023

> At that resolution, there is at least some variability in the o variable and there is some variation in its magnitude that does correlate with the spikes

Does this mean a sophisticated statistical prediction technique would fail to detect a small movement? My assumption is that analysing a given input should yield a forecast very close to the actual movement, meaning accuracy would be much greater, the error would be at its lowest, and the F1 score would be closer to 1.

At the same time, please bear with me - the data I have taken covers numerous video frames, and for a very good reason: at point 20 you can see the original mean square error (o_mse) and a spike (t_mse).

You say you can't describe it because of this. I'm not sure whether predicting that a spike changes 'sorta'' by a magnitude of x will do the trick.

Specifically, I'm looking for a perceptible change in the original that results in the object movement being captured by the pixel motion and showing up as an artefact at fnum=12 in the additional photo; please see the attached picture.

I am sure you can mimic a pseudo implementation and recommend an option I can test out.

Thank you

I tapped a new set of data; if the new data is helpful, I can add it here. For reference purposes, I am adding the data plots.

dpb
on 23 Jun 2023

Edited: dpb
on 23 Jun 2023

Well, that's certainly a different result; those two have almost perfect correlation.

However, what can be inferred from the above, and what sort of analysis to apply, is far too complex a question to answer in this forum; it would need an in-depth consultation to gain sufficient understanding of the problem to do so.

Good luck in finding someone in your environment that has such expertise...

### Answers (0)
