Inconsistent Accuracy, Precision, Recall and F1 Score

Question

Life is Wonderful on 20 Jun 2023

https://se.mathworks.com/matlabcentral/answers/1985854-inconsistant-accuracy-precision-recall-and-f1-score

Commented: Life is Wonderful on 24 Jun 2023

Hi there,

I'm attempting to calculate Precision, Recall, and F1 score, but I see NaN in my calculations. Could you please assist me in resolving a computation bug?

Thank you

act =[
    
0
0.0269
0.0354
0
51.9578
34.4936
10.0596
4.2331
3.4373
2.0611
2.9576
3.1177
1.9092
1.8423
5.5713
4.7685
3.8489
3.8223
52.2738
26.4217];
pred = -0 + (100+0)*rand(size(act,1),1); %between a = 0 to b = 100
[m,order] = confusionmat(act,pred);
cmtx = m';
diagonal = diag(cmtx); % Bug here 
SumOfrows = sum(cmtx,2);
precision = diagonal ./ SumOfrows;
OverallPrecision = mean(precision);
SumOfCol = sum(cmtx,1);
recall = diagonal ./ SumOfCol';
OverallRecall = mean(recall);
f1Score = 2* ((OverallPrecision*OverallRecall)/ OverallPrecision+OverallRecall);
precision,
precision = 39×1
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
OverallPrecision,
OverallPrecision = NaN
recall,
recall = 39×1
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
OverallRecall,
OverallRecall = NaN
f1Score
f1Score = NaN
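
A note on where the NaN values likely come from: confusionmat treats every distinct number as its own class. With continuous values, a prediction almost never equals an actual value exactly, so the diagonal is all zeros and some classes are never predicted at all, which gives 0/0. The sketch below uses small hypothetical vectors (not the data above) to reproduce the effect; it assumes the Statistics and Machine Learning Toolbox is available for confusionmat.

```matlab
% Hypothetical continuous values, chosen only to illustrate the effect
act  = [0; 0.5; 1.5; 2.5];     % "actual" values
pred = [0.1; 0.6; 1.4; 2.6];   % "predicted" values, none equal to an actual value

% Rows of m are actual classes, columns are predicted classes
[m, order] = confusionmat(act, pred);

numClasses = numel(order);     % every distinct number is a separate class
tp = diag(m);                  % exact matches between actual and predicted

% Column sums count how often each class was predicted
predictedCount = sum(m, 1)';

% Classes that were never predicted give 0/0 = NaN
precision = tp ./ predictedCount;
overallPrecision = mean(precision);   % one NaN makes the mean NaN

disp(numClasses)
disp(precision')
```

The same reasoning applies to recall, where the division is by the number of times each class actually occurs.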

20 Comments

Life is Wonderful on 21 Jun 2023

Edited: Life is Wonderful on 21 Jun 2023


@the cyclist

The problem I am seeing is: which method suits the prediction values?

When I feed the fitlm output to the predict function, the F1 score is always NaN. Is the range the issue, or is there something else I'm missing?

In the other case, I replace the actual values with random numbers and treat the prediction values as random too.

Take the example here (good case 1 vs. bad case 2):

% Case 1, OK case. I am puzzled by the F1 score exceeding 1 when the random
% numbers are smaller
fprintf('%20s|%20s|%20s|\n--------------------+--------------------+--------------------+\n', ...
    'OverallPrecision','OverallRecall','f1Score');
    OverallPrecision|       OverallRecall|             f1Score|
--------------------+--------------------+--------------------+
t_mse = randi([0 1e2],1e3,1);% max value was originally 2 now 1e2
pred = randi([0 1e2],1e3,1); % max value was originally 2 now 1e2
[m,order] = confusionmat(t_mse,pred);
cm =m;
cmtx = cm';
diagonal = diag(cmtx);
SumOfrows = sum(cmtx,2);
precision = diagonal ./ SumOfrows;
OverallPrecision = mean(precision);
SumOfCol = sum(cmtx,1);
recall = diagonal ./ SumOfCol';
OverallRecall = mean(recall);
f1Score = 2* ((OverallPrecision*OverallRecall)/ OverallPrecision+OverallRecall);
fprintf('%20.8f|%20.8f|%20.8f|\n',OverallPrecision,OverallRecall,f1Score);
          0.00446723|          0.00351890|          0.01407562|
% Case 2, NOT OK
mdl =  fitlm(1:size(t_mse,1),t_mse,'RobustOpts','bisquare');
ypred =  predict(mdl,t_mse);
[m,order] = confusionmat(t_mse,ypred);
cm =m;
cmtx = cm';
diagonal = diag(cmtx);
SumOfrows = sum(cmtx,2);
precision = diagonal ./ SumOfrows;
OverallPrecision = mean(precision);
SumOfCol = sum(cmtx,1);
recall = diagonal ./ SumOfCol';
OverallRecall = mean(recall);
f1Score = 2* ((OverallPrecision*OverallRecall)/ OverallPrecision+OverallRecall);
fprintf('%20.8f|%20.8f|%20.8f|\n',OverallPrecision,OverallRecall,f1Score);
                 NaN|                 NaN|                 NaN|
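
A side note on the f1Score line used in both cases: because division binds tighter than addition, the expression as written evaluates to 2*((P*R)/P + R), which is 4*R rather than the harmonic mean of precision and recall. That matches the case 1 output, where 0.01407562 is four times 0.00351890. The sketch below compares the two forms using the values printed for case 1; the variable names P, R, f1AsWritten and f1Harmonic are introduced here only for illustration.

```matlab
% Values copied from the case 1 output above
P = 0.00446723;   % OverallPrecision
R = 0.00351890;   % OverallRecall

% As written in the thread: (P*R)/P is evaluated first, then R is added,
% so the result is 2*(R + R) = 4*R
f1AsWritten = 2 * ((P*R) / P + R);

% Harmonic mean of precision and recall (the usual F1 definition)
f1Harmonic = 2 * (P*R) / (P + R);

fprintf('as written: %.8f   harmonic mean: %.8f\n', f1AsWritten, f1Harmonic);
```

The harmonic mean always lies between P and R, so it cannot exceed 1 when both inputs are valid ratios. This only repairs the arithmetic; it does not make the metric meaningful for continuous data.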

dpb on 21 Jun 2023

As @the cyclist so appropriately pointed out (what I neglected to mention in simply showing you why the calculations resulted in NaN), the whole concept of what you're doing is simply a wrongheaded approach to whatever it is that is the underlying problem -- applying a categorical method to continuous data is simply incorrect and it doesn't matter whether the actual numbers returned are finite or not; they're meaningless either way.

The only hope here is to go back to the basics of what actual problem you are trying to solve (that is, the research question or hypothesis to be tested, NOT just the attempt to apply a confusion matrix to continuous data somehow to create a finite number) and develop a clear statement of the problem. THEN one can begin to assess what would be appropriate analysis techniques, but it's impossible to address the fundamental problem here with no understanding of what the data are, how they were obtained and what it is that you are attempting to infer from them. As noted, you have shown no corollary variable(s) from which to even begin to try to make some sort of predictive model; predicting a correlation from random variables is a nonstarter from the beginning; ain't agonna' happen.

the cyclist on 21 Jun 2023

Again, I don't mean to sound harsh, but you seem to be picking MATLAB functions without any understanding of what they actually do. It is also extremely confusing why you keep creating random values for comparisons. Why are you not using your actual data? The predictions should never be random.

The first, most important thing you do not seem to understand is the difference between trying to predict values that are in categories (e.g. "red", "green"), versus continuous values (e.g. 1.23, 3.45, 7.77). In technical terms, are you trying to solve a classification problem, or a regression problem? This is important for both the method used to make the prediction, and for the method used to assess the quality of that prediction.

Both of the example cases you show are problematic, because they confuse all of the above. They mix up methods intended for binary classification (when your data have more than 2 categories) with regression methods such as fitlm.

I think it would be helpful for you to read through the two documentation pages I posted. This forum is probably not the best for teaching you all you do not know.
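
To make the regression side of that distinction concrete, here is a minimal sketch of how continuous predictions are commonly scored: with error magnitudes rather than a confusion matrix. The vectors y and yhat are hypothetical values made up for illustration, and the choice of RMSE, MAE and R^2 is an assumption about which metrics would suit the problem, not something stated in the thread.

```matlab
% Hypothetical continuous data, for illustration only
y    = [1.23; 3.45; 7.77; 2.10; 5.60];   % observed values
yhat = [1.10; 3.80; 7.20; 2.40; 5.50];   % model predictions

residuals = y - yhat;

rmse = sqrt(mean(residuals.^2));   % root-mean-square error
mae  = mean(abs(residuals));       % mean absolute error

% Coefficient of determination: 1 - SS_res / SS_tot
r2 = 1 - sum(residuals.^2) / sum((y - mean(y)).^2);

fprintf('RMSE = %.4f, MAE = %.4f, R^2 = %.4f\n', rmse, mae, r2);
```

None of these quantities require a prediction to match an observation exactly, which is why they stay finite where the confusion-matrix approach returns NaN.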

Life is Wonderful on 21 Jun 2023


OK - Let me admit that I am not very strong at statistical modelling and would like to know how to use the below data.

I'm making available original image data (o_mse) and temporal motion vector data (t_mse). These are the original logs; please be patient and assist me with the accuracy, precision, recall, and f1 score.

Thank you

o_mse =[                 0, 
0140943319746, 
0174255213059, 
0182121794476, 
0191662962238, 
0207961599339, 
0231994839782, 
0248982971504, 
0276025441255, 
0296999351091, 
0322192093296, 
033221321686, 
0371926386068, 
0374168724086, 
0374168724086, 
0361925310774, 
0365178615195, 
0381386025013, 
0450089298561, 
0458412996819, 
0472699859074, 
0081868158045, 
0187603691157, 
0292366503645, 
0298888447649, 
0335182171649, 
0301261767046, 
0306945862575, 
0359725648643, 
0373254170595, 
0379566788641, 
0369836250416, 
0413604595606, 
0413604595606, 
0413604595606, 
0416131955181, 
0416131955181, 
0420048739545, 
0423344173259, 
0426604412793, 
0432157468736, 
0326738044732, 
0326352021884, 
0367541855715, 
0367541855715, 
0367541855715, 
0367541855715, 
0377051227673, 
0380840079377, 
0381971988997, 
0383938413165, 
0586314770791, 
0615714628554, 
0615714628554, 
0615714628554, 
0615714628554, 
0615714628554, 
0615714628554, 
0615714628554, 
0613712659631, 
0615714628554];

t_mse=[                 0,
        0.734883686015063,
        0.192475991125898,
       0.0327014299129151,
       0.0195078600810337,
       0.0304594622834278,
       0.0520938386410102,
       0.0211458705022888,
       0.0631774038012625,
       0.0516649444109554,
       0.0582856533974002,
         78.7300796990242,
        0.181192266114234,
      0.00577350803774446,
                        0,
       0.0249932929088307,
       0.0203441872971613,
       0.0181028555606772,
        0.279849527878201,
        0.020207244529619,
       0.0426865766599146,
          78.701662450448,
        0.418040866334219,
        0.522407636613862,
       0.0479860367387668,
       0.0905073281583677,
        0.429369239613629,
       0.0187073024105004,
        0.019364943427583,
        0.071410618170089,
       0.0399294236524782,
         78.7449961282335,
        0.292824699746113,
                        0,
                        0,
                        0,
                        0,
       0.0215379979322234,
      0.00781729938984511,
       0.0330821879169216,
       0.0263516330559444,
         78.7306102961175,
         0.11237252592514,
        0.272035293886666,
                        0,
                        0,
                        0,
       0.0406184916866731,
                        0,
       0.0243201213384746,
                        0,
         78.7573365501741,
        0.219304252229747,
                        0,
                        0,
                        0,
                        0,
                        0,
                        0, 
      0.00912834159920203, 
                       0];

figure;

plot(1:length(o_mse),o_mse,1:length(o_mse),t_mse,'g-x');legend('o-mse','t-mse','Location','best');

xlabel('AbsNum');ylabel('Magnitude');title('Pixel-motion-vector');

dpb on 21 Jun 2023

Edited: dpb on 21 Jun 2023

What are "o-mse" and "t-mse", really? It's extremely difficult to conceptualize solutions without a clear understanding of just what the problem is -- and we don't know what those variables really represent.

It would seem highly unlikely to be able to predict anything about whatever an error would have been in "analysing" the video frames from the o-mse data; it's essentially a short ramp to a plateau; there's no info in that.

It's not clear what the t-mse data represent as noted; what does that have to do with the video and some error/mistake in analyzing it?

The thing that is lacking here is any definition of what was/was not a mistake for each frame; without that there's really nothing that can be said about prediction cuz it's unknown what the result was; ergo, how's one going to predict?

Back to @the cyclist's point about particular test statistics, to make this into a dataset for which a categorization metric could be used would be to have some metric that you recorded be an "Error/No Error" binary value for each measurement -- that would be the reference. Then you have the problem of what is available that has any correlation to that metric to use as one or more predictors.
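
As a rough sketch of what that could look like once a binary "Error/No Error" reference exists: the labels below are hypothetical, invented only to show the arithmetic, and the detector output is likewise made up. The code assumes the Statistics and Machine Learning Toolbox for confusionmat.

```matlab
% Hypothetical per-frame reference labels: true = error observed in that frame
actual    = logical([0 0 1 0 1 1 0 0 1 0])';
% Hypothetical detector output for the same frames
predicted = logical([0 1 1 0 1 0 0 0 1 0])';

% Fix the class order so the matrix is always 2x2: row/column 1 = false, 2 = true
cm = confusionmat(actual, predicted, 'Order', [false true]);

tn = cm(1,1);   % no error, none predicted
fp = cm(1,2);   % no error, but one predicted
fn = cm(2,1);   % error missed
tp = cm(2,2);   % error correctly predicted

accuracy  = (tp + tn) / sum(cm(:));
precision = tp / (tp + fp);
recall    = tp / (tp + fn);
f1        = 2 * precision * recall / (precision + recall);

fprintf('accuracy %.2f, precision %.2f, recall %.2f, F1 %.2f\n', ...
    accuracy, precision, recall, f1);
```

With only two classes that both occur, none of the denominators is zero, so the scores are finite. Whether they mean anything still depends on having a real reference label and a predictor that carries information about it.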

Life is Wonderful on 22 Jun 2023

Edited: Life is Wonderful on 22 Jun 2023

Hi @dpb @the cyclist

o-mse is for original mean square error, and t-mse stands for time vector mean square error.

o-mse comes from the original frame (the input variable), while t-mse is the flaw generated from o-mse, meaning artefacts (the output variable).

So in my figure, I'm comparing the original and output data. What I want to learn is this: if a model is trained on the input information and gives a predicted value, I can compare it with my algorithm's output, i.e. the t_mse value. That would tell me which direction the motion vector is likely to move versus the movement actually observed, which will give me an idea of the error probability statistics.

As you can see from the plots, a periodic spike in green colour indicates a quick change in information.

From the aforementioned analytics data, I believe no additional information is required from a data need standpoint; nonetheless, please let me know your expectations and assist me in determining what type of statistical model you would recommend for me.

Thank you very much!!

Life is Wonderful on 23 Jun 2023

Edited: Life is Wonderful on 23 Jun 2023


At that resolution, there is at least some variability in the o variable and there is some variation in its magnitude that does correlate with the spikes.

Does this sophisticated statistical prediction technique fail to detect a little movement, implying that data analysis for a given input should provide a very close by forecasted movement, implying that accuracy will be much greater, error will be at its lowest, and F1 score will be closer to 1?

At the same time, please bear with me - Because the data I have taken is for numerous video frames, and for a very good reason, you can see at point 20, original mean square error (o_mse) and spike (t_mse),

You say you can't describe it because of this. I'm not sure whether changing a spike 'sorta' by a magnitude of x will do the trick.

Specifically, I'm looking for a perceptible change in the original that results in the object movement being recorded by the pixel movement and being captured as an artefact in fnum=12 from the additional photo. Please see the attached picture.

I am sure you can mimic a pseudo implementation and recommend an option I can test out.

Thank you

I captured a new set of data. If you think the new data is helpful, I can add it here; for reference purposes, I am adding the data plots.

dpb on 23 Jun 2023

Edited: dpb on 23 Jun 2023

Well, that's certainly a different result; those two have almost perfect correlation.

However, as to what can be inferred from the above and what sort of an analysis to apply is far too complex a question to answer in this forum; would need in-depth consultation in order to have a sufficient understanding of the problem in order to be able to do so.

Good luck in finding someone in your environment that has such expertise...

Life is Wonderful on 24 Jun 2023