# Difference between self-calculated RMSE and RMSE calculated with Statistics Toolbox

Rica on 5 Aug 2016
Commented: Sairam Seshapalli on 12 Nov 2019
Hi all, I calculated the RMSE of these data:
Y_hat=[
9.774614325191857
9.453084986417043
9.502166049524247
7.817755496590051
7.031233831915310
8.392026077578970
6.881255539731626
6.488927374899896
6.779374282307657
6.474790314047517
13.842988631876649
13.113764172190285
14.244292841981128
12.470726075747763]
Y=[
8.900000000000000
8.600000000000000
9.167000000000000
7.000000000000000
7.030000000000000
7.270000000000000
7.430000000000000
7.270000000000000
7.370000000000000
7.030000000000000
15.029999999999999
13.170000000000000
13.369999999999999
13.630000000000001]
My calculation is based on the formula RMSE = sqrt(mean((Y_hat-Y).^2)). With this calculation I got RMSE = 0.7847.
But with the Statistics Toolbox of MATLAB I got RMSE = 0.885, which is the square root of my calculated value! Who is wrong: I or the Toolbox?
Thank you!
Rica on 8 Aug 2016
Edited: Rica on 8 Aug 2016
Hi, thanks for the comments. I ran a multiple linear regression with the predictors X1 and X2. The function fitlm calculates the regression coefficients, R^2, and RMSE.
X1=1.0e+02 *[
4.794100000000000
4.830800000000000
5.043100000000000
4.059800000000000
3.179700000000000
4.608300000000000
3.795500000000000
3.299600000000000
3.431000000000000
3.635300000000000
8.896799999999999
8.344199999999999
8.839100000000000
5.600200000000000
]
%
X2=[33.979999999999997
32.450000000000003
32.310000000000002
26.309999999999999
24.230000000000000
27.989999999999998
22.489999999999998
21.550000000000001
22.649999999999999
20.910000000000000
45.509999999999998
43.130000000000003
47.439999999999998
44.899999999999999]
The result is Y_hat = 0.5262 + 0.003757*X1 + 0.21916*X2. The code is:
%
X_f=[ones(size(X1)) X1 X2];
X_f_lm=X_f(:,2:end);
mdl=fitlm(X_f_lm,Y,'linear')
I got this:

the cyclist on 8 Aug 2016
You calculated the RMSE incorrectly -- and then had a remarkable numerical coincidence.
You calculated
RMSE = sqrt(mean(((Y-Y_hat).^2)))
which is equivalent to
RMSE = sqrt(sum(((Y-Y_hat).^2)/N_obs))
where N_obs is the number of observations. (N_obs = 14 in your case.) You got the value RMSE = 0.7847.
But the correct calculation of RMSE divides by the number of degrees of freedom, not the number of observations. The correct RMSE calculation is
RMSE = sqrt(sum(((Y-Y_hat).^2)/(N_obs-rankX)))
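where rankX is the rank of the design matrix, i.e. the number of estimated coefficients (the intercept plus the coefficients of X1 and X2), so rankX = 3 in your case.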
So,
RMSE = sqrt(sum(((Y-Y_hat).^2)/11))
and is equal to 0.8853 (as MATLAB got).
The numerical coincidence, and complete red herring, is that this is very nearly equal to the square root of your incorrect value.
You can see where (the latest version of) MATLAB does the calculation of MSE around lines 1436-1440 of the file LinearModel.
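As a rough check, both divisors can be compared directly on the data from the question. This is only a sketch: the names sse, rmse_n and rmse_dof are made up for illustration, and rankX = 3 is taken from the reasoning above rather than computed.

```matlab
% Fitted and observed values copied from the question.
Y_hat = [ 9.774614325191857;  9.453084986417043;  9.502166049524247; ...
          7.817755496590051;  7.031233831915310;  8.392026077578970; ...
          6.881255539731626;  6.488927374899896;  6.779374282307657; ...
          6.474790314047517; 13.842988631876649; 13.113764172190285; ...
         14.244292841981128; 12.470726075747763];
Y     = [ 8.900;  8.600;  9.167;  7.000;  7.030;  7.270;  7.430; ...
          7.270;  7.370;  7.030; 15.030; 13.170; 13.370; 13.630];

N_obs = numel(Y);   % number of observations (14)
rankX = 3;          % assumed: intercept + X1 + X2

sse = sum((Y - Y_hat).^2);               % sum of squared residuals

rmse_n   = sqrt(sse / N_obs);            % divides by the number of observations
rmse_dof = sqrt(sse / (N_obs - rankX));  % divides by the residual degrees of freedom

fprintf('RMSE, divided by N_obs:         %.4f\n', rmse_n);    % about 0.7847
fprintf('RMSE, divided by N_obs - rankX: %.4f\n', rmse_dof);  % about 0.8853
fprintf('sqrt of the first value:        %.4f\n', sqrt(rmse_n));
```

The last line prints the square root of the first value, which lands close to the second one only by coincidence.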
Sairam Seshapalli on 12 Nov 2019
How are you going to find the rank?
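One way to get the rank, sketched here under the assumption that the design matrix is the intercept column plus X1 and X2 from the question (values rounded to the printed precision), is to build that matrix and call rank on it. The variable names X and dof are illustrative only.

```matlab
% Predictors copied from the question (rounded to the printed precision).
X1 = 1.0e+02 * [4.7941; 4.8308; 5.0431; 4.0598; 3.1797; 4.6083; 3.7955; ...
                3.2996; 3.4310; 3.6353; 8.8968; 8.3442; 8.8391; 5.6002];
X2 = [33.98; 32.45; 32.31; 26.31; 24.23; 27.99; 22.49; ...
      21.55; 22.65; 20.91; 45.51; 43.13; 47.44; 44.90];

% Design matrix: a column of ones for the intercept, then the predictors.
X = [ones(size(X1)) X1 X2];

rankX = rank(X);        % number of linearly independent columns
N_obs = size(X, 1);     % number of observations
dof   = N_obs - rankX;  % residual degrees of freedom used in the RMSE

fprintf('rankX = %d, N_obs = %d, dof = %d\n', rankX, N_obs, dof);
```

If the model was fitted with fitlm, the returned LinearModel object should also expose these numbers as the properties NumEstimatedCoefficients and DFE, so they can be read from mdl without rebuilding the matrix.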
