Different R2 from fitlm than from the MS Excel LINEST function, and SST not equal to SSE + SSR

Hello everyone,
I have created a regression model of the form y~x1+x2+x3 using the fitlm function. However, the R2 reported by the model is very different from what I get from Excel when I fit the same model using the LINEST function. Secondly, I have noticed that the fitlm function in MATLAB returns SSE, SSR, and SST values that do not meet the statistical requirement "SST = SSE + SSR". What could be the cause of this? Below is part of the code I am using in MATLAB.
mdl_SI = fitlm(GNTables{k}(:,end-3:end),'Intercept',false);
ModelCoeff{k} = mdl_SI.Coefficients;
SSE = mdl_SI.SSE
SSR = mdl_SI.SSR
SST = mdl_SI.SST
mdl_SI.Rsquared
Any help will be greatly appreciated.
Martin
  1 Comment
dpb on 10 Oct 2023
Edited: dpb on 11 Oct 2023
load carsmall
X = [Weight,Horsepower,Acceleration];
mdl = fitlm(X,MPG)
mdl =
Linear regression model:
    y ~ 1 + x1 + x2 + x3

Estimated Coefficients:
                   Estimate         SE           tStat        pValue
                   __________    _________    _________    __________
    (Intercept)        47.977       3.8785        12.37    4.8957e-21
    x1             -0.0065416    0.0011274      -5.8023    9.8742e-08
    x2              -0.042943     0.024313      -1.7663       0.08078
    x3              -0.011583      0.19333    -0.059913       0.95236

Number of observations: 93, Error degrees of freedom: 89
Root Mean Squared Error: 4.09
R-squared: 0.752, Adjusted R-Squared: 0.744
F-statistic vs. constant model: 90, p-value = 7.38e-27
[mdl.SSE mdl.SSR mdl.SST mdl.SSE+mdl.SSR]
ans = 1×4
1.0e+03 * 1.4888 4.5160 6.0048 6.0048
mdl.SSE+mdl.SSR==mdl.SST
ans = logical
0
mdl.SST-(mdl.SSE+mdl.SSR)
ans = -9.0949e-13
The difference is floating-point rounding and is not significant either numerically or statistically; an exact == test on floating-point sums will often fail for that reason.
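A minimal sketch of checking the identity with a tolerance rather than with ==. The tolerance value and the variable names relErr and isClose are illustrative assumptions, not anything fitlm provides:

```matlab
load carsmall
X = [Weight,Horsepower,Acceleration];
mdl = fitlm(X,MPG);                 % model with an intercept

tol = 1e-10;                        % assumed relative tolerance; pick one suited to your data
relErr  = abs(mdl.SST - (mdl.SSE + mdl.SSR)) / mdl.SST;   % difference relative to SST
isClose = relErr < tol              % true when SST equals SSE + SSR to within roundoff
```

Scaling the difference by SST keeps the check meaningful whatever the units of the response.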
Now don't fit an intercept term...
mdl = fitlm(X,MPG,'intercept',false)
mdl =
Linear regression model:
    y ~ x1 + x2 + x3

Estimated Coefficients:
           Estimate         SE         tStat       pValue
          __________    _________    ______    __________
    x1    -0.0063238    0.0018485    -3.421    0.00094014
    x2      0.088238     0.035877    2.4595      0.015826
    x3        2.1221      0.14315    14.824    6.9409e-26

Number of observations: 93, Error degrees of freedom: 90
Root Mean Squared Error: 6.71
[mdl.SSE mdl.SSR mdl.SST mdl.SSE+mdl.SSR]
ans = 1×4
1.0e+03 * 4.0484 4.4879 6.0048 8.5364
Now the model doesn't meet the needs of a full linear model: without the intercept term, SSE + SSR (8536.4) no longer equals SST (6004.8). Note also that the fitlm display doesn't even show the overall statistics, including Rsq, because they no longer apply.
Likely Excel doesn't pay attention to such niceties, if it returns Rsq for such a model at all and can even estimate one without the intercept term; I'm not about to dig into that in a MATLAB forum.
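For the record, a sketch of how the two R2 values can come apart for the no-intercept fit. It assumes, as I recall from Excel's LINEST documentation, that with const = FALSE Excel takes the total sum of squares about zero rather than about the mean; that point is worth verifying in Excel itself. All variable names below are illustrative:

```matlab
load carsmall
X = [Weight,Horsepower,Acceleration];
mdl = fitlm(X,MPG,'Intercept',false);

% Keep only the rows fitlm actually used (rows with NaN are dropped from the fit).
used = mdl.ObservationInfo.Subset;
y    = MPG(used);
yhat = mdl.Fitted(used);

SSE            = sum((y - yhat).^2);       % residual sum of squares
SST_centered   = sum((y - mean(y)).^2);    % total sum of squares about the mean
SST_uncentered = sum(y.^2);                % total sum of squares about zero
SSR_uncentered = sum(yhat.^2);             % regression sum of squares about zero

% Without an intercept the residuals are orthogonal to the fitted values,
% so the decomposition holds about zero, not about the mean.
gap_uncentered = SST_uncentered - (SSE + SSR_uncentered)   % roundoff-sized

R2_centered   = 1 - SSE/SST_centered       % R2 measured against the mean of y
R2_uncentered = 1 - SSE/SST_uncentered     % R2 measured against zero (assumed Excel convention)
```

The uncentered value is never smaller than the centered one, because the sum of squares about zero is at least as large as the sum of squares about the mean. That alone can account for a large gap between the two programs on the same data.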
(BTW, this thread was a smart link to another I happened to respond to that had no response -- on investigation it seemed worth noting the cause for the record)
