lasso does not return the correct coefficient estimate when Intercept is set to false and the first column of X is all 1's

Salil Koner on 19 Nov 2021
Answered: Kumar Pallav on 22 Nov 2021
Hi,
I have been testing the lasso function in MATLAB to see whether it produces the correct result for different choices of the parameter values. For the linear model Y = β_0 + Xβ + ε, if I pre-construct the entire design matrix, i.e. Xstar = [1 X], feed the full design matrix to the lasso function, and set the intercept to false, then it does not produce the correct coefficient estimate for the column of 1's. No matter how high the true value of β_0 is, it always returns 0 for the estimate of β_0. However, the estimates of β look fine to me. Is there anything I am missing, or is it a bug that needs to be fixed?
n = 100; % Sample size
p = 5; % Number of predictors
X = normrnd(0,1,[n p]); % n by p design matrix
Xstar = [ones(n, 1) X]; % prepend a column of 1's to X
beta = normrnd(0,1,[p 1]); % true coefficient corresponding to X
beta_0 = 500; % true intercept, very large; should not be shrunk to zero
Y = Xstar * [beta_0 ; beta] + normrnd(0,1,[n 1]); % response
lambda = 0.05; % Very small
% Fitting an intercept model: feed X to the function and let the
% function add the column of 1's by specifying Intercept = true
[betahat_intercept, fitinfo_intercept] = lasso(X , Y , 'Lambda', lambda, 'Intercept', true);
% Fitting a no-intercept model by specifying Intercept = false. But now,
% instead of feeding X, I am feeding Xstar (which includes the column for the intercept)
[betahat_nointercept, fitinfo_nointercept] = lasso(Xstar , Y , 'Lambda', lambda, 'Intercept', false);
fprintf("The estimated intercept for intercept model is %f\n", fitinfo_intercept.Intercept);
fprintf("The estimated intercept for no-intercept model is %f\n", betahat_nointercept(1));

Answers (1)

Kumar Pallav on 22 Nov 2021
Hi,
When 'Intercept' is set to false, the returned intercept value is 0.
If you run the following:
[betahat_intercept, fitinfo_intercept] = lasso(X , Y , 'Lambda', lambda, 'Intercept', true);
[betahat_nointercept, fitinfo_nointercept] = lasso(X , Y , 'Lambda', lambda, 'Intercept', false);
For a single row of data, the response Y can be written as:
y = 500 + w1*x1 + w2*x2 + ... + w5*x5 + noise
Now, if 'Intercept' is set to false, the intercept value is 0 and the weights w1, w2, ..., w5 are adjusted to compensate.
If 'Intercept' is set to true, you get an intercept value of around 500, and the weights are adjusted accordingly.
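The comparison can be reproduced with a short script. This is only a sketch under the setup from the question (same n, p, lambda, and true intercept of 500). The fixed seed, the shortened variable names, and the printed MSE column are additions for illustration and are not part of the original post.

```matlab
% Sketch only: same setup as in the question. The fixed seed is an
% assumption added here for reproducibility; it is not in the original post.
rng(0);
n = 100;                          % sample size
p = 5;                            % number of predictors
X = normrnd(0, 1, [n p]);         % n-by-p predictor matrix (no column of 1's)
beta = normrnd(0, 1, [p 1]);      % true slope coefficients
beta_0 = 500;                     % true intercept
Y = beta_0 + X * beta + normrnd(0, 1, [n 1]);   % response
lambda = 0.05;

% Same X in both calls; only the 'Intercept' option differs.
[b_int, info_int]     = lasso(X, Y, 'Lambda', lambda, 'Intercept', true);
[b_noint, info_noint] = lasso(X, Y, 'Lambda', lambda, 'Intercept', false);

% Reported intercept and in-sample mean squared error of each fit.
fprintf('Intercept = true : intercept %.3f, MSE %.3f\n', info_int.Intercept, info_int.MSE);
fprintf('Intercept = false: intercept %.3f, MSE %.3f\n', info_noint.Intercept, info_noint.MSE);

% Columns: true beta, estimate with intercept, estimate without intercept.
disp([beta, b_int, b_noint]);
```

Comparing the two MSE values next to the reported intercepts gives a rough sense of how well the weights alone can make up for a missing intercept of this size.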
Refer to the lasso documentation for more details.
Hope this helps!
