Ridge regression coefficient question

I'm confused about how ridge regression coefficients are generated in MATLAB. Any help would be appreciated. An example of the issue is shown below.
Thanks,
JG
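N = 200;                              % number of observations
p = 30;                               % number of predictors
y = rand(N,1);
X = [ones(N,1),rand(N,p)];            % design matrix with an intercept column
lambda = 1;
% Textbook ridge solution on the unscaled X
R = X'*X + lambda*eye(size(X,2));
b_ridge = R\(X'*y);                   % solve directly instead of forming inv(R)
y_ridge = X*b_ridge;
% MATLAB's ridge with the 0 flag (coefficients on the original scale)
XX = X(:,2:end);                      % predictors without the intercept column
b_ridge_matlab = ridge(y,XX,lambda,0);
y_ridge_matlab = X*b_ridge_matlab;
% Why are b_ridge and b_ridge_matlab different? I thought that
% the 0 option in ridge eliminated all scaling and was useful for
% prediction (i.e., y_pred = X_new*b).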

 Accepted Answer

Good question! This took a while to figure out, and I can see the help text is not clear about it. The calculations are actually always based on a scaled X under the hood, but the results are adjusted later to be usable with the unscaled data. In particular, the ridge parameter is interpreted as applying to the scaled data. You can reproduce the ridge results by computing R in your code as follows:
R = X'*X + lambda*diag(var(X));
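
To see where the diag(var(X)) term comes from, the scaling and un-scaling can be written out. This is a sketch under an assumption not spelled out in the thread: that ridge standardizes each predictor column of XX to zero mean and unit sample standard deviation (normalized by N-1), solves the ridge problem on that scaled matrix Z, and then divides the coefficients by the standard deviations. The symbols X_c, D, and s_j are introduced here for illustration only.

```latex
% X_c is XX with its column means subtracted; s_j is the sample standard
% deviation of column j (N-1 normalization, assumed).
\begin{align*}
Z &= X_c D^{-1}, \qquad D = \operatorname{diag}(s_1,\dots,s_p) \\
\hat{\beta}_Z &= (Z^{\top} Z + \lambda I)^{-1} Z^{\top} y
  && \text{(ridge on the scaled data)} \\
\hat{\beta} &= D^{-1}\hat{\beta}_Z, \qquad
\hat{\beta}_0 = \bar{y} - \bar{x}^{\top}\hat{\beta}
  && \text{(back to the original units)} \\
(X_c^{\top} X_c + \lambda D^{2})\,\hat{\beta} &= X_c^{\top} y, \qquad
D^{2} = \operatorname{diag}\big(\operatorname{var}(x_1),\dots,\operatorname{var}(x_p)\big)
  && \text{(substitute } \hat{\beta}_Z = D\hat{\beta}\text{)}
\end{align*}
```

Under that assumption the penalty on the original-scale slopes is lambda times each column's variance, and the intercept is not penalized. In the augmented matrix X = [ones(N,1), XX] the column of ones has variance 0, which is why lambda*diag(var(X)) leaves the intercept alone.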

3 Comments

Hi Tom,
Thank you for your response. Yes, the modified definition of R gives results consistent with MATLAB. However, this points to a broader question: why would I want to predict with coefficients that do not follow the standard definition, i.e., the one that appears in MATLAB's documentation: beta_hat = inv(X'*X + lambda*eye(p+1))*X'*y? Maybe I don't understand enough about ridge regression in general, or maybe the coefficients from b_ridge_matlab = ridge(y,XX,lambda,0) are meant to be used with some special prediction routine and not just y_ridge_matlab = X*b_ridge_matlab.
Thanks again.
JG
I agree the help text is confusing. The definition you quote is accurate when X is scaled. I think the alternative with the "0" flag ought to be described as presenting the ridge coefficients, computed the same way, but then post-processed so they can be used with the original X variables. Unless I misunderstand, they do serve that purpose. Try changing your script to include a real relationship between X and y, and at the end plot the fitted and observed values:
y = X*(5./(1:31)')+rand(N,1);
...
scatter(y_ridge_matlab,y)
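
For reference, here is one way the whole experiment could be put together. It is only a sketch: the by-hand part assumes that ridge standardizes the predictors with mean and std (N-1 normalization), and the names b_closed, b_hand, mu, sd, and Z are made up for this example. It needs the Statistics Toolbox for ridge, and it uses bsxfun so that it also runs on older releases.

```matlab
% Sketch: compare ridge(y,XX,lambda,0) with a closed form whose penalty
% is lambda*diag(var(X)), using data with a real X-y relationship.
rng(0);                               % fixed seed so reruns give the same numbers
N = 200;
p = 30;
X = [ones(N,1), rand(N,p)];           % design matrix with an intercept column
y = X*(5./(1:p+1)') + rand(N,1);      % real relationship plus noise
XX = X(:,2:end);                      % predictors without the intercept column
lambda = 1;

% (a) Closed form on the original X. var of the ones column is 0,
%     so the intercept is not penalized.
R = X'*X + lambda*diag(var(X));
b_closed = R \ (X'*y);

% (b) Same thing by hand (assumed procedure): standardize, solve on the
%     scaled data, then undo the scaling and recover the intercept.
mu = mean(XX);
sd = std(XX);                         % normalized by N-1, like var
Z = bsxfun(@rdivide, bsxfun(@minus, XX, mu), sd);
b_scaled = (Z'*Z + lambda*eye(p)) \ (Z'*y);
b_slopes = b_scaled ./ sd';
b_hand = [mean(y) - mu*b_slopes; b_slopes];

% (c) Toolbox result with the 0 flag.
b_ridge_matlab = ridge(y, XX, lambda, 0);

fprintf('max |closed - hand|  = %g\n', max(abs(b_closed - b_hand)));
fprintf('max |closed - ridge| = %g\n', max(abs(b_closed - b_ridge_matlab)));

% Fitted versus observed values
y_ridge_matlab = X*b_ridge_matlab;
scatter(y_ridge_matlab, y)
xlabel('fitted'); ylabel('observed');
```

If the assumption about the standardization holds, both printed differences should be at round-off level, and the scatter plot should show the points lying along a line, i.e. the coefficients returned with the 0 flag can be used directly as y_pred = X_new*b.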
Thanks, Tom.

Asked: 16 Feb 2012
Edited: 16 Oct 2013
