# Confidence interval on dependent variable obtained through two consecutive linear regressions

FastCar on 18 Jun 2020
Commented: the cyclist on 18 Jun 2020
Dear all,
I have an independent variable vector x and two dependent variables, y1 and y2.
y1 is given by
y1 = a * x + b
where a and b are obtained from a linear regression, so I have the standard deviation (standard error) of both parameters.
y2 is given by
y2 = c * y1 + d
where c and d are obtained from a second linear regression, so I have the standard deviation (standard error) of both of these parameters as well.
I would like to compute the confidence interval for the variable y2 expressed as a function of the variable x.
the cyclist on 18 Jun 2020
It gets complicated quickly.
One reason is that when you do an ordinary linear regression of the form
y = a * x + b;
one of the assumptions is that x is measured without error (or at least with negligible error). Your second regression explicitly violates that assumption, because y1 carries error, and that error carries over into the estimation of the parameters for y2. The ordinary least-squares estimates of c and d will therefore generally be biased; the slope in particular is typically pulled toward zero.
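The effect of error in the regressor can be seen in a small simulation. This is only a sketch, written in Python (standard library only) rather than MATLAB. Every number in it is made up for illustration (true slope 2, true intercept 1, the noise levels, the sample size), and `ols` is a hypothetical helper, not a library function.

```python
import random

def ols(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(0)           # fixed seed so the run is repeatable
true_c, true_d = 2.0, 1.0        # assumed "true" parameters of y2 = c*y1 + d
n = 5000

# Error-free regressor, and the same regressor observed with noise.
y1_true = [rng.uniform(0.0, 10.0) for _ in range(n)]
y1_noisy = [v + rng.gauss(0.0, 2.0) for v in y1_true]

# y2 depends on the error-free values, plus its own measurement noise.
y2 = [true_c * v + true_d + rng.gauss(0.0, 0.5) for v in y1_true]

c_clean, d_clean = ols(y1_true, y2)    # regressor without error
c_noisy, d_noisy = ols(y1_noisy, y2)   # regressor with error

print("slope, error-free regressor:", c_clean)
print("slope, noisy regressor:     ", c_noisy)
```

With these assumed settings the slope from the noisy regressor comes out well below the true value of 2, while the error-free fit recovers it closely. The shrinkage factor is roughly var(true y1) / (var(true y1) + var(noise in y1)).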
Technically, you should do a Deming regression (or some other errors-in-variables model) for the second regression, unless the error in y1 happens to be very small compared to the variability of y2.
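For reference, a Deming fit has a closed-form solution once you assume a value for the ratio of the error variances. The sketch below is Python (standard library only), not MATLAB. The function name `deming` and the parameter `delta` (the assumed ratio of the error variance in y to the error variance in x) are illustrative choices, and the simulated data use made-up values.

```python
import random
from math import sqrt

def deming(x, y, delta=1.0):
    """Deming fit of y = slope * x + intercept.

    delta is the assumed ratio var(error in y) / var(error in x).
    It has to be supplied by the user; it is not estimated here.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    if sxy == 0.0:
        raise ValueError("sample covariance is zero; slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + sqrt(diff * diff + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx

# Simulated example with assumed noise levels: sd 2.0 in the regressor,
# sd 0.5 in the response, so delta = 0.5**2 / 2.0**2.
rng = random.Random(1)
truth = [rng.uniform(0.0, 10.0) for _ in range(5000)]
y1_obs = [v + rng.gauss(0.0, 2.0) for v in truth]
y2_obs = [2.0 * v + 1.0 + rng.gauss(0.0, 0.5) for v in truth]

c_dem, d_dem = deming(y1_obs, y2_obs, delta=0.5 ** 2 / 2.0 ** 2)
print("Deming slope and intercept:", c_dem, d_dem)
```

The catch is that `delta` has to come from somewhere, for example from repeated measurements or from the residual variance of the first fit. If the assumed ratio is wrong, the estimate is biased again.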
Another reason is that you don't know (or at least haven't specified) whether y2 might also depend directly on x, in addition to the dependence mediated by y1.
So I do think it is probably possible to derive how the uncertainties in the regression parameters relate to one another, but one would first need to spell out a number of assumptions.
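To make that concrete, here is one possible set of assumptions and the approximate interval that follows from it. It is a sketch, not a validated method, and it is in Python (standard library only) rather than MATLAB. It assumes that y2 depends on x only through y1, so that y2 = c*(a*x + b) + d. It assumes the estimates of (a, b) are independent of the estimates of (c, d), that the estimates of c and d are themselves valid, and that a first-order (delta-method) expansion with a normal quantile is adequate. The function `y2_interval` and all of its parameter names are hypothetical. The slope-intercept covariances `cov_ab` and `cov_cd` default to zero only for convenience. They are usually not zero and should be taken from each fit's covariance matrix when available.

```python
from math import sqrt
from statistics import NormalDist

def y2_interval(x, a, b, c, d, var_a, var_b, var_c, var_d,
                cov_ab=0.0, cov_cd=0.0, level=0.95):
    """Approximate interval for the mean of y2 = c*(a*x + b) + d at a given x.

    var_* are the squared standard errors of the fitted parameters.
    Only parameter uncertainty is propagated, not the scatter of
    individual observations around the fitted lines.
    """
    y1 = a * x + b
    y2 = c * y1 + d

    # Partial derivatives of y2 with respect to a, b, c and d.
    g_a, g_b, g_c, g_d = c * x, c, y1, 1.0

    # First-order variance; cross terms between (a, b) and (c, d)
    # are dropped because the two fits are assumed independent.
    var_y2 = (g_a ** 2 * var_a + g_b ** 2 * var_b + 2.0 * g_a * g_b * cov_ab
              + g_c ** 2 * var_c + g_d ** 2 * var_d + 2.0 * g_c * g_d * cov_cd)

    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sqrt(var_y2)
    return y2 - half_width, y2, y2 + half_width

# Usage with made-up parameter values and standard errors.
lo, mid, hi = y2_interval(x=3.0, a=1.5, b=0.2, c=2.0, d=1.0,
                          var_a=0.05 ** 2, var_b=0.10 ** 2,
                          var_c=0.08 ** 2, var_d=0.20 ** 2)
print(lo, mid, hi)
```

If any of these assumptions does not hold, for example if y2 also depends directly on x, the interval above does not apply as written.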
It's certainly not something I would expect to see in a built-in MATLAB function, as you hoped.