How to do discontinuous piecewise linear model fitting?
Show older comments
Hi all,
I am interested in identifying the linear or linear piecewise model that best fits my data (cell activity versus spatial location, see figure). I have tried linear and continuous piecewise models with John D'Errico's SLM. However, it appears that a discontinuous fit (with 2 line segments, hand-drawn red lines) may be more appropriate than a continuous fit (with 3 line segments, black lines below). I did not find an option to implement discontinuous piecewise linear fits with SLM.

My questions are:
1) Is there a simple way I can do discontinuous linear piecewise fits? and
2) Compare model fit of a) linear vs b) continuous linear piecewise vs c) discontinuous linear piecewise models (e.g. with AIC)? Thanks!
Data and code:
x=[-4.5,-4.0,-3.5,-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5,7.5,8.0,8.5,9.0,9.5,16.0,16.5];
y=[0.66,0.65,1.39,1.27,1.05,0.90,0.48,1.11,1.42,1.69,1.95,2.00,2.27,3.02,2.37,1.10,0.97,0.95,1.06,1.10,0.74,0.68,0.64,0.53,0.71,1.45,0.77,0.81];
slm = slmengine(x,y,'degree','linear',...
'interiorknots','free','knots',[-8 1.345 4.84 18],'plot','on','minval',0,'maxval',100);
ylim([0 5]);ylabel('Cell activity');
xlim([-7 18]);ylabel('Location');
4 Comments
Bruno Luong
on 15 Aug 2019
"However, it appears that a discontinuous fit (with 2 line segments) may be more appropriate than a continuous fit (with 3 line segments, as below)"
Sorry but to my eye neither looks correctly fit with any small uncertainty.
Bruno Luong
on 15 Aug 2019
Edited: Bruno Luong
on 15 Aug 2019
IMO the "problem" is not due the fact that you impose the model to be continuous, in fact with free knot spline, when the 2 adjadcent knots get closer, and no point is falling stricly in the interval, it just makes a dummy transition, but the left and right model will fit the data independently as if you do not impose the continuity.
The reason that you don't like the fit is that the score (internally compute by SLM) is based on l^2 distance and as you data obviously do not follow a straight line you prefer the red line (but without telling us why), and the black line looks kind of not doing the job.
You should ask John if he can change SLM to other metric than 2-norm to fit your data (it is not clear to be what would be the score function to get the red line).
But again to my eye your data cannot be fitted by 2 lines.
Accepted Answer
More Answers (2)
Bruno Luong
on 16 Aug 2019
Edited: Bruno Luong
on 16 Aug 2019
"Do you have any additional insights or comments on this approach on fitting models?"
Well, I actually have problem to fit your data with just 2 lines. To me there are at least 4 regions with different linear models.
I can't suggest you an automatic way, but at least you can do semi-automatic by chosing the subintervals after running a free-knot splines.
I use here BSFK (very similar to SLM but I know it better)

This graph is obtained from this code
close all
x=[-4.5,-4.0,-3.5,-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5,7.5,8.0,8.5,9.0,9.5,16.0,16.5];
y=[0.66,0.65,1.39,1.27,1.05,0.90,0.48,1.11,1.42,1.69,1.95,2.00,2.27,3.02,2.37,1.10,0.97,0.95,1.06,1.10,0.74,0.68,0.64,0.53,0.71,1.45,0.77,0.81];
% https://fr.mathworks.com/matlabcentral/fileexchange/25872-free-knot-spline-approximation
pp = BSFK(x,y,4,20,[],struct('annimation',1,'sigma',0.001));
subintervals = pp.breaks([2 3;
5 6;
9 13;
13 15]); % selct manually the break points
subintervals = subintervals + [0.1 -0.1]; % reduce the intervals by some margin
hold on
for k=1:size(subintervals,1)
% Fit by a line for each interval
b = x >= subintervals(k,1) & x <= subintervals(k,2);
P = polyfit(x(b),y(b),1);
xlines = subintervals(k,:);
ylines = polyval(P,xlines);
plot(xlines,ylines,'k', 'Linewidth', 2);
end
3 Comments
Bruno Luong
on 16 Aug 2019
Edited: Bruno Luong
on 16 Aug 2019
1) How can I deduce the breakpoints from the plot above? I do see some trends from the red line plot, but I'm not sure which suggested breakpoints (in blue dotted lines) are to be selected and which to be ignored.
I use my eye as criteria. Sorry.
2) If my final discontinuous model is linear (4 black lines), is there any benefit to use a 3rd-order 20-knot spline to identify breakpoints (as you did), or should I stick with a lower order spline with less breakpoints?
The discontinuity break points are detected as those with large derivative, meaning the free knots are close to each other. As you data are clearly not a straight lines, I think it is better to use hight order to reduce unwanted artifact due to fitting non-linear data with linear one.
3) Furthermore, would it be right to specify knot positions only for a prior-driven model (tissue boundaries), and to use free knots for identifying the true best fit (ignoring known tissue boundaries)?
Yes, In my BSFK function you can impose fix knots and in the same time free knots in between.
Richard
on 18 Aug 2019
Mathieu NOE
on 1 Feb 2021
hello
and fminsearch to create that code - seems to work finding the optimal values for 2 inner break points
% load x and y data
x=[-4.5,-4.0,-3.5,-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5,7.5,8.0,8.5,9.0,9.5,16.0,16.5];
y=[0.66,0.65,1.39,1.27,1.05,0.90,0.48,1.11,1.42,1.69,1.95,2.00,2.27,3.02,2.37,1.10,0.97,0.95,1.06,1.10,0.74,0.68,0.64,0.53,0.71,1.45,0.77,0.81];
xdata = x(:);
ydata = y(:);
% Init of fminsearch
a = (max(xdata)-min(xdata))/8;
b = a*2;
global xdata ydata
x0 = [a,b];
x = fminsearch(@obj_fun, x0);
a_sol = x(1);
b_sol = x(2);
XI = [min(xdata),a_sol,b_sol,max(xdata)]; % vector of 1-D look-up table "x" points
YI = lsq_lut_piecewise( xdata, ydata, XI ); % obtain vector of 1-D look-up table "y" points
% plot fit
plot(xdata,ydata,'.',XI,YI,'+-')
legend('experimental data (x,y(x))','LUT points (XI,YI)')
title('Piecewise 1-D look-up table least square estimation')
function err = obj_fun(x)
global xdata ydata
XI = [min(xdata),x(1),x(2),max(xdata)]; % vector of 1-D look-up table "x" points
YI = lsq_lut_piecewise( xdata, ydata, XI ); % obtain vector of 1-D look-up table "y" points
Yint = interp1(XI,YI,xdata);
err = sum((Yint-ydata).^2);
end
Categories
Find more on Get Started with Curve Fitting Toolbox in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!


