How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

11 views (last 30 days)
I have a 1001 measuments, in each measurement, there is a curve
y = a*exp(-b*x)
where x and y are vectors containing 8 elements, a and b are scalar parameters I want to fit. The 1001 measurements are independent and I want to get 1001 a and b.
Generally, I would do
% load measured data
% ...
% size(all_xdata) = 1001*8;
% size(all_ydata) = 1001*8;
fun1D = @(x, xdata)x(1)*exp(-x(2)*xdata) % y = a*exp(-b*x)
parfor k = 1:1001
lsqcurvefit(fun1D,x0,all_xdata(k, :), all_ydata(k, :))
end
I learned from here vectorized lsqcurvefit() may give me some speed-up.
However, I am a little confused about how to set the shape of xdata and ydata with vectorized lsqcurvefit(). The key problem here is that I do not know how lsqcurvefit() calculate the sum-of-square. From the documentation,
fun_ND = @... % vectorized version
sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2)
If the size of ydata of 1001*8, what the size of sum-of-squares would be? 1001*1 (sum along each row)or 1*8(sum along each column). Apparently, for curve-fitting, we want it to be 1001*1, i.e. the sum-of-squares should be calculated within each measurement.
For a matrix, sum() would sum along column if there is not a second input provided. I am not sure if sum((fun(x,xdata)-ydata).^2) in the documentaion is pseudo-code or not.
  1 Comment
Xingwang Yong
Xingwang Yong on 14 Jan 2021
I find out myself, one can assume that sum() in lsqcurvefit() would add along row, e.g.
sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2)
if ydata is m-by-n, then sum-of-squares should be m-by-1.
Here is the code, althought it can not prove that sum() add along row directly.
clear
numMeasure = 101;
numPoints = 8;
a = rand(numMeasure,1);
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
fun_ND = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
x0 = [rand(numMeasure,1),rand(numMeasure, 1)];
abfit = lsqcurvefit(fun_ND,x0,all_xdata,all_ydata);
numDisplay = 5;
yfit = fun_ND(abfit(1:numDisplay,:), all_xdata(1:numDisplay,:));
figure;
plot(all_xdata(1:numDisplay,:)', all_ydata(1:numDisplay,:)', 'ok', ...
all_xdata(1:numDisplay,:)', yfit', '-');
Maybe you treat sum() add along column and write another version of fun_ND(), it would also work.

Sign in to comment.

Accepted Answer

Matt J
Matt J on 14 Jan 2021
Edited: Matt J on 14 Jan 2021
if ydata is m-by-n, then sum-of-squares should be m-by-1.
No.
The objective function minimized by lsqcurvefit is always just,
sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2 ,'all')
In other words, the shape of your data matrices is irrelevant to the calculation of the objective function. However, in your case, because each row of your matrices happens to have its own independent set of parameters (a,b), the minimization is separable, i.e., the optimum (a,b) only depends on the sum-of-square terms from its corresponding row. But this has nothing to do with lsqcurvefit. It is just a feature of the parametric structure of your specific problem. If you had organized your data and parametric model column-wise instead of row-wise, however, you would get the exact same result.
Another important thing to understand with lsqcurvefit is that it does not work with your xdata, ydata, or fun_ND output in 1001x8 matrix form. It always does a preliminary reshape of all those matrices into 8008x1 column vectors (see also here). Similarly, when you supply your own sparse Jacobian calculation (which you must do to get the advantages of this technique) lsqcurvefit will expect to receive an 8008x2002 matrix.
As one consequence of this, I think it will be better for you to organize your xdata and ydata as 8x1001 matrices instead of 1001x8 (and change fun_ND accordingly). That way, your Jacobians will have a simple, sparse, block diagonal form, where each of the diagonal blocks is 8x2.
  8 Comments
Xingwang Yong
Xingwang Yong on 25 May 2021
Edited: Xingwang Yong on 25 May 2021
Matt, you are right. I add bound constraints as John D'Errico suggested in his optimization tips (Chapter 17), then all 3 versions converged and match with each other.
clear
N = 100;
counter=0;
for iter = 1:N
%%
disp([num2str(iter), ':']);
numMeasure = 101; % number of curves, for speed, change 1001 to 101
numPoints = 8; % number of data points on each curve
numParams = 2; % number of unknown parameters
a = rand(numMeasure,1);
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
lb = [0,0];
ub = [1,1];
LB = repmat(lb,numMeasure,1);
UB = repmat(ub,numMeasure,1);
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
'StepTolerance',1e-12,'FunctionTolerance',1e-12);
abfit_1D = zeros(numMeasure, 2);
exit_flags_1D = zeros(numMeasure, 1);
output_1D = cell(numMeasure, 1);
opt1 = opt; % jacob pattern for row-wise
opt1.JacobPattern = repmat(speye(numMeasure), numPoints, numParams);
opt2 = opt; % jacob pattern for col-wise
tmpCell = cell(numMeasure,numMeasure);
tmpCell(:) = {sparse(zeros(numPoints,numParams))};
for tmpIter = 1:numMeasure
tmpCell{tmpIter,tmpIter} = sparse(ones(numPoints, numParams));
end
jp = cell2mat(tmpCell);
opt2.JacobPattern = jp;
%%
% 1D
tic
for k = 1:numMeasure
[abfit_1D(k, :),~,~,exit_flags_1D(k),output_1D{k}] = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),lb,ub,opt);
end
toc
% row wise
tic
[abfit_rowwise,~,~,exit_flag1,output1] = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,LB,UB,opt1);
toc
% col wise
tic
abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',LB.',UB.',opt2).';
toc
[~,~,~,exit_flag2,output2] = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',LB.',UB.',opt2);
% Error analysis
percentError1=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
percentError2=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
percentError3=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100;
if any([percentError1,percentError2]>100)
counter = counter + 1;
end
end
counter

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!