How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

Question

Xingwang Yong on 14 Jan 2021

0
Link

Direct link to this question

https://se.mathworks.com/matlabcentral/answers/716698-how-does-vectorized-lsqcurvefit-calculate-sum-of-square-row-wise-or-column-wise

Edited: Xingwang Yong on 25 May 2021

Accepted Answer: Matt J

Open in MATLAB Online

I have a 1001 measuments, in each measurement, there is a curve

y = a*exp(-b*x)

where x and y are vectors containing 8 elements, a and b are scalar parameters I want to fit. The 1001 measurements are independent and I want to get 1001 a and b.

Generally, I would do

% load measured data
% ...
% size(all_xdata) = 1001*8;
% size(all_ydata) = 1001*8;
fun1D = @(x, xdata)x(1)*exp(-x(2)*xdata) % y = a*exp(-b*x)
parfor k = 1:1001
    lsqcurvefit(fun1D,x0,all_xdata(k, :), all_ydata(k, :))
end

I learned from here vectorized lsqcurvefit() may give me some speed-up.

However, I am a little confused about how to set the shape of xdata and ydata with vectorized lsqcurvefit(). The key problem here is that I do not know how lsqcurvefit() calculate the sum-of-square. From the documentation,

fun_ND = @... % vectorized version
sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2)

If the size of ydata of 1001*8, what the size of sum-of-squares would be? 1001*1 (sum along each row)or 1*8(sum along each column). Apparently, for curve-fitting, we want it to be 1001*1, i.e. the sum-of-squares should be calculated within each measurement.

For a matrix, sum() would sum along column if there is not a second input provided. I am not sure if sum((fun(x,xdata)-ydata).^2) in the documentaion is pseudo-code or not.

1 Comment
Show -1 older commentsHide -1 older comments

Xingwang Yong on 14 Jan 2021

Open in MATLAB Online

I find out myself, one can assume that sum() in lsqcurvefit() would add along row, e.g.

sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2)

if ydata is m-by-n, then sum-of-squares should be m-by-1.

Here is the code, althought it can not prove that sum() add along row directly.

clear
numMeasure = 101;
numPoints = 8;
a = rand(numMeasure,1);
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
fun_ND = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
x0 = [rand(numMeasure,1),rand(numMeasure, 1)];
abfit = lsqcurvefit(fun_ND,x0,all_xdata,all_ydata);
numDisplay = 5;
yfit = fun_ND(abfit(1:numDisplay,:), all_xdata(1:numDisplay,:));
figure;
plot(all_xdata(1:numDisplay,:)', all_ydata(1:numDisplay,:)', 'ok', ...
     all_xdata(1:numDisplay,:)', yfit', '-');

Maybe you treat sum() add along column and write another version of fun_ND(), it would also work.

Sign in to comment.

Sign in to answer this question.

Answer 1

Matt J on 14 Jan 2021

1
Link

Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/716698-how-does-vectorized-lsqcurvefit-calculate-sum-of-square-row-wise-or-column-wise#answer_597748

Edited: Matt J on 14 Jan 2021

Open in MATLAB Online

if ydata is m-by-n, then sum-of-squares should be m-by-1.

No.

The objective function minimized by lsqcurvefit is always just,

sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2 ,'all')

In other words, the shape of your data matrices is irrelevant to the calculation of the objective function. However, in your case, because each row of your matrices happens to have its own independent set of parameters (a,b), the minimization is separable, i.e., the optimum (a,b) only depends on the sum-of-square terms from its corresponding row. But this has nothing to do with lsqcurvefit. It is just a feature of the parametric structure of your specific problem. If you had organized your data and parametric model column-wise instead of row-wise, however, you would get the exact same result.

Another important thing to understand with lsqcurvefit is that it does not work with your xdata, ydata, or fun_ND output in 1001x8 matrix form. It always does a preliminary reshape of all those matrices into 8008x1 column vectors (see also here). Similarly, when you supply your own sparse Jacobian calculation (which you must do to get the advantages of this technique) lsqcurvefit will expect to receive an 8008x2002 matrix.

As one consequence of this, I think it will be better for you to organize your xdata and ydata as 8x1001 matrices instead of 1001x8 (and change fun_ND accordingly). That way, your Jacobians will have a simple, sparse, block diagonal form, where each of the diagonal blocks is 8x2.

8 Comments
Show 6 older commentsHide 6 older comments

Xingwang Yong on 14 Jan 2021

Open in MATLAB Online

Thanks, Matt. If you are right, saying sum-of-squares is just a scalar in the vectorized version, then I'll doubt that vectorized ND version would not give the same result as the 1D version.

"the minimization is separable, i.e., the optimum (a,b) only depends on the sum-of-square terms from its corresponding row". I can not agree with this. If sum-of-squares is just a scalar, then (a1,b1) will be influenced from data from 2nd measurement, 3rd measurement... As a result, ND version will give different results than 1D version, i.e. N 1D fitting does not equal to 1 ND fitting.

Here is the code demonstrating this,

clear
numMeasure = 101; % for speed, change 1001 to 101
numPoints = 8;
a = rand(numMeasure,1); 
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_ND = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
opt = optimoptions('lsqcurvefit', 'Display', 'off');
abfit_1D = zeros(numMeasure, 2);
for k = 1:numMeasure
    abfit_1D(k, :) = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),[],[],opt);
end
abfit_ND = lsqcurvefit(fun_ND,x0ND,all_xdata,all_ydata,[],[],opt);
isequal(abfit_1D, abfit_ND)

Although the ND version is definitely seperable analytically, the code above indicates that it is not seperable numerically.

Matt J on 14 Jan 2021

Edited: Matt J on 14 Jan 2021

Open in MATLAB Online

Although the ND version is definitely seperable analytically, the code above indicates that it is not seperable numerically.

Using isequal is not a meaningful test. You should be looking at percent differences. Below, I test all three different methods that we've discussed and find they differ by negligible floating point discrepancies:

numMeasure = 101; % for speed, change 1001 to 101
numPoints = 8;
a = rand(numMeasure,1); 
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
                   'StepTolerance',1e-12,'FunctionTolerance',1e-12);
abfit_1D = zeros(numMeasure, 2);
for k = 1:numMeasure
    abfit_1D(k, :) = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),[],[],opt);
end
abfit_rowwise = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,[],[],opt);
abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',[],[],opt).';
%%Error analysis
percentError=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100
percentError = 7.1245e-08
percentError=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100
percentError = 7.1245e-08
percentError=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100
percentError = 2.2300e-14

Xingwang Yong on 24 May 2021

Open in MATLAB Online

Hi, Matt, I tried the JacobPattern option, which does not give me some speedup. More surprisingly, the vectorized version is much different from that of 1D version. So I rechecked the code you provided above, it turns out the vectorized version is not always same as that of 1D version. I tried 100 times, and 33 out of them did not match.

clear
N = 100;
counter=0;
for iter = 1:N
numMeasure = 101; % for speed, change 1001 to 101
numPoints = 8;
a = rand(numMeasure,1); 
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
                   'StepTolerance',1e-12,'FunctionTolerance',1e-12);
abfit_1D = zeros(numMeasure, 2);
for k = 1:numMeasure
    abfit_1D(k, :) = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),[],[],opt);
end
abfit_rowwise = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,[],[],opt);
abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',[],[],opt).';
%%Error analysis
percentError1=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
percentError2=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
percentError3=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100;
if any([percentError1,percentError2]>100)
    counter = counter + 1;
end
end

Matt J on 24 May 2021

I tried 100 times, and 33 out of them did not match.

You have to check the exitflag and make sure that all 3 versions converged. Cases where convergence did not occur should not be counted.

Xingwang Yong on 25 May 2021

Edited: Xingwang Yong on 25 May 2021

Open in MATLAB Online

Matt, you are right. I add bound constraints as John D'Errico suggested in his optimization tips (Chapter 17), then all 3 versions converged and match with each other.

clear
N = 100;
counter=0;
for iter = 1:N
    %% 
    disp([num2str(iter), ':']);
    numMeasure = 101; % number of curves, for speed, change 1001 to 101
    numPoints = 8; % number of data points on each curve
    numParams = 2; % number of unknown parameters
    
    a = rand(numMeasure,1);
    b = rand(numMeasure, 1);
    all_xdata = ones(numMeasure, 1) * (1:numPoints);
    all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
    
    x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
    lb = [0,0];
    ub = [1,1];
    LB = repmat(lb,numMeasure,1);
    UB = repmat(ub,numMeasure,1);
    fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
    fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
    fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
    
    opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
        'StepTolerance',1e-12,'FunctionTolerance',1e-12);
    abfit_1D = zeros(numMeasure, 2);
    exit_flags_1D = zeros(numMeasure, 1);
    output_1D = cell(numMeasure, 1);
    
    opt1 = opt; % jacob pattern for row-wise
    opt1.JacobPattern = repmat(speye(numMeasure), numPoints, numParams);
    
    opt2 = opt; % jacob pattern for col-wise
    tmpCell = cell(numMeasure,numMeasure);
    tmpCell(:) = {sparse(zeros(numPoints,numParams))};
    for tmpIter = 1:numMeasure
        tmpCell{tmpIter,tmpIter} = sparse(ones(numPoints, numParams));
    end
    jp = cell2mat(tmpCell);
    opt2.JacobPattern = jp;
    
    %%
    % 1D
    tic
    for k = 1:numMeasure
        [abfit_1D(k, :),~,~,exit_flags_1D(k),output_1D{k}] = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),lb,ub,opt);
    end
    toc
    
    % row wise
    tic
    [abfit_rowwise,~,~,exit_flag1,output1] = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,LB,UB,opt1);
    toc
    % col wise
    tic
    abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',LB.',UB.',opt2).';
    toc
    [~,~,~,exit_flag2,output2] = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',LB.',UB.',opt2);
    
    % Error analysis
    percentError1=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
    percentError2=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
    percentError3=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100;
    
    if any([percentError1,percentError2]>100)
        counter = counter + 1;
    end
end
counter

Sign in to comment.

How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

1 Comment
Show -1 older commentsHide -1 older comments

Accepted Answer

8 Comments
Show 6 older commentsHide 6 older comments

More Answers (0)

See Also

Categories

Tags

Products

Community Treasure Hunt

How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

1 Comment Show -1 older commentsHide -1 older comments

Accepted Answer

8 Comments Show 6 older commentsHide 6 older comments

More Answers (0)

See Also

Categories

Tags

Products

Community Treasure Hunt

1 Comment
Show -1 older commentsHide -1 older comments

8 Comments
Show 6 older commentsHide 6 older comments