Sorting in linear descending order

Let's say I have a matrix (9x5).
I want just one value from each column,
chosen so that the selected values form a descending line (the selection that comes closest to forming a line).
It's OK if it is not a perfect line, just the closest one.
Thank you.

Answers (2)

Hello,
I generated some dummy data, fitted a straight line to the column means, then searched each column for the data point closest to the fitted line.
Those points are plotted with the black diamond marker.
Hope it helps.
clc
clearvars
% Goal: from a 9x5 matrix, pick one value per column so that the
% selected values come as close as possible to a descending line.
data = zeros(9,5); % preallocate
for ci = 1:5
    data(:,ci) = 6-ci + randn(9,1); % descending trend plus noise
end
x = 1:5;
y = mean(data,1);
% Fit a polynomial p of degree "degree" to the (x,y) data:
degree = 1;
p = polyfit(x,y,degree);
% Evaluate the fitted polynomial p and plot:
f = polyval(p,x);
eqn = poly_equation(flip(p)); % polynomial equation (string), used as the plot title
Rsquared = my_Rsquared_coeff(y,f); % coefficient of determination (R^2) of the fit; informational only
% find the data point nearest to the fit curve in each column
for ci = 1:5
    err = abs(data(:,ci) - f(ci));
    [val(ci),ind(ci)] = min(err);
    data_selected(ci) = data(ind(ci),ci);
end
figure(1);plot(x,data,'+',x,f,'-',x,data_selected,'dk', 'MarkerSize', 14)
title(eqn)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Rsquared = my_Rsquared_coeff(data,data_fit)
    % Coefficient of determination (R^2) computation
    % The total sum of squares
    sum_of_squares = sum((data-mean(data)).^2);
    % The sum of squares of residuals, also called the residual sum of squares:
    sum_of_squares_of_residuals = sum((data-data_fit).^2);
    % Definition of the coefficient of determination:
    Rsquared = 1 - sum_of_squares_of_residuals/sum_of_squares;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function eqn = poly_equation(a_hat)
    % Build the equation string from coefficients given in ascending power order
    eqn = " y = "+a_hat(1);
    for i = 2:(length(a_hat))
        if sign(a_hat(i))>0
            str = " + ";
        else
            str = " "; % a negative coefficient already carries its minus sign
        end
        if i == 2
            eqn = eqn+str+a_hat(i)+"*x";
        else
            eqn = eqn+str+a_hat(i)+"*x^"+(i-1)+" ";
        end
    end
    eqn = eqn+" ";
end
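
The per-column search above can also be written without a loop. The following is only a sketch under assumptions: the small 3x5 matrix is made-up example data (not the random data from the script), and `repmat` is used instead of implicit expansion so that it also runs on older releases.

```matlab
% Vectorized nearest-to-fit selection (sketch with hypothetical example data)
data = [5.3 4.1 2.7 2.4 0.6;
        4.8 3.6 3.1 1.9 1.2;
        6.1 4.4 3.4 1.1 0.9];                     % 3x5 example matrix
x = 1:size(data,2);
p = polyfit(x, mean(data,1), 1);                  % line through the column means
f = polyval(p, x);                                % fitted value for each column
err = abs(data - repmat(f, size(data,1), 1));     % distance of every entry to the fit
[~, ind] = min(err, [], 1);                       % row index of the closest entry per column
data_selected = data(sub2ind(size(data), ind, x)); % one value per column
```

`min(err, [], 1)` works down each column, and `sub2ind` turns the (row, column) pairs into linear indices so that all selected values are read in one step.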

4 Comments

Yes, it's what I am looking for. Thank you.
Can we do the same process for something other than a line? If we have a certain function and we want the closest data?
Hello,
yes, you can do the same with whatever kind of fitting equation you want to use.
Here I created some data following a parabola, fitted a second-order polynomial, and looked for the nearest data (still the black diamonds).
clc
clearvars
data = zeros(9,5); % preallocate
for ci = 1:5
    data(:,ci) = 40-ci^2 + 3*randn(9,1); % parabolic trend plus noise
end
x = 1:5;
y = mean(data,1);
% Fit a polynomial p of degree "degree" to the (x,y) data:
degree = 2;
p = polyfit(x,y,degree);
% Evaluate the fitted polynomial p and plot:
f = polyval(p,x);
eqn = poly_equation(flip(p)); % polynomial equation (string), used as the plot title
Rsquared = my_Rsquared_coeff(y,f); % coefficient of determination (R^2) of the fit; informational only
% find the data point nearest to the fit curve in each column
for ci = 1:5
    err = abs(data(:,ci) - f(ci));
    [val(ci),ind(ci)] = min(err);
    data_selected(ci) = data(ind(ci),ci);
end
figure(1);plot(x,data,'+',x,f,'-',x,data_selected,'dk', 'MarkerSize', 14)
title(eqn)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Rsquared = my_Rsquared_coeff(data,data_fit)
    % Coefficient of determination (R^2) computation
    % The total sum of squares
    sum_of_squares = sum((data-mean(data)).^2);
    % The sum of squares of residuals, also called the residual sum of squares:
    sum_of_squares_of_residuals = sum((data-data_fit).^2);
    % Definition of the coefficient of determination:
    Rsquared = 1 - sum_of_squares_of_residuals/sum_of_squares;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function eqn = poly_equation(a_hat)
    % Build the equation string from coefficients given in ascending power order
    eqn = " y = "+a_hat(1);
    for i = 2:(length(a_hat))
        if sign(a_hat(i))>0
            str = " + ";
        else
            str = " "; % a negative coefficient already carries its minus sign
        end
        if i == 2
            eqn = eqn+str+a_hat(i)+"*x";
        else
            eqn = eqn+str+a_hat(i)+"*x^"+(i-1)+" ";
        end
    end
    eqn = eqn+" ";
end
This is not guaranteed to find the best fit. In a contrived example where there exists data that is a perfect fit (just far from the mean), this method will not pick that up.
If something like that is likely to exist, you would need to write a fitting function whose cost function depends on the closest point in each column. You would need to vary your parameters over a grid to avoid local minima (use only the min and only the max of the data to find the bounds).
If you need that I can have a look if I can write something for you.
Yes, go ahead, Rik.
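
A minimal sketch of the grid-search idea described above, not Rik's actual code. Everything in it is an assumption made for illustration: the contrived 4x5 matrix, the choice to parametrize the line by its values `y1` and `yN` at the first and last column, and the grid step of 1. The cost of a candidate line is the sum, over columns, of the squared distance to the closest entry in that column.

```matlab
% Contrived data: row 1 lies exactly on y = 6 - x, while the other
% entries pull the column means far away from that line.
x = 1:5;
data = [ 5  4  3  2  1;
        21 24 29 36 45;
        39 36 31 24 15;
        30 12 27  9 33];
% Grid bounds taken from the extreme values of the data
lo = min(data(:));
hi = max(data(:));
vals = lo:1:hi;                         % grid step of 1 is an assumption
best = inf;
for y1 = vals                           % line value at the first column
    for yN = vals(vals <= y1)           % line value at the last column (descending or flat)
        f = y1 + (yN - y1) * (x - x(1)) / (x(end) - x(1));
        d = (data - repmat(f, size(data,1), 1)).^2;
        [dmin, ind] = min(d, [], 1);    % closest entry in each column
        c = sum(dmin);                  % cost of this candidate line
        if c < best
            best = c;
            best_ind = ind;
            best_f = f;
        end
    end
end
data_selected = data(sub2ind(size(data), best_ind, x));
```

Because every grid point is evaluated, the search cannot get stuck in a local minimum, at the price of a cost that grows with the square of the grid resolution. A finer grid, or a local optimizer started from the best grid point, would be a natural refinement for real data.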


james sinos on 27 Oct 2021
Thank you, Mathieu.
Actually, my problem is more complicated.
I made this picture to explain it.
I want to extract from my data:
  • each 'x' is related to only one 'y'
  • I want to select only one value of x from each column to form a line (the x values that come nearest to a descending line).
-------> BUT on the condition that the y values (each y is related to an x) also form the most descending shape possible.

7 Comments

Hello James,
do you have some data to share?
Thanks.
In the meantime I created this example with X and Y data.
See if it works on your data.
clc
clearvars
dataX = zeros(9,5); % preallocate
dataY = zeros(9,5);
for ci = 1:5
    dataX(:,ci) = (6-ci) + 2*randn(9,1);
    dataY(:,ci) = 5*dataX(:,ci) + randn(9,1); % each y is tied to its x
end
x = 1:5;
xm = mean(dataX,1);
ym = mean(dataY,1);
% Fit a polynomial p of degree "degree" to the (x,y) data:
degree = 1;
px = polyfit(x,xm,degree);
py = polyfit(x,ym,degree);
% Evaluate the fitted polynomial p and plot:
fx = polyval(px,x);
eqnX = poly_equation(flip(px)); % polynomial equation (string)
RsquaredX = my_Rsquared_coeff(xm,fx); % coefficient of determination (R^2); informational only
fy = polyval(py,x);
eqnY = poly_equation(flip(py)); % polynomial equation (string)
RsquaredY = my_Rsquared_coeff(ym,fy); % coefficient of determination (R^2); informational only
% find data nearest to both fit curves
for ci = 1:5
    err = (dataX(:,ci) - fx(ci)).^2 + (dataY(:,ci) - fy(ci)).^2; % squared Euclidean distance (error) to the fit functions
    [val(ci),ind(ci)] = min(err);
    dataX_selected(ci) = dataX(ind(ci),ci);
    dataY_selected(ci) = dataY(ind(ci),ci);
end
figure(1);
subplot(211),plot(x,dataX,'+',x,fx,'--',x,dataX_selected,'dk', 'MarkerSize', 20)
xlabel(eqnX)
ylabel('X data');
subplot(212),plot(x,dataY,'+',x,fy,'--',x,dataY_selected,'dk', 'MarkerSize', 20)
xlabel(eqnY)
ylabel('Y data');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Rsquared = my_Rsquared_coeff(data,data_fit)
    % Coefficient of determination (R^2) computation
    % The total sum of squares
    sum_of_squares = sum((data-mean(data)).^2);
    % The sum of squares of residuals, also called the residual sum of squares:
    sum_of_squares_of_residuals = sum((data-data_fit).^2);
    % Definition of the coefficient of determination:
    Rsquared = 1 - sum_of_squares_of_residuals/sum_of_squares;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function eqn = poly_equation(a_hat)
    % Build the equation string from coefficients given in ascending power order
    eqn = " y = "+a_hat(1);
    for i = 2:(length(a_hat))
        if sign(a_hat(i))>0
            str = " + ";
        else
            str = " "; % a negative coefficient already carries its minus sign
        end
        if i == 2
            eqn = eqn+str+a_hat(i)+"*x";
        else
            eqn = eqn+str+a_hat(i)+"*x^"+(i-1)+" ";
        end
    end
    eqn = eqn+" ";
end
Thanks, Mathieu.
I did not see where you used 'Rsquared'. Why did you calculate it?
Hello,
well, if you don't need it, you can remove that portion of the code.
All the best.
Thanks, Mathieu.
If I already have the equation of my line and I want the data nearest to it, will the process be the same?
Hello James,
yes, of course.
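
As a rough illustration of that case, the fitting step can simply be skipped and the known equation evaluated directly. In this sketch the function handle `known_fun`, its coefficients, and the small data matrix are all hypothetical stand-ins; any function of `x` could be substituted.

```matlab
% Nearest data to a known curve (sketch; line and data are made-up examples)
data = [5.3 4.1 2.7 2.4 0.6;
        4.8 3.6 3.1 1.9 1.2;
        6.1 4.4 3.4 1.1 0.9];           % 3x5 example matrix
x = 1:size(data,2);
known_fun = @(x) 6 - x;                 % hypothetical known equation of the line
f = known_fun(x);                       % target value for each column
data_selected = zeros(1, numel(x));     % preallocate
for ci = 1:numel(x)
    [~, ind] = min(abs(data(:,ci) - f(ci)));   % closest entry in column ci
    data_selected(ci) = data(ind, ci);
end
```

The selection loop is the same as in the scripts above; only the source of `f` changes, from `polyval` on fitted coefficients to a direct evaluation of the known function.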
Hello,
problem solved?


Release

R2017b

Asked:

on 26 Oct 2021

Commented:

on 5 Nov 2021
