Redistribution of histogram type data in specified bins

I am trying to redistribute data with a step of 4 into a new binning of step 3, as pictured below with input X = [4,16,8] and desired output Y = [3,9,10,6]. What is the most efficient way to do so? Until now I have been using random drawings but it takes too long for my current needs.
  2 Comments
KALYAN ACHARJYA on 25 Sep 2020
Going from X = [4,16,8] to the desired output Y = [3,9,10,6], is there any redistribution logic? Should the third term be 12?


Answers (2)

Rik on 25 Sep 2020
You should be really careful with this kind of resampling, especially with so few samples.
Since you're assuming a flat distribution within each bar, why not treat your histogram as a probability density function? Then you can use the area under the curve to calculate the new heights.
  2 Comments
Arnaud Samie on 25 Sep 2020
My "real" data has a lot more bins, and the desired new binning differs only slightly from the original, so it should be quite "safe". It is indeed a probability distribution, so I will go with your solution. I assume you are thinking of using interp1?
Rik on 25 Sep 2020
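Yes, that is the idea in broad strokes. See the code below for a rough sketch. You can probably do a lot to optimize it.
% Example histogram (the question's [4 16 8] scaled down), normalized to sum to 1
X=[1 4 2];X=X/sum(X);
N=numel(X);
x_center=get_bin_pos(N);
% Sample the histogram as a piecewise-constant curve on [0,1]
xx=linspace(0,1,1000);
yy=interp1(x_center,X,xx,'nearest','extrap');
figure(1),clf(1)
plot(xx,yy)
% Cumulative area under the curve
tmp=cumtrapz(xx,yy);
% New binning: N equal-width bins on the same range
N=4;
[x_center,x_right]=get_bin_pos(N);
Y=zeros(1,N);
for n=1:N
    x=x_right(n);
    % Cumulative area up to the right edge of bin n
    Y(n)=tmp(find(xx>=x,1,'first'));
    if n>1
        % Subtract the area already assigned to the earlier bins
        Y(n)=Y(n)-sum(Y(1:(n-1)));
    end
end
Y=Y/sum(Y);
% Overlay the new bin heights at the new bin centers
hold on
plot(x_center,Y,'*')
hold off
% Overlay the new histogram as a piecewise-constant curve
yy2=interp1(x_center,Y,xx,'nearest','extrap');
hold on
plot(xx,yy2)
hold off
axis([0 1 0 1])
% Local function (in a script file, keep it at the end of the file)
function [x_center,x_right]=get_bin_pos(N)
    % Split [0,1] into N equal bins and return their centers and right edges
    x=linspace(0,1,2*N+1);
    x_center=x(2:2:end);%use center to interpolate the histogram
    x_right=x(3:2:end);%integrate up to right edge to find bin count
end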
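Under the same flat-within-each-bin assumption, the area under the curve can also be computed without sampling the curve: the cumulative count is piecewise linear between the old bin edges, so it can be interpolated at the new edges and differenced. The following is only a minimal sketch of that idea, not the code from this thread. The edge vectors 0:4:12 and 0:3:12 and the variable names are assumptions, since the question only states a step of 4 and a step of 3.

```matlab
% Assumed bin edges: the question only gives the step sizes (4 and 3)
oldEdges = 0:4:12;
newEdges = 0:3:12;
X = [4 16 8];                      % counts in the old bins

% Cumulative count at each old edge; with flat bins it is linear in between
cumOld = [0 cumsum(X)];

% Evaluate the cumulative count at the new edges, then difference it
cumNew = interp1(oldEdges, cumOld, newEdges, 'linear');
Y = diff(cumNew);
disp(Y)
```

For the numbers in the question this should reproduce Y = [3 9 10 6], up to floating-point rounding. Note that the new edges have to lie within the range of the old edges, because interp1 returns NaN outside that range unless extrapolation is requested.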



Steven Lord on 25 Sep 2020
>> x = randi(12, 1, 1000);
>> h = histogram(x, 0:4:12);
% Look at the histogram before running the next line of code
>> h.BinEdges = 0:3:12;
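Changing BinEdges makes the histogram object recount its underlying data, so this route applies when the raw samples are still available rather than only the bin counts. A minimal sketch of the same re-binning without a plot, using histcounts, might look like the following. The variable names are assumptions chosen for illustration.

```matlab
% Raw samples, as in the snippet above (integers from 1 to 12)
x = randi(12, 1, 1000);

% Count the same samples with the old and the new bin edges
countsOld = histcounts(x, 0:4:12);   % 3 bins of width 4
countsNew = histcounts(x, 0:3:12);   % 4 bins of width 3

% Both binnings cover every sample, so the totals agree
disp(sum(countsOld) == sum(countsNew))
```

Because the counts come directly from the samples, no assumption about the distribution inside each bin is needed here.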
