Histogram to a CDF/PDF

Sclay748 on 24 Aug 2020
Hello, This is a screenshot of a table I have constructed for work.
Just to play it safe, I blacked out the column names, though it would be hard to infer anything from just 7 rows of the table. We will call the 5 fields "column1", "column2", etc.
So I am able to create the histogram of any of the columns besides column 3, but that one isn't needed because it is all '94'.
I do:
histogram([a(1:135756).column1])
and the histogram works perfectly.
How would I do a CDF or PDF of this data?
I have tried:
histogram([a(1:135756).column1],'Normalization',pdf)
or
histogram([a(1:135756).column1],'Normalization',cdf)
but nothing changes from the original histogram.
Thank you!
Adam Danz on 24 Aug 2020

Bruno Luong on 24 Aug 2020
Edited: Bruno Luong on 24 Aug 2020
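A = [a(1:135756).column1];   % collect column1 from the struct array into one numeric vector
figure
subplot(2,1,1);
histogram(A,'Normalization','pdf');   % note the quotes around 'pdf'
ylabel('pdf');
subplot(2,1,2);
histogram(A,'Normalization','cdf');   % note the quotes around 'cdf'
ylabel('cdf');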
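A quick numerical check of what the two normalizations mean, as a sketch only. It uses synthetic data in place of the table (an assumption) and histcounts, which accepts the same 'Normalization' options as histogram. With 'pdf' the bar areas should sum to 1, and with 'cdf' the last bar should reach 1.

```matlab
% Synthetic stand-in for [a(1:135756).column1] (assumption, not the real data)
rng(1);
A = randn(1, 10000);

% Same normalization options as histogram, but returning numbers instead of a plot
[p, edges] = histcounts(A, 'Normalization', 'pdf');
c = histcounts(A, edges, 'Normalization', 'cdf');

pdfArea = sum(p .* diff(edges));   % height times bin width, summed over all bins
lastCdfBar = c(end);               % cumulative fraction at the final bin

fprintf('pdf area = %.4f, last cdf bar = %.4f\n', pdfArea, lastCdfBar);
```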
Sclay748 on 24 Aug 2020
ahhhh beautiful.
Thank you!

Alan Stevens on 24 Aug 2020
You can get a CDF as follows:
% Modified Kaplan-Meier CDF
% assumes each point is representative of 1/N of the population.
a = sort(a(:,1)); % sort the data in ascending order (assumes a is a numeric array)
N = length(a);
CDF = zeros(N,1); % preallocate so the loop does not grow the array
for k = 1:N
    CDF(k) = (k - 0.5)/N;
end
plot(a,CDF)
Because you have a large number of points you could simply numerically differentiate the CDF to get a PDF.
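A minimal sketch of that differentiation step, under some assumptions: the data here are synthetic (not the table from the question), and the CDF is first resampled onto a coarse grid of 50 points, because differencing between every pair of neighbouring samples is very noisy and divides by zero when values repeat. The grid size is an arbitrary choice for illustration.

```matlab
% Synthetic stand-in data (assumption)
rng(0);
x = sort(randn(1, 5000));
N = numel(x);
CDF = (0.5:N-0.5) / N;

% interp1 needs distinct sample points, so keep the last CDF value of any repeats
[xu, iu] = unique(x, 'last');

% Resample the CDF on a coarse, evenly spaced grid spanning the data
xg = linspace(x(1), x(end), 50);
Fg = interp1(xu, CDF(iu), xg);

% Finite-difference slope of the CDF, reported at the interval midpoints
pdfEst = diff(Fg) ./ diff(xg);
xm = (xg(1:end-1) + xg(2:end)) / 2;

plot(xm, pdfEst);
ylabel('pdf (estimated)');
```

A finer grid follows the data more closely but gives a noisier estimate, so the grid size plays the same role as the bin width in a histogram.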
Bruno Luong on 24 Aug 2020
Hmm it cries for replacing the for-loop
a1 = sort([a(1:135756).column1]);
N = length(a1);
CDF = (0.5:N-0.5) / N;
plot(a1, CDF);
Sclay748 on 24 Aug 2020
Edited: Sclay748 on 24 Aug 2020
Bruno, that worked. Is it supposed to be a single curved line?
I thought it would still plot the bars, but arranged by CDF. Or at least that is how my boss's plot turned out when he showed me an example using histogram(......,'Normalization',cdf)
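For comparison, a sketch that overlays the two views on synthetic data (an assumption, standing in for the real column). histogram with 'Normalization','cdf' draws one bar per bin, while plot on the sorted values draws a line through one point per sample, so the line is expected to track the tops of the bars.

```matlab
% Synthetic stand-in data (assumption)
rng(2);
A = randn(1, 2000);

a1 = sort(A);
N = numel(a1);
CDF = (0.5:N-0.5) / N;

figure
h = histogram(A, 'Normalization', 'cdf');   % bars: cumulative fraction per bin
hold on
plot(a1, CDF, 'LineWidth', 1.5);            % line: one point per sample
hold off
ylabel('cdf');
legend('histogram bars', 'sorted-data line', 'Location', 'northwest');

% Each bar height is the fraction of samples up to that bin's right edge
rightEdges = h.BinEdges(2:end);
barHeights = h.Values;
```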

