How do I generate a PDF from some known percentile values?

Question

Christopher Stokes on 12 Oct 2021

https://se.mathworks.com/matlabcentral/answers/1562016-how-do-i-generate-a-pdf-from-some-known-percentile-values

Commented: Star Strider on 13 Oct 2021

Hi Matlab community,

I have percentile values that describe a distribution of possible sea level rise magnitudes. I would like to be able to generate a probability density function that closely approximates the actual distribution from which the percentile values were generated. Can anyone suggest how to achieve this please?

Example data:

prctiles = [5 10 30 33 50 67 70 90 95];

SLR = [3.2760 3.5265 4.1286 4.2013 4.5566 4.9151 4.9836 5.6045 5.9105];


Answer 1

Star Strider on 12 Oct 2021

https://se.mathworks.com/matlabcentral/answers/1562016-how-do-i-generate-a-pdf-from-some-known-percentile-values#answer_806996

A slightly different approach: fit the mean and standard deviation of a normal CDF to the percentile values with fminsearch.

prctiles = [5 10 30 33 50 67 70 90 95];

SLR = [3.2760 3.5265 4.1286 4.2013 4.5566 4.9151 4.9836 5.6045 5.9105];

B = fminsearch(@(b) norm(prctiles/100 - cdf('Normal',SLR,b(1),b(2))), [SLR(prctiles==50);rand])

B = 2×1

    4.5589
    0.8095

SLRv = linspace(min(SLR), max(SLR));

yfit = cdf('Normal', SLRv, B(1), B(2));

figure

plot(SLR, prctiles/100, 'p')

hold on

plot(SLRv, yfit, '-r')

hold off

grid

title(sprintf('$p = N(%.2f, %.3f)$',B), 'Interpreter','latex')

legend('SLR','Fitted Normal Distribution', 'Location','NW')

Experiment to get different results.
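The fit above gives a CDF, whereas the question asks for a probability density. Here is a minimal sketch of that last step, assuming the Statistics and Machine Learning Toolbox is available and that a normal family is adequate. The starting value of 1 for sigma (instead of rand) is my own choice for reproducibility, and the names x, y, SLRfit and maxQuantileError are illustrative only.

```matlab
% Percentile data from the question.
prctiles = [5 10 30 33 50 67 70 90 95];
SLR = [3.2760 3.5265 4.1286 4.2013 4.5566 4.9151 4.9836 5.6045 5.9105];

% Fit mu and sigma of a normal CDF to the percentile points, as above,
% but with a fixed starting guess (assumption: sigma = 1) instead of rand.
objective = @(b) norm(prctiles/100 - cdf('Normal', SLR, b(1), b(2)));
B = fminsearch(objective, [SLR(prctiles==50); 1]);

% Evaluate the fitted density on a grid that extends into the tails,
% beyond the 5th and 95th percentiles that were supplied.
x = linspace(B(1) - 4*B(2), B(1) + 4*B(2), 400);
y = pdf('Normal', x, B(1), B(2));

% Check the fit in SLR units: quantiles implied by the fitted
% distribution versus the supplied percentile values.
SLRfit = icdf('Normal', prctiles/100, B(1), B(2));
maxQuantileError = max(abs(SLRfit - SLR));

figure
plot(x, y, '-r')
grid
xlabel('SLR')
ylabel('Probability density')
title(sprintf('Fitted normal PDF, max quantile error = %.3f', maxQuantileError))
```

Comparing quantiles as well as CDF values is useful because a small error in probability can correspond to a larger error in SLR where the density is low, i.e. in the tails.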


Star Strider on 13 Oct 2021

As always, my pleasure!

The only way I can think of is to test the data against all appropriate distributions. In my trials, the Generalised Extreme Value distribution provided the best fit and fitted the tails appropriately. (I added the fitnlm calls to get statistics on the parameters and the fit; using only fitnlm would also be appropriate. The fminsearch function can be more robust in estimating the parameters, since it is derivative-free and doesn't use gradient descent, so I usually start with it for well-posed optimisations such as this.)

SLR = [7.4280 8.2520 10.1210 10.3470 11.6500 13.0120 13.2370 16.3701 18.5500];

prctiles = [5 10 30 33 50 67 70 90 95];

B = fminsearch(@(b) norm(prctiles/100 - cdf('Generalized Extreme Value',SLR,b(1),b(2),b(3))), [0; rand; SLR(prctiles==50)])

B = 3×1

   -0.0557
    2.7225
   10.6097

GEVmdl = fitnlm(SLR(:), prctiles(:)/100, @(b,x)cdf('Generalized Extreme Value',x,b(1),b(2),b(3)), B)

GEVmdl =

Nonlinear regression model:
    y ~ F(b,x)

Estimated Coefficients:
          Estimate        SE        tStat       pValue
          _________    ________    _______    __________
    b1    -0.055765    0.023704    -2.3525      0.056861
    b2       2.7225    0.043684     62.323    1.1473e-09
    b3        10.61    0.027091     391.64    1.8705e-14

Number of observations: 9, Error degrees of freedom: 6
Root Mean Squared Error: 0.00689
R-Squared: 1, Adjusted R-Squared 1
F-statistic vs. zero model: 2.19e+04, p-value = 1.68e-12

SLRv = linspace(min(SLR), max(SLR));

yfit = cdf('Generalized Extreme Value', SLRv, B(1), B(2), B(3));

figure

plot(SLR, prctiles/100, 'p')

hold on

plot(SLRv, yfit, '-r')

hold off

grid

title(sprintf('$p = GEV(%.4f, %.3f, %.3f)$',B), 'Interpreter','latex')

legend('SLR','Fitted Generalized Extreme Value Distribution', 'Location','NW')
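The plot above shows the fitted CDF. To obtain the density the question asks for, the same three parameters can be passed to pdf. Below is a minimal sketch, assuming the Statistics and Machine Learning Toolbox and using the rounded parameter values reported above (order: shape, scale, location, as in the cdf call). The grid limits at the 0.1th and 99.9th percentiles are an arbitrary choice of mine, and the variable names are illustrative.

```matlab
% Data and rounded GEV parameters from the fit above:
% B(1) = shape k, B(2) = scale sigma, B(3) = location mu.
SLR = [7.4280 8.2520 10.1210 10.3470 11.6500 13.0120 13.2370 16.3701 18.5500];
prctiles = [5 10 30 33 50 67 70 90 95];
B = [-0.0557; 2.7225; 10.6097];

% Grid covering the 0.1th to 99.9th percentile of the fitted distribution
% (assumed limits, chosen only so that both tails are visible).
xLimits = icdf('Generalized Extreme Value', [0.001 0.999], B(1), B(2), B(3));
x = linspace(xLimits(1), xLimits(2), 500);

% Fitted probability density on the grid and at the supplied SLR values.
y = pdf('Generalized Extreme Value', x, B(1), B(2), B(3));
ySLR = pdf('Generalized Extreme Value', SLR, B(1), B(2), B(3));

% Quantiles implied by the fit, for comparison with SLR in the same units.
SLRfit = icdf('Generalized Extreme Value', prctiles/100, B(1), B(2), B(3));

figure
plot(x, y, '-r')
hold on
plot(SLR, ySLR, 'p')
hold off
grid
xlabel('SLR')
ylabel('Probability density')
legend('Fitted GEV PDF', 'Supplied percentile values', 'Location', 'NE')
```

Inspecting SLRfit - SLR shows how far the fitted quantiles are from the supplied ones, which complements the R-squared and RMSE values that fitnlm reports on the probability scale.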

Testing this with the earlier data for comparison —

prctiles = [5 10 30 33 50 67 70 90 95];

SLR = [3.2760 3.5265 4.1286 4.2013 4.5566 4.9151 4.9836 5.6045 5.9105];

B = fminsearch(@(b) norm(prctiles/100 - cdf('Generalized Extreme Value',SLR,b(1),b(2),b(3))), [0; rand; SLR(prctiles==50)])

B = 3×1

   -0.2704
    0.7888
    4.2785

GEVmdl = fitnlm(SLR(:), prctiles(:)/100, @(b,x)cdf('Generalized Extreme Value',x,b(1),b(2),b(3)), B)

GEVmdl =

Nonlinear regression model:
    y ~ F(b,x)

Estimated Coefficients:
          Estimate        SE         tStat       pValue
          ________    _________    _______    __________
    b1    -0.27049    0.0075964    -35.607     3.271e-08
    b2     0.78886    0.0043945     179.51    2.0164e-12
    b3      4.2785    0.0026176     1634.5    3.5397e-18

Number of observations: 9, Error degrees of freedom: 6
Root Mean Squared Error: 0.00235
R-Squared: 1, Adjusted R-Squared 1
F-statistic vs. zero model: 1.89e+05, p-value = 2.61e-15

SLRv = linspace(min(SLR), max(SLR));

yfit = cdf('Generalized Extreme Value', SLRv, B(1), B(2), B(3));

figure

plot(SLR, prctiles/100, 'p')

hold on

plot(SLRv, yfit, '-r')

hold off

grid

title(sprintf('$p = GEV(%.4f, %.3f, %.3f)$',B), 'Interpreter','latex')

legend('SLR','Fitted Generalized Extreme Value Distribution', 'Location','NW')

The Generalised Extreme Value distribution would appear to work well for all the data, if the provided examples are representative of them.
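The idea of testing the percentile values against several distributions can be sketched as a loop over candidate families. This is only an illustration, assuming the Statistics and Machine Learning Toolbox: the three candidates, their starting values and the variable names are my own choices, not the set actually tried above.

```matlab
% Percentile data from the question.
prctiles = [5 10 30 33 50 67 70 90 95];
SLR = [3.2760 3.5265 4.1286 4.2013 4.5566 4.9151 4.9836 5.6045 5.9105];
p = prctiles/100;
med = SLR(prctiles==50);

% Candidate families (assumed list): name, CDF as a function of the
% parameter vector b, and a fixed starting guess for fminsearch.
names  = {'Normal', 'Lognormal', 'Generalized Extreme Value'};
cdfs   = {@(b,x) cdf('Normal', x, b(1), b(2)), ...
          @(b,x) cdf('Lognormal', x, b(1), b(2)), ...
          @(b,x) cdf('Generalized Extreme Value', x, b(1), b(2), b(3))};
starts = {[med; 1], [log(med); 0.5], [0; 1; med]};

% Fit each family to the percentile points and record the residual norm.
resid = zeros(numel(names), 1);
params = cell(numel(names), 1);
for i = 1:numel(names)
    F = cdfs{i};
    [params{i}, resid(i)] = fminsearch(@(b) norm(p - F(b, SLR)), starts{i});
    fprintf('%-28s residual norm = %.4f\n', names{i}, resid(i));
end

% Family with the smallest residual norm on these nine points.
[~, best] = min(resid);
fprintf('Smallest residual: %s\n', names{best});
```

A family with more parameters will tend to give a smaller residual on nine points, so the residual norm alone is not proof of a better model. It is worth checking the tails on a plot as well, as was done above.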


Christopher Stokes on 13 Oct 2021

That's a great solution, thank you! Really appreciate your help and great to see how these optimisation functions can be applied.

Star Strider on 13 Oct 2021

As always, my pleasure!

It was an education for me as well!


Answer 2

Image Analyst on 12 Oct 2021

https://se.mathworks.com/matlabcentral/answers/1562016-how-do-i-generate-a-pdf-from-some-known-percentile-values#answer_806891

Have you seen fitdist() in the Statistics and Machine Learning Toolbox?

Of course you'd be better off with much more data.


Image Analyst on 12 Oct 2021

Edited: Image Analyst on 12 Oct 2021

Here's an example:

clc;    % Clear the command window.
fprintf('Beginning to run %s.m ...\n', mfilename);
close all;  % Close all figures (except those of imtool.)
clear;  % Erase all existing variables. Or clearvars if you want.
workspace;  % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 17;
SLR = [3.2760    3.5265    4.1286    4.2013    4.5566    4.9151    4.9836    5.6045    5.9105];
% Plot data.
subplot(2, 1, 1);
bar(SLR)
grid on;
title('Original SLR Data', 'FontSize', fontSize);
xlabel('Index', 'FontSize', fontSize);
ylabel('SLR Value', 'FontSize', fontSize);
% Get distribution.
d = fitdist(SLR(:), 'Normal')
% Make curve, plot distribution.
% https://en.wikipedia.org/wiki/Normal_distribution
x = linspace(min(SLR), max(SLR), 1000);
amp = 1 / (d.sigma * sqrt(2*pi));
y = amp * exp(-(1/2) * ((x - d.mu) / d.sigma) .^ 2);
subplot(2, 1, 2);
plot(x, y, 'b-', 'LineWidth', 2);
grid on;
title('Estimated Distribution of SLR', 'FontSize', fontSize);
xlabel('SLR', 'FontSize', fontSize);
ylabel('PDF', 'FontSize', fontSize);

Pick the distribution that matches the theory of what the distribution should actually be. Ideally you know this in advance, and you need to if you are going to fit a parametric model. Otherwise, just normalize your histogram and use that as an empirical estimate of the PDF.
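When only percentile values are available rather than samples, there is no histogram to normalize. One distribution-free analogue, offered here as a rough sketch and not as a method from this thread, is to interpolate the known CDF points with a shape-preserving interpolant and differentiate numerically. It uses only base MATLAB functions (pchip, gradient, trapz); the grid size and variable names are arbitrary choices of mine.

```matlab
% Percentile data from the question.
prctiles = [5 10 30 33 50 67 70 90 95];
SLR = [3.2760 3.5265 4.1286 4.2013 4.5566 4.9151 4.9836 5.6045 5.9105];
p = prctiles/100;

% Shape-preserving (monotone) interpolation of the CDF between the
% 5th and 95th percentiles. No extrapolation into the tails is attempted.
x = linspace(min(SLR), max(SLR), 200);
F = pchip(SLR, p, x);

% The numerical derivative of the interpolated CDF approximates the PDF.
f = gradient(F, x);

% The area under f is the probability between the 5th and 95th
% percentiles, so it should be close to 0.90 and not 1.
areaCovered = trapz(x, f);

figure
plot(x, f, '-b')
grid
xlabel('SLR')
ylabel('Approximate PDF')
title(sprintf('Interpolated density, area = %.2f', areaCovered))
```

The result depends heavily on the interpolant and can look uneven where percentiles are closely spaced (30 and 33, 67 and 70), and it says nothing about the tails outside the supplied range. A parametric fit is preferable when a plausible family is known.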


Answer 3

Christopher Stokes on 12 Oct 2021

https://se.mathworks.com/matlabcentral/answers/1562016-how-do-i-generate-a-pdf-from-some-known-percentile-values#answer_806926

Hi, I have been using that function quite a bit recently, along with the excellent wrapper function allfitdist() from the File Exchange. My problem is that both rely on relatively large data samples from which to build a PDF, whereas what I have is a small number of summary statistics that describe the distribution (i.e. not data samples). I imagine there is a way to use fitdist to do what I need, but I can't envisage how this would work.
