How to fit a nonparametric distribution to a sample of known percentile values

Hello everyone
I have a set of percentile values that describe the distribution of possible earthquake acceleration levels that lead to the failure of a building component. I would like to fit a nonparametric model to these data. I know that, for a random sample of these earthquake acceleration levels, I could fit a nonparametric density using the ksdensity function, but is there a way to do a similar fit for the cumulative distribution function of this distribution when only percentile values are known?
Many thanks
Example data:
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
  5 Comments
Torsten on 14 Aug 2024
And doesn't the method allow approximating only between acc(1) and acc(9), where 89% of the mass is accumulated?
Jeff Miller on 14 Aug 2024
@Torsten Not completely. The smoothing would spill over at the edges, so for example the pdf at prctile 91 would depend a bit on what you assumed about the top 8%.


Answers (2)

Star Strider on 13 Aug 2024
The empirical cumulative distribution function ecdf would likely be appropriate here. (There is also ecdf however it seems less applicable to me.) There are a number of associated functions as well, linked to in that documentation page.
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
figure
ecdf(acc, 'Frequency',percentiles)
grid
axis('padded')
[f,x,flo,fup] = ecdf(acc, 'Frequency',percentiles)
f = 10x1
0 0.0067 0.0314 0.0919 0.1659 0.2825 0.4305 0.5987 0.7937 1.0000
x = 10x1
0.3339 0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025
flo = 10x1
NaN 0 0.0152 0.0651 0.1314 0.2407 0.3845 0.5532 0.7562 NaN
fup = 10x1
NaN 0.0143 0.0476 0.1187 0.2004 0.3243 0.4764 0.6441 0.8313 NaN
.
  8 Comments
Torsten on 14 Aug 2024
Edited: Torsten on 14 Aug 2024
So you want a smooth version of
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
plot(acc,percentiles/100)
to get an approximate CDF? Maybe fit a sigmoid function?
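One way the sigmoid idea could be sketched is a least-squares fit of a logistic CDF to the posted (acc, percentile) pairs. This is only an illustration under assumptions not made in the thread: the logistic form, the starting guess q0, and the names logisticCdf, sse, q0 and qHat are all chosen here for the example. Note that this is a two-parameter (parametric) curve, so it gives a smooth, valid CDF but is not a nonparametric fit.

```matlab
% Data from the question
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles/100;

% Assumed sigmoid: logistic CDF with location q(1) and scale exp(q(2)).
% Optimizing the log of the scale keeps the scale positive.
logisticCdf = @(q, a) 1 ./ (1 + exp(-(a - q(1)) ./ exp(q(2))));

% Sum of squared errors between the curve and the known percentiles
sse = @(q) sum((logisticCdf(q, acc) - p).^2);

% Rough starting guess (an assumption), then derivative-free minimization
q0 = [median(acc), log(std(acc)/2)];
qHat = fminsearch(sse, q0);

mu = qHat(1);        % fitted location
s = exp(qHat(2));    % fitted scale (positive by construction)
Ffit = logisticCdf(qHat, acc);   % fitted CDF at the known points
```

Plotting acc against p together with logisticCdf(qHat, linspace(0.3, 0.65, 200)) shows how closely the assumed form tracks the points; if the misfit is large, a different sigmoid would have to be tried.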
Star Strider on 14 Aug 2024
The pdf plots might look something like this:
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
[f,x,flo,fup] = ecdf(acc, 'Frequency',percentiles);
dfdx = gradient(f, x);
dpda = gradient(percentiles/100, acc);
figure
stairs(x, dfdx, 'DisplayName','From ‘ecdf’ Results')
hold on
stairs(acc, dpda, 'DisplayName','From Posted Vectors')
hold off
grid
xlabel('$x$', 'Interpreter','LaTeX')
ylabel('$\frac{dF(x)}{dx}$', 'Interpreter','LaTeX', 'FontSize',14)
legend('Location','best')
.
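As a possible smoother alternative to differencing the raw points, the density could be taken as the analytic derivative of a shape-preserving interpolant of the posted (acc, percentile) pairs. The sketch below is not from the thread: it assumes pchip is an acceptable interpolant, and the names pp, dpp, accq and pdfq are made up for the example. It only covers the range acc(1) to acc(end), which holds 92% - 3% = 89% of the probability mass; nothing is assumed about the tails.

```matlab
% Data from the question
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles/100;

% Piecewise cubic (pp form) through the known CDF points
pp = pchip(acc, p);

% Differentiate each cubic piece: [c3 c2 c1 c0] -> [3*c3 2*c2 c1]
[breaks, coefs] = unmkpp(pp);
dcoefs = coefs(:, 1:3) * diag([3 2 1]);
dpp = mkpp(breaks, dcoefs);

% Evaluate the implied density on a fine grid inside the known range
accq = linspace(acc(1), acc(end), 2000);
pdfq = ppval(dpp, accq);

% Mass covered by the known percentiles (should be close to 0.89)
massInside = trapz(accq, pdfq);
```

Because pchip keeps the interpolated CDF monotone, pdfq stays nonnegative; it is continuous but can have kinks at the data points.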



Image Analyst on 14 Aug 2024
You could fit a spline through them. The spline doesn't take any parameters, it just fits a cubic equation between each pair of points. See attached demo.
  1 Comment
Xavier on 14 Aug 2024
Thanks for this idea, but fitting a spline does not ensure that the fitted function will comply with the necessary conditions for being a CDF.
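One way to address the monotonicity concern might be a shape-preserving interpolant instead of an ordinary cubic spline. The sketch below uses pchip, which does not overshoot and stays monotone wherever the data are monotone, so the interpolated curve is nondecreasing between the known points. It is only an illustration: the grid accq and the check variables are names chosen for this example, and the curve covers only the 3rd to 92nd percentile, so the tails outside acc(1) and acc(end) would still need a separate assumption.

```matlab
% Data from the question
percentiles = [3 11 27 33 52 66 75 87 92];
acc = [0.3339 0.3595 0.4209 0.4283 0.4645 0.5010 0.5080 0.5713 0.6025];
p = percentiles/100;

% Shape-preserving piecewise cubic interpolation of the CDF points
accq = linspace(acc(1), acc(end), 200);
Fq = pchip(acc, p, accq);

% Sanity checks (small tolerance for floating-point round-off)
isMonotone = all(diff(Fq) >= -1e-12);
inRange = all(Fq >= p(1) - 1e-12 & Fq <= p(end) + 1e-12);
```

Calling plot(acc, p, 'o', accq, Fq, '-') overlays the interpolant on the known points; replacing pchip with spline in the same script shows whether the ordinary spline overshoots for these data.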

