# How do I determine the probability distribution of data?

Doug on 29 Mar 2012
Commented: mechE on 6 Apr 2018
Hello, I have a data set and I am trying to determine its probability distribution. The data are empirical, and I have no idea which distribution family they follow, let alone what the parameters would be. Is there a MATLAB function that can do that?

Richard Willey on 29 Mar 2012
Sorry if this sounds like a silly question:
Is there an absolute requirement that you describe your data using a parametric distribution? If so, why?
As an alternative, would something like the following suffice?
%% Generate some data
X1 = 10 + 5 * randn(200, 1);
X2 = 20 + 8 * randn(250, 1);
X = [X1; X2];
%% Fit a distribution using a kernel smoother
myFit = fitdist(X, 'kernel')
%% Visualize the resulting fit
index = linspace(min(X), max(X), 1000);
plot(index, pdf(myFit, index))
%% Generate a set of 500 random numbers drawn from the distribution
numbers = random(myFit, 500, 1);
numbers(1:10)
%% Inspect the complete set of methods for myFit
methods(myFit)
Tom Lane on 13 Apr 2012
His example produces a nonparametric density estimate that should be flexible enough to adapt to your data. It doesn't produce a named parametric distribution (normal, Weibull, etc.).
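If a named parametric family is needed after all, one common approach is to fit a short list of candidates by maximum likelihood and rank them with an information criterion. The following is only a sketch under stated assumptions: the data are synthetic placeholders, the candidate list is an arbitrary shortlist chosen for illustration, and the Lognormal, Gamma, and Weibull fits require strictly positive data. The lowest AIC marks the best of the candidates tried; it does not show that the family is the true one.

```matlab
% Sketch: rank a few candidate parametric families by AIC.
rng(0);                                   % reproducible synthetic data
X = gamrnd(2, 3, 500, 1);                 % placeholder sample; replace with your own data

% Assumed shortlist of families (all but Normal need strictly positive data)
candidates = {'Normal', 'Lognormal', 'Gamma', 'Weibull'};
aic = zeros(numel(candidates), 1);

for k = 1:numel(candidates)
    pd = fitdist(X, candidates{k});       % maximum likelihood fit
    % AIC = 2 * (number of parameters) + 2 * (negative log-likelihood)
    aic(k) = 2 * pd.NumParameters + 2 * negloglik(pd);
end

[~, order] = sort(aic);                   % smaller AIC is better
for k = order'
    fprintf('%-10s AIC = %.1f\n', candidates{k}, aic(k));
end
bestName = candidates{order(1)};          % best of the families tried
```

A small AIC gap between two families usually means the data cannot distinguish them, so it is worth checking the top candidates visually as well.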

Doug on 29 Mar 2012
Not sure. I need to find the distribution of the sum of n independent, identically distributed random variables. I haven't taken or used statistics in many years, so I tried to read up and found that the distribution of such a sum depends on the distribution family of the terms.
From my histogram, it "looks" like a gamma distribution. Is there a relatively straightforward way to verify that?
Thanks very much.
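One way to check the gamma guess is to fit a gamma distribution, compare it with the histogram, and run a goodness-of-fit test. The sketch below assumes positive data (a synthetic sample stands in for the real data here) and uses n = 5 purely as an example. It relies on the standard result that the sum of n independent Gamma(a, b) variables with a common scale b is Gamma(n*a, b). A test that does not reject the gamma hypothesis is not proof that the data are gamma distributed.

```matlab
% Sketch: test a gamma fit, then derive the distribution of the sum of n terms.
rng(1);                                   % reproducible synthetic data
X = gamrnd(2, 3, 500, 1);                 % placeholder sample; replace with your own data

pdGamma = fitdist(X, 'Gamma');            % a = shape, b = scale

% Visual check: normalized histogram with the fitted density on top
histogram(X, 'Normalization', 'pdf');
hold on
xs = linspace(0, max(X), 500);
plot(xs, pdf(pdGamma, xs), 'LineWidth', 1.5);
hold off

% Chi-square goodness-of-fit test; NParams = 2 because both parameters
% were estimated from the same data. h = 0 means "not rejected" at 5%.
[h, p] = chi2gof(X, 'CDF', pdGamma, 'NParams', 2);
fprintf('chi2gof: h = %d, p = %.3f\n', h, p);

% Sum of n i.i.d. Gamma(a, b) variables is Gamma(n*a, b)
n = 5;                                    % example value only
pdSum = makedist('Gamma', 'a', n * pdGamma.a, 'b', pdGamma.b);

% Monte Carlo cross-check: simulate row sums of n draws each
S = sum(random(pdGamma, 10000, n), 2);
fprintf('simulated mean of sum = %.2f, theoretical = %.2f\n', mean(S), mean(pdSum));
```

If no parametric family fits well, the same simulation step works with a kernel fit: draw an m-by-n matrix with random(myFit, m, n) and sum across each row.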
mechE on 6 Apr 2018
If anyone has found a solution, please post it here.