# Latin Hypercube sampling from discrete, non-uniform distribution

David on 1 Jun 2012
I have a series of independent events that are each assumed to follow a Poisson distribution.
I first calculate the number of events in an annual period using compound Poisson first principles.
Then I need to sample from the data to see which events occur (the number of events in any given year ranges from 0 to 5). Using brute-force Monte Carlo, I get convergence in the tail characteristics, but only after an extreme number of samples.
I would like to reduce the number of samples while preserving representative distributions, even when sampling in years with more than one event. For loops do not preserve the full distributions outside of the loop.
I do not see any built-in Latin hypercube functionality in the Statistics Toolbox that works with discrete distributions and preserves the distributions for multiple-event years.

Tom Lane on 2 Jun 2012
I can't think of any good way to use the Latin hypercube feature here. You can apply poissinv to a Latin hypercube sample inside the unit cube, but I suspect that will not help.
I don't completely understand your scenario. If your total event number is the sum of independent Poisson values, then the sum itself is also Poisson. Furthermore, I believe the individual values, conditional on the sum, have a multinomial distribution. Can you make use of that?
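For reference, the two facts mentioned above can be written out as follows. The symbols are introduced here for illustration only: $X_i$ is the annual count of event $i$, $\lambda_i$ its rate, $m$ the number of distinct events, and $\Lambda$ the total rate.

```latex
X_i \sim \mathrm{Poisson}(\lambda_i)\ \text{independent},\qquad
N = \sum_{i=1}^{m} X_i \sim \mathrm{Poisson}(\Lambda),\qquad
\Lambda = \sum_{i=1}^{m} \lambda_i

P(X_1 = x_1,\dots,X_m = x_m \mid N = n)
  = \frac{\prod_{i=1}^{m} e^{-\lambda_i}\,\lambda_i^{x_i}/x_i!}{e^{-\Lambda}\,\Lambda^{n}/n!}
  = \frac{n!}{x_1!\cdots x_m!}\,\prod_{i=1}^{m}\left(\frac{\lambda_i}{\Lambda}\right)^{x_i},
  \qquad \sum_{i=1}^{m} x_i = n
```

So, given $n$ events in a year, the event IDs are multinomial with probabilities $\lambda_i/\Lambda$, i.e. the relative rates.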
David on 2 Jun 2012
Thanks for responding.
Yes, since the distribution is compound Poisson, it behaves (as a whole) like a Poisson, and I am already doing a Monte Carlo simulation based on the multinomial distribution of relative rates. That is not the issue.
In the hope of adding clarity:
The source discrete distribution provides information for the following:
a) # of events per year (simple Poisson distribution based on the sum of the individual events' rates) and
b) event IDs (and financial impact) that occur in years with one or more events:
e.g. year 2 has one event - therefore one event is sampled from the source discrete distribution based on the relative Poisson rates (simple enough)
year 5 has 3 events - therefore 3 events are sampled from the source discrete distribution, etc.
The issue is how to efficiently sample so that the sampling for all 2nd events is representative of the source distribution. Likewise for all the 3rd events and so on.
I can approach the convolution statistics only after very large Monte Carlo simulations, which take a disproportionate amount of time (brute force).
I can envision a workaround if I had a function that first divided a non-uniform, discrete distribution into 'n+1' parts and then sampled once within each of these parts ('n' samples).
The issue so far is
1) I can't find a function which samples from within a segment of defined percentiles (I did find one that samples at the exact percentile or midpoint - which isn't a sample really)
and
2) using this function on a discrete, non-uniform distribution.
If you have any more thoughts, that would be great. Just taking the time to type this out helps clarify the issue.
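The workaround described above could be sketched roughly as follows, using only base MATLAB functions. This is a minimal illustration, not a toolbox feature: the variable names `rates`, `nEvents` and `ids` and all numeric values are made up for the example, and it uses n equal-probability strata for n samples. For each event position k, the years that have a k-th event get one uniform draw per stratum, in shuffled order, and each draw is mapped to an event ID through the inverse CDF of the discrete distribution.

```matlab
% Hypothetical inputs (illustrative values, not taken from the thread)
rates   = [0.40 0.10 0.30 0.20];      % Poisson rates of the individual events
nEvents = [0 1 3 2 1 0 5 1 2 1]';     % events per year; normally Poisson draws

p   = rates(:) / sum(rates);          % relative probabilities of the event IDs
cdf = cumsum(p);
cdf(end) = 1;                         % guard against round-off in the last bin

maxK = max(nEvents);
ids  = zeros(numel(nEvents), maxK);   % ids(y,k) = ID of k-th event in year y, 0 = none

for k = 1:maxK
    yrs = find(nEvents >= k);         % years that have a k-th event
    n   = numel(yrs);
    % one uniform draw inside each of n equal-probability strata of [0,1)
    u = ((0:n-1)' + rand(n,1)) / n;
    u = u(randperm(n));               % shuffle so strata are not tied to year order
    % inverse-CDF lookup: smallest index i with u <= cdf(i)
    ids(yrs, k) = sum(bsxfun(@gt, u, cdf'), 2) + 1;
end
```

Because the results are stored in `ids`, they remain available after the loop. Within each column the observed ID frequencies track `p` much more closely than independent draws would, since every stratum is hit exactly once. Note that the columns are shuffled independently, so the same event ID can appear more than once in a year, and whether that is acceptable depends on the model.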

Tom Lane on 2 Jun 2012
This is probably not what you want, but if you explain further why it's not what you want, maybe we can come closer to a solution. To sample intervals of unequal length or probability:
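edges = [0 .4 .5 .8 1]';             % cumulative probabilities that bound each interval
[~,b] = histc(rand(1000,1),edges);   % b = index of the interval each uniform draw falls in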
or
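v = mnrnd(1,diff(edges),1000);       % 1000 one-trial multinomial draws (indicator rows)
b = v*(1:4)';                        % convert each indicator row to an interval index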
To sample within a specified range of probabilities:
d = diff(edges);                     % width (probability) of each interval
[b, edges(b)+d(b).*rand(size(b))]    % interval index and a uniform draw inside that interval
Tom Lane on 5 Jun 2012
Can you not just sample, with replacement, from a multinomial distribution based on the separate Poisson rates? That may give you multiple cases of the same discrete event from the original distribution. But if the count for that event follows a Poisson distribution itself, I would have thought that was okay.