Latin Hypercube sampling from discrete, non-uniform distribution

I have a series of independent events that are each assumed to follow a Poisson distribution.
I first calculate the number of events in an annual period from compound Poisson first principles.
Then I need to sample from the data to see which events occur (the number of events in any given year ranges from 0 to 5). Using brute-force Monte Carlo, I get convergence in the tail characteristics, but only after an extreme number of samples.
I would like to reduce the number of samples while preserving representative distributions, even when sampling in years with more than one event. For loops do not preserve the full distributions outside of the loop.
I do not see any built-in Latin hypercube functionality in the Statistics Toolbox that handles discrete distributions and preserves the distributions for multiple-event years.
Thanks in advance.

Answers (2)

Tom Lane on 2 Jun 2012
I can't think of any good way to use the latin hypercube feature here. You can apply poissinv to a distribution inside the unit cube, but I suspect that will not help.
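For illustration only, applying poissinv to a Latin hypercube sample could look like the sketch below. The rate lambda and the number of years n are made-up values, not taken from the question.

```matlab
% Sketch only: lambda and n are hypothetical values chosen for illustration.
lambda = 1.2;                  % assumed total annual event rate
n      = 100;                  % number of simulated years
u      = lhsdesign(n, 1);      % one uniform draw in each of n equal-probability strata
N      = poissinv(u, lambda);  % inverse Poisson CDF maps each draw to an event count
```

This stratifies only the annual event counts; it says nothing about which events occur in a given year.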
I don't completely understand your scenario. If your total event number is the sum of independent Poisson values, then the sum itself is also Poisson. Furthermore, I believe the individual values, conditional on the sum, have a multinomial distribution. Can you make use of that?
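A minimal sketch of that decomposition, assuming a hypothetical vector of per-event rates: the annual total is drawn from a Poisson with the summed rate, and the split across event types is then multinomial given that total.

```matlab
% Sketch only: the rates below are hypothetical, not from the question.
rates = [0.3 0.5 0.2 0.2];          % assumed annual Poisson rate of each event type
total = sum(rates);                 % the sum of independent Poissons is Poisson(total)
N     = poissrnd(total);            % total number of events in one simulated year
if N > 0
    counts = mnrnd(N, rates/total); % per-type counts, multinomial given the total N
else
    counts = zeros(1, numel(rates)); % no events this year
end
```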
  1 Comment
David on 2 Jun 2012
Thanks for responding.
Yes, since the distribution is compound Poisson, it behaves (as a whole) like a Poisson, and I am already doing a Monte Carlo simulation based on the multinomial distribution of relative rates. That is not the issue.
In the hope of adding clarity:
The source discrete distribution provides information for the following:
a) # of events per year (simple Poisson distribution based on the sum of the individual events' rates) and
b) event IDs (and financial impact) that occur in years with one or more events:
e.g. year 2 has one event - therefore one event is sampled from the source discrete distribution based on the relative Poisson rates (simple enough)
year 5 has 3 events - therefore 3 events are sampled from the source discrete distribution, etc.
The issue is how to sample efficiently so that the sampling for all 2nd events is representative of the source distribution. Likewise for all the 3rd events, and so on.
I can approach the convolution statistics only after very large Monte Carlo simulations, which take a disproportionate amount of time (brute force).
I can envision a workaround if I had a function that first divided a non-uniform, discrete distribution into 'n+1' parts and then sampled once within each of these parts ('n' samples).
The issue so far is
1) I can't find a function which samples from within a segment of defined percentiles (I did find one that samples at the exact percentile or midpoint - which isn't a sample really)
and
2) using this function on a discrete, non-uniform distribution.
If you have any more thoughts, that would be great. Just taking the time to type this out helps to clarify the issue.
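One possible reading of this workaround, as a sketch rather than a tested solution: split the unit interval into n equal-probability strata, draw one uniform value inside each stratum, and map each value to an event ID through the cumulative distribution of the relative rates. The rates and n below are hypothetical.

```matlab
% Sketch only: rates and n are hypothetical values chosen for illustration.
rates    = [0.3 0.5 0.2 0.2];            % assumed Poisson rate of each event ID
p        = rates / sum(rates);           % relative probability of each event ID
cdfEdges = [0 cumsum(p)];                % cumulative probabilities bounding each ID
cdfEdges(end) = 1;                       % guard against round-off in the last edge
n  = 10;                                 % number of strata (= number of samples)
u  = ((0:n-1)' + rand(n,1)) / n;         % one uniform draw inside each stratum
u  = u(randperm(n));                     % shuffle so sample order is random
[~, id] = histc(u, cdfEdges);            % event ID whose CDF interval contains u
```

An event whose probability interval spans a stratum boundary can be picked from either stratum; that is expected and does not distort its overall probability.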



Tom Lane on 2 Jun 2012
This is probably not what you want, but if you explain further why it's not what you want, maybe we can come closer to a solution. To sample intervals of unequal length or probability:
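edges = [0 .4 .5 .8 1]';             % cumulative probabilities bounding 4 intervals
[~,b] = histc(rand(1000,1),edges);   % b = index of the interval each draw falls in
or
v = mnrnd(1,diff(edges),1000);       % 1000 one-trial multinomial draws (indicator rows)
b = v*(1:4)';                        % convert indicator rows to interval indices
To sample within a specified range of probabilities:
d = diff(edges);                     % width (probability) of each interval
[b, edges(b)+d(b).*rand(size(b))]    % interval index and a uniform draw inside it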
  6 Comments
David on 4 Jun 2012
Tom, my last comment crossed yours and only applies to the outstanding question.
As to your observation that discrete events might lie in more than one 'percentile bucket', that is precisely one of the issues I'm trying to resolve. I thought Latin hypercube sampling divided the bucket into 'x' number of segments and then randomly sampled from within each of those segments. In my attempt to solve the 1st part, I've sampled from each defined segment and then used that specific percentile to select the discrete event from the original distribution. Therefore an event that straddles a percentile boundary could be selected from either percentile segment, as in the case when n=10 in the solution above.
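The approach described here could be extended to several event slots at once with lhsdesign, so that the 1st, 2nd and 3rd events are each stratified across years. This is a sketch under assumed inputs: the rates, the number of years and the number of slots are all hypothetical.

```matlab
% Sketch only: rates, nYears and nSlots are hypothetical values.
rates    = [0.3 0.5 0.2 0.2];            % assumed Poisson rate of each event ID
cdfEdges = [0 cumsum(rates)/sum(rates)]; % cumulative relative probabilities
cdfEdges(end) = 1;                       % guard against round-off in the last edge
nYears = 50;                             % simulated years that have nSlots events
nSlots = 3;                              % events per year (1st, 2nd, 3rd event)
U = lhsdesign(nYears, nSlots);           % each column has one draw per 1/nYears stratum
[~, ids] = histc(U, cdfEdges);           % map every percentile to an event ID
```

Each column of ids then holds the event IDs for one slot, and each slot covers the full percentile range of the source distribution.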
Tom Lane on 5 Jun 2012
Can you not just sample, with replacement, from a multinomial distribution based on the separate Poisson rates? That may give you multiple cases of the same discrete event from the original distribution. But if the count for that event follows a Poisson distribution itself, I would have thought that was okay.
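A minimal sketch of sampling event IDs with replacement for one simulated year, assuming hypothetical rates and a hypothetical loss per event ID (neither is from the thread):

```matlab
% Sketch only: rates and losses are hypothetical values chosen for illustration.
rates  = [0.3 0.5 0.2 0.2];      % assumed Poisson rate of each event ID
losses = [10 25 5 40];           % assumed financial impact of each event ID
N      = poissrnd(sum(rates));   % number of events in this simulated year
if N > 0
    % Draw N event IDs with replacement, weighted by the relative rates.
    ids = randsample(numel(rates), N, true, rates/sum(rates));
    annualLoss = sum(losses(ids));
else
    ids = zeros(0, 1);           % no events this year
    annualLoss = 0;
end
```

The same event ID can appear more than once in ids, which is consistent with each event's own count being Poisson.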

