Is there a way to normalize a probability density function / keep the density values between 0 and 1?

I am creating a distribution for my data set: first a distribution object, then a pdf from that object. I was wondering whether there is a way to keep my density values between 0 and 1. I understand that, regardless of the density values, the curve still integrates to 1, so it's not an issue, but it would be nice if there were a way to limit the range to between 0 and 1.
I've considered dividing my density values by the maximum density value to get them between 0 and 1, but I am afraid that may have some ramifications.
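% Test_Data: the data set (assumed to be a numeric column vector in the workspace)
dist_object = fitdist(Test_Data,"Normal");      % fit a normal distribution to the data
dist_pdf = pdf(dist_object,Test_Data);          % density evaluated at each data point
Normal_hist = histfit(Test_Data,20,'normal');   % 20-bin histogram with the fitted normal curve
Normal_hist(1).FaceColor = [.8 .8 1];           % lighten the histogram bars
title('normal pdf');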

 Accepted Answer

Dividing by the max density value would have the unfortunate ramification that the resulting function would no longer be a density function because it would no longer integrate to 1.
It is a common misconception that pdf values should be between 0 and 1 like probabilities. As you said, though, the actual requirement for a pdf is that it integrates to one.
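As a rough numerical check of the point above, the following sketch integrates a normal pdf before and after dividing by its maximum. The values of mu and sigma are hypothetical, picked only so that the peak exceeds 1, and the pdf is written out explicitly so no toolbox function is needed.

```matlab
% Hypothetical parameters, chosen so that the peak density exceeds 1.
mu = 0;
sigma = 0.1;

% Normal pdf written out explicitly (no toolbox functions needed).
x = linspace(mu - 8*sigma, mu + 8*sigma, 20001);
f = exp(-(x - mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));

area_pdf = trapz(x, f);      % numerically very close to 1
peak = max(f);               % about 3.99, i.e. greater than 1

g = f ./ peak;               % rescaled so that max(g) is 1
area_scaled = trapz(x, g);   % equals area_pdf/peak, about 0.25

fprintf('peak = %.4f, area of pdf = %.4f, area after rescaling = %.4f\n', ...
        peak, area_pdf, area_scaled);
```

The rescaled curve stays between 0 and 1 but encloses an area of roughly sigma*sqrt(2*pi), so it no longer behaves as a density.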

More Answers (1)

You are fitting a normal distribution, which implies infinite extent.
The peak of a normal pdf is 1/(sigma*sqrt(2*pi)), reached at the mean. So whenever the standard deviation is at least 1/sqrt(2*pi) (about 0.399), no value of the pdf is greater than 1 and no rescaling is required. If the standard deviation is smaller than that, the pdf at the mean is greater than 1, and any rescaling of the density values would stop the curve from integrating to 1. You would need to increase the standard deviation to prevent that.
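The threshold on the standard deviation can be checked directly from the closed-form peak height of a normal pdf, 1/(sigma*sqrt(2*pi)). This is only a sketch; the list of sigma values is made up for illustration.

```matlab
% Peak of a normal pdf is 1/(sigma*sqrt(2*pi)), reached at x = mu.
sigma_threshold = 1/sqrt(2*pi);            % about 0.3989

% Hypothetical standard deviations on either side of the threshold.
sigmas = [0.1, sigma_threshold, 1, 5];
peaks = 1 ./ (sigmas .* sqrt(2*pi));

for k = 1:numel(sigmas)
    fprintf('sigma = %.4f -> peak density = %.4f\n', sigmas(k), peaks(k));
end
```

Only the first value, which lies below the threshold, gives a peak above 1; at the threshold itself the peak is 1.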
However, I wonder whether your data is truly sampled from a distribution with infinite extent. I have not looked at the data, but when a set of data is involved it is far more common that the situation involves a finite span... and finite spans are incompatible with a normal distribution. I would ask whether perhaps you should be looking at a beta distribution instead of a normal distribution.

4 Comments

Ohh, this is very helpful. I am not strictly fitting a normal distribution; I put that there as an example. In fact, the data is non-parametric. The idea is to find a better-fitting distribution so I can avoid the non-parametric pdf, and it just so happens that the distributions that seem to fit the data better have pdf values greater than 1. I was mostly curious rather than really needing a solution... but from what you said, it sounds like I will have to make do with whatever pdf values I get, or find a distribution with pdf values less than 1, which is fine. My question to you is: is there any way to at least scale it so that it looks reasonable? (I have to present the data, and the pdf values may throw people off.) By the way, if you end up looking at the data, please let me know if you have any suggestions, and thank you.
I do not know much about constructing non-parametric models.
However, sometimes a pdf greater than 1 is unavoidable, if most of the items are concentrated in a narrow band.
Consider a continuous uniform distribution from 0 to 1, with the same density p everywhere. The integral is the density p times the width of the interval; with a width of 1 the integral is p*1, and since the integral must be 1, it follows that the pdf p is 1.
Now use exactly the same distribution shape but make the width 1/2. int(p, x, 0, 1/2) = p/2 = 1, so the pdf is 2. It is unavoidable.
In any case where you have a finite span, you can end up with a pdf greater than 1 just by applying a linear transformation that compresses the x axis.
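The arithmetic above can be sketched in a few lines. The interval widths and the compression factor c are hypothetical values chosen for illustration; the scaling rule used is the standard change-of-variables result that if Y = c*X with c > 0, then f_Y(y) = f_X(y/c)/c.

```matlab
% Uniform density on an interval of a given width is 1/width inside it.
widths = [1, 1/2, 1/10];       % hypothetical interval widths
heights = 1 ./ widths;         % densities: 1, 2 and 10
areas = heights .* widths;     % area under each density: always 1

% Compressing the x axis: if Y = c*X, then f_Y(y) = f_X(y/c)/c.
% Hypothetical example: X is standard normal and c = 0.1.
c = 0.1;
peak_X = 1/sqrt(2*pi);         % about 0.399
peak_Y = peak_X / c;           % about 3.99, now greater than 1

fprintf('uniform heights: %s\n', mat2str(heights));
fprintf('peak before = %.4f, peak after compression = %.4f\n', peak_X, peak_Y);
```

The same thing happens if the data are simply expressed in larger units (for example metres instead of millimetres): the density values grow by the same factor by which the axis shrinks.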
I have noticed you editing your comment a couple of times, but I have not noticed any change in the wording?
"My question to you is, Is there any way to at least scale it so that it looks reasonable"
NO.
In some cases where most of your probability lies in a small region compared to the overall span, it may make sense to truncate the distribution: chop off the ends and multiply the pdf by whatever factor brings its integral over the remaining range back up to 1. But if you do that, you increase the density values in that range, and if you already have pdf values greater than 1, you end up with pdf values that are even further above 1.
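A minimal sketch of that truncation effect, assuming a normal distribution with hypothetical parameters and truncation limits. It uses only erf from base MATLAB to get the probability mass that is kept, rather than any toolbox distribution object.

```matlab
% Hypothetical normal distribution whose peak is already above 1.
mu = 0;
sigma = 0.3;
a = -0.3;                      % hypothetical lower truncation limit
b = 0.3;                       % hypothetical upper truncation limit

% Standard normal cdf written with erf (base MATLAB).
Phi = @(z) 0.5 * (1 + erf(z ./ sqrt(2)));
mass = Phi((b - mu)/sigma) - Phi((a - mu)/sigma);   % probability kept, about 0.68

x = linspace(a, b, 10001);
f = exp(-(x - mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));
f_trunc = f ./ mass;           % renormalized so it integrates to 1 on [a, b]

fprintf('peak before = %.4f, peak after truncation = %.4f, area = %.4f\n', ...
        max(f), max(f_trunc), trapz(x, f_trunc));
```

Because the kept mass is less than 1, dividing by it can only raise the density values, which is the effect described above.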

Asked: 17 Nov 2021
Commented: 29 Nov 2021