Problem in plotting confidence interval in a probability plot

12 views (last 30 days)
i have performed extreme distribution to a data set.I intended to draw 95% confidence interval in the probability plot.The following codding was used:
WL= [ 6.79 6.89 6.38 5.67 6.91 7.66 6.41 6.04 6.41 5.55 6.93 5.67 6.69 6.51 6.32 7.33 6.43 6.52 6.13 6.72 6.7 7.78 6.18 5.41 6.94 6.51 6.02 5.94 6.08 ]; % observed water level data pd = fitdist(WL','ev'); % Creates a ProbDistUnivParam object by fitting the data to a extreme value distribution. ci = paramci(pd); % This function calculates the values of the parameters based on a certain confidence interval. Here the by default the confidence interval is 95 percent probplot(WL); h=probplot(gca,@(WL,x,y)evcdf(WL,x,y),[ci(1,1),ci(1,2)]); set(h,'color','r','linestyle','-') t= probplot(gca,@(WL,x,y)evcdf(WL,x,y),[ci(2,1),ci(2,2)]); set(t,'color','g','linestyle','-.')
But there is a problem in confidence interval line(green one) which deviated in a wrong way.there might be some error in ci(confidence interval) boundary but i could not fix it.please help me out.

Accepted Answer

José-Luis
José-Luis on 6 May 2013
Edited: José-Luis on 7 May 2013
I thing there are two problems with your approach. One is that you could simplify your code like this:
WL= [ 6.79 6.89 6.38 5.67 6.91 7.66 6.41 6.04 6.41 5.55 6.93 5.67 6.69 6.51 6.32 7.33 6.43 6.52 6.13 6.72 6.7 7.78 6.18 5.41 6.94 6.51 6.02 5.94 6.08 ];
% observed water level data
pd = fitdist(WL','ev');
% Creates a ProbDistUnivParam object by fitting the data to a extreme value distribution.
ci = paramci(pd);
% This function calculates the values of the parameters based on a certain confidence interval. Here the by default the confidence interval is 95 percent
probplot('extreme value' , WL);
h=probplot(gca,@evcdf,[ci(1,1),ci(1,2)]);
set(h,'color','r','linestyle','-')
t= probplot(gca,@evcdf,[ci(2,1),ci(2,2)]);
set(t,'color','g','linestyle','-.')
Note that probplot is by default a normal distribution. You should specify whenever you use a different distribution. Secondly, what makes you think that there is an error with what you did? This is not curve fitting, you are comparing cdfs. There is no reason why the parameters you obtained from ci will give you two cdf's "above" and "below" the best fit cdf. You probably expect that from vanilla curve fitting. But these are not the "curves" you fitted to, these are cdf's.
Please take a look at the cdf's, as they might help you understand better:
WL= [ 6.79 6.89 6.38 5.67 6.91 7.66 6.41 6.04 6.41 5.55 6.93 5.67 6.69 6.51 6.32 7.33 6.43 6.52 6.13 6.72 6.7 7.78 6.18 5.41 6.94 6.51 6.02 5.94 6.08 ];
% observed water level data
pd = fitdist(WL','ev');
% Creates a ProbDistUnivParam object by fitting the data to a extreme value distribution.
ci = paramci(pd);
% This function calculates the values of the parameters based on a certain confidence interval. Here the by default the confidence interval is 95 percent
evBest = @(x) evcdf(x,pd.Params(1),pd.Params(2));
evLow = @(x) evcdf(x,ci(1,1),ci(1,2));
evHigh = @(x) evcdf(x,ci(2,1),ci(2,2));
fplot(evBest,[0 10],'k--');
hold on
fplot(evLow,[0 10],'r-');
fplot(evHigh,[0 10],'b-');
  2 Comments
Tom Lane
Tom Lane on 10 May 2013
This approach, while a good idea, requires that the lower and upper limits of the cdf be calculated from the paired lower and upper limits of the parameters. I don't believe that would work in general. It might take some figuring out to determine how to convert the parameter bounds into cdf bounds.

Sign in to comment.

More Answers (1)

Tom Lane
Tom Lane on 10 May 2013
You are right that probplot has no feature for including confidence bounds.
The cdf method for the probability distribution object, I believe, can return lower and upper bounds as its second and third outputs for some distributions. It appears not to be documented, but is visible in generated code from dfittool so I would expect it to be supported.
Try this:
x = evrnd(10,2,40,1);
p = fitdist(x,'ev')
xx = linspace(icdf(p,.001),icdf(p,.999));
n = length(x);
[f,flo,fhi] = cdf(p,xx);
plot(sort(x),icdf(p,(1:n)/(n+1)),'rx',xx,icdf(p,f),...
'b-',xx,icdf(p,flo),'r:',xx,icdf(p,fhi),'r:')
This code computes bounds on the cdf, then converts the cdf using the inverse cdf so the fit plots as a straight line. The resulting plot shows the observed points as red X's and the fitted quantiles as the blue line. This is basically the probability plot, except that the y axis is labeled with quantile values instead of probability values. Perhaps this will be adequate. You could try to label the y axis with the probability values if that is important.
  2 Comments
Tian
Tian on 12 Jun 2013
I just came across this answer, it works for standard probability functions. Do you know how to calculate the confidence interval for a user defined PDF and CDF? I was trying for a bi normal pdf. for example: fpdf=p*normpdf(mu1,sigma1)+(1-p)*normpdf(mu2,sigma2)
Tom Lane
Tom Lane on 12 Jun 2013
I don't think the toolbox has any support for compute confidence intervals for a user-defined cdf.

Sign in to comment.

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!