MATLAB Answers

Why is probplot not plotting the reference line for one matrix column?

Daniel Bridges on 26 Jan 2017
Edited: Daniel Bridges on 26 Jan 2017
Inputting a 1457x3 double matrix whose columns represent millimeter motion in 3D space, probplot is not plotting a reference line for the first column (lateral data). Why? (Removing this column, probplot works, so it appears limited to this data specifically.)
I have attached the DICOMallshifts data as the file created via csvwrite. Interestingly, WordPad shows -0 where MATLAB simply shows 0 (cf. the screenshot below). Does this mean that MATLAB is in fact treating the 0 as negative because of this line in my code, which redefines the axes?
DICOMallshifts = cmtomm*...
[-InterfractionalElektaMotion(:,1) -InterfractionalElektaMotion(:,3) ...
Are all the -0 data causing this error in probplot?
Here is the code generating the probplot figure:
Here is a screenshot showing the problem:
Discrepancy between MATLAB and data written to file:



Answers (1)

Massimo Zanetti on 26 Jan 2017
The problem is not due to -0 (which is the same as 0). It seems that the data in your first column are not even close to being Gaussian distributed. If you plot the histogram of the first column, you will see that it has just a single peak at 0.
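A minimal sketch of the -0 point, using only base MATLAB and made-up values rather than the attached data: negating a zero sets the IEEE 754 sign bit, which text output may display as -0, yet the value still compares equal to 0.

```matlab
% Illustrative values only (not the attached DICOMallshifts data).
x = [0 1.5 -2];
y = -x;                          % negating 0 yields negative zero

isZero  = (y(1) == 0);           % -0 and 0 compare equal
signBit = 1 / y(1);              % dividing by -0 exposes the stored sign
asText  = sprintf('%g', y(1));   % text form, which may carry the minus sign

disp(isZero);
disp(signBit);
disp(asText);
```

The comparison is what matters for numerical routines; the minus sign only appears when the value is written out as text.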


Daniel Bridges on 26 Jan 2017
I appreciate your agreement that the data is not Gaussian (which I intend to show visually via probplot). I think probplot should still work to show how it deviates from a Gaussian distribution.
Looking at the histogram, perhaps you are correct: Is there not enough data to fill bins to generate the normal distribution reference line? (There is more non-zero data for the other two columns.) Would you recommend another tool if probplot isn't applicable?
Massimo Zanetti on 26 Jan 2017
My suspicion is that, with such a large number of identical values (specifically, many 0s), the line cannot be drawn because its slope is close to infinite (a vertical line cannot be defined as a function).
Inspect this:
It throws no errors, but the line is not plotted.
Daniel Bridges on 26 Jan 2017
To be fair, MathWorks says in the More About section of their plot documentation, "Use NaN and Inf values to create breaks in the lines."
In this case, the plot input does not cause an error because you are validly telling MATLAB to break the plot between each vector element: There is no plot because you told it not to plot, not because the slope is too steep.
Of course I agree with you that a multivalued "function" like a vertical line cannot be plotted if the program expects a one-to-one relation.
Thank you for your theory that the line is too steep for probplot to plot. I suspect there is not enough data apart from the peak to determine "the lower and upper quartiles" for the "reference line" to 'pass through', to quote probplot.m, and that the problem is related to what chi2gof reports. The 'More About' section of the chi2gof documentation describes the algorithm:
"chi2gof compares the value of the test statistic to a chi-square distribution with degrees of freedom equal to nbins - 1 - nparams, where nbins is the number of bins used for the data pooling and nparams is the number of estimated parameters used to determine the expected counts. If there are not enough degrees of freedom to conduct the test, chi2gof returns the p-value as NaN."
Indeed, for this problematic data set the p-value is reported as NaN. So I think your comment is correct, and that this distribution does not have enough distinct values to test against a normal distribution.
I would still like confirmation and a recommendation for other applicable tools, however: I am seeking to model this data. Numerous authors of motion management assume normal distributions, and if we cannot do so for this motion ...
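The quartile argument in this exchange can be checked with a small sketch. It uses synthetic data (mostly zeros plus a few spread-out values, an assumption standing in for the lateral column) and the Statistics Toolbox functions quantile and norminv. It is not a copy of what probplot.m does internally; it only shows that when both data quartiles fall on the same value, a line through the two quartile points has no finite slope.

```matlab
% Synthetic stand-in for a zero-dominated column (hypothetical values,
% not the attached DICOMallshifts data).
x = [zeros(1200,1); linspace(-5,5,257)'];

qData = quantile(x, [0.25 0.75]);   % lower and upper data quartiles
qNorm = norminv([0.25 0.75]);       % matching standard normal quantiles

% Slope of a line through the two quartile points, with the data on the
% horizontal axis and the normal quantiles on the vertical axis.
slope = diff(qNorm) / diff(qData);

fprintf('data quartiles: %g, %g\n', qData);
fprintf('slope through quartile points: %g\n', slope);
```

Because more than half of the synthetic values are zero, both quartiles coincide and the slope is infinite, which is consistent with the suggestion above that a vertical reference line cannot be drawn.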
