Plot mean and standard deviations along with data on a bell curve

I have columns of survey data, about 120 rows each, with values from 1 through 5. Some entries are NaN. I can calculate the mean and standard deviation of each column, but I would like to plot the mean and the standard deviations together with the actual data on a bell curve. I found the code below, which at least plots some data, but I am not sure how to change it to represent my data correctly. For instance, I don't think I need the randn function, given that I already have data. In short, I want to plot my data, the mean, and the standard deviations (out to plus and minus 3 sigma) for every column, similar to what this code produces.
x = .03*randn(10000,1)+.34;
[N,X] = hist(x,100);
hfig = figure;
bar(X,N)
hold on;
y = [0 1.2*max(N)];
center = mean(x);
std1 = std(x);
%center plot
plot([center center],y,'r-.')
%1 std
plot([center center]+std1,y,'g-.')
plot([center center]-std1,y,'g-.')
%2 std
plot([center center]+2*std1,y,'k-.')
plot([center center]-2*std1,y,'k-.')
  10 Comments
Sunshine on 19 Jun 2020
Thanks a bunch! This really worked well for me. I was also able to modify the code to include additional standard deviations.
I am curious. I found code and modified it to what you see below. My goal is to calculate the percentages of the data for column p1. So for instance, I want to determine how much of the data in a particular column are 5s, 4s, 3s, etc.
Code
numberOfBins = max(personality_cols.p1(:));
countsPercentage = 100 * hist(personality_cols.p1(:), numberOfBins) / numel(personality_cols.p1)
Answer
countsPercentage =
9.0909 4.1322 13.2231 33.0579 24.7934
The countsPercentage values do not sum to 100; they total 84.3. Is the remaining percentage the NaN values? How do I get the percentage of NaN values, so that I know the other percentages are correct? How do I exclude the NaN values so that 100 percent covers only the actual data (5s, 4s, 3s, etc.)? And how do I know which value each percentage belongs to (for example, is 9.0909 the share of 1s or of 5s in the p1 column)?
Image Analyst on 19 Jun 2020
You can use the isnan() function along with sum() to count the NaNs in a vector, then divide by the number of elements to get the percentage.
numNans = sum(isnan(yourVector));
percentNans = 100 * numNans / numel(yourVector);
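To follow up on the other two questions, here is a minimal sketch of one way to get percentages that add up to 100 and to see which answer each percentage belongs to. The vector p1 is a made-up stand-in for personality_cols.p1, and the variable names are only illustrative. It counts each answer explicitly instead of using hist(), so there is no guessing about bin order. In the posted output, the total of 84.3 suggests that the NaNs were left out of the counts but still included in numel(), and since histogram bins run from smallest to largest, the first entry should be the 1s and the last the 5s, assuming every answer from 1 to 5 occurs in the column.

```matlab
% Hypothetical stand-in for personality_cols.p1: survey answers 1..5 with NaNs.
p1 = [5 4 4 NaN 3 1 4 5 NaN 2 4 3]';

valid  = p1(~isnan(p1));           % drop NaNs before counting
values = 1:5;                      % the possible survey answers
counts = zeros(size(values));
for k = 1:numel(values)
    counts(k) = sum(valid == values(k));   % how many answers equal values(k)
end

pctOfValid = 100 * counts / numel(valid);        % percentages of non-NaN data only
pctNaN     = 100 * sum(isnan(p1)) / numel(p1);   % share of missing answers

% One row per answer value, so each percentage is labeled with its value.
result = table(values', counts', pctOfValid', ...
    'VariableNames', {'Value', 'Count', 'Percent'});
disp(result)
fprintf('NaN: %.1f%% of all %d rows\n', pctNaN, numel(p1));
```

Dividing by numel(valid) rather than numel(p1) is what makes the five percentages sum to 100. If you would rather keep the NaNs in the denominator, the five percentages plus pctNaN should then sum to 100 instead.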


Answers (1)

Image Analyst on 22 May 2020
Then if it's not normally distributed data, why do you want to fit a bell curve to it?
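Did you try fitdist()?
load hospital                     % sample dataset from the Statistics and Machine Learning Toolbox
x = hospital.Weight;
pd = fitdist(x,'Normal')          % fit a normal distribution (no semicolon, so the fit is displayed)
x_values = 50:1:250;              % range of weights to evaluate
y = pdf(pd,x_values);             % fitted probability density at those weights
plot(x_values,y,'LineWidth',2)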
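If the goal is still to see a survey column against a bell curve, something along these lines might work as a starting point. It is only a sketch: the vector x is made-up data standing in for one survey column, the variable names are illustrative, and the normal density is computed directly from the sample mean and standard deviation instead of with fitdist(), so it does not need the toolbox. Keep in mind that data with only five discrete values will not necessarily look bell-shaped.

```matlab
% Hypothetical stand-in for one survey column (answers 1..5, some NaN).
x = [5 4 4 NaN 3 1 4 5 NaN 2 4 3 4 5 3]';

valid = x(~isnan(x));              % remove NaNs before computing statistics
mu    = mean(valid);
sigma = std(valid);

% Proportion of each answer value among the non-NaN data.
values = 1:5;
props  = arrayfun(@(v) sum(valid == v), values) / numel(valid);

figure;
bar(values, props, 1);             % width 1, so bar heights are comparable to a density
hold on;

% Normal density using the sample mean and standard deviation.
xx = linspace(mu - 3*sigma, mu + 3*sigma, 200);
yy = exp(-(xx - mu).^2 / (2*sigma^2)) / (sigma * sqrt(2*pi));
plot(xx, yy, 'LineWidth', 2);

% Vertical lines at the mean and at +/- 1, 2 and 3 standard deviations.
yl = [0 1.2 * max([props, yy])];
plot([mu mu], yl, 'r-.');
colors = {'g', 'k', 'm'};
for k = 1:3
    plot([mu mu] + k*sigma, yl, [colors{k} '-.']);
    plot([mu mu] - k*sigma, yl, [colors{k} '-.']);
end
hold off;
xlabel('Survey answer');
ylabel('Proportion / density');
```

To cover every column, the same steps could be wrapped in a loop over the columns, with one figure or subplot per column.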
