How can I add individual datapoints and connecting lines to grouped boxchart?

Hi,
I'm trying to make a boxchart with grouped data. There are two sessions and two groups, so the chart needs to have two boxcharts per session (one for group A and one for group B). So in total there are 4 boxcharts. This is what I did:
opts = detectImportOptions('Data_example.xlsx');
opts.SelectedVariableNames = [1 2 3]; % specify which columns to read
[Group, Session, Value] = readvars('Data_example.xlsx', opts);
preview ("Data_example.xlsx",opts)
ans = 8×3 table
    Group     Session       Value
    _____    _________    _________
    {'A'}    {'test1'}       -0.673
    {'A'}    {'test1'}     -0.83626
    {'A'}    {'test1'}    -0.048476
    {'A'}    {'test1'}       0.0793
    {'B'}    {'test1'}      0.16921
    {'B'}    {'test1'}       0.0249
    {'B'}    {'test1'}        0.059
    {'B'}    {'test1'}      0.13982
% Prepare variables
Session = categorical (Session);
Group = categorical (Group);
% Make boxplot
figure;
boxchart (Session, Value, 'GroupByColor', Group);
legend
xlabel ('Session');
ylabel ('Value');
So far, so good.
Now I would like to add the individual data points to each boxchart and connect the corresponding points with a straight line. This is my attempt, but it's not working:
figure;
boxchart (Session, Value, 'GroupByColor', Group);
legend;
xlabel ('Session');
ylabel ('Value');
hold on
plot (Session, Value, 'ko', 'LineWidth', 2);
Obviously this is not what I was looking for: there are only two columns of data points, but I'd like to show the points superimposed on each of the boxcharts, if that makes sense.
Does anybody know how I could do that?
Many thanks,
Dobs
  6 Comments
dpb on 28 Sep 2022
"Starting in R2022a, the objects with numeric values retain their original class."
That needs to be made EXTREMELY clear in the doc. I, at least, couldn't find it mentioned even in passing in any search I tried. Some of these Q?/Answers with more than trivial uses also need to be moved into the doc examples. It's been a real beef of mine "since forever" that so many examples show nothing but the barest trivial use of a given parameter; there's nothing whatever to be learned from those.
"...shows scatter using a grouping variable?"
My use of it is in <Answer Link> but I learned the trick from @Chunru's previously-posted answer to the same Q? -- I simply copied (stole?) that part of his and extended the graphics to be more closely aligned with the desires of the original poster. But, unless I'm missing something, I can't find that use/syntax for scatter documented anywhere.


Answers (1)

dpb on 27 Sep 2022
Edited: dpb on 28 Sep 2022
fn=websave('Data_example.xlsx','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1136410/Data_example.xlsx');
tD=readtable(fn);
tD=convertvars(tD,@iscellstr,'categorical');
hBX=boxchart(tD.Session,tD.Value,'GroupByColor',tD.Group);
drawnow % have to force update online for data to become visible
hNC=hBX(1).NodeChildren; % some of the goodies under the hood that are visible reside here
%get(hNC(5))
X=mean(hNC(5).VertexData(1,1:2)); % mean X location first box
%V
hold on
hSC=scatter(X,tD.Value(tD.Group=='A'&tD.Session=='test1'),[],'k');
This puts the data points on the box where they belong. It's a lot more work(*) than it ought to be because TMW didn't see fit to return the coordinates of the grouped boxes, but one can revert to the old kludges one had to use in the olden days with the bar function: find the underlying data and compute where the boxes are.
The code above, past the generation of the chart, is specific to the first box. There is a handle array with one element per group, and under each of those another array of object handles for the boxes.
I don't have anything past R2020b installed yet, so trying to debug interactively here is a pain, and it takes a later release to be able to draw a numeric value onto the categorical axes (I'm not sure yet which release introduced that feature). So the job left is to create the for...end loop structure to iterate over the box handles and groups and index into the proper arrays. With some study, much of that can probably be vectorized: the VertexData is a 3x8 array (xyz vertically; z --> 0) of vertices for the two groups; if there were three groups it would be 3x12, so one can compute which indices are the locations needed. This computes the position from the actual vertices instead of trying to work out what ratio is used internally from the input data sizes. For simple cases that ratio is probably not too hard to figure out and could make for simpler coding, but perhaps less general; who knows what lurks underneath?
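For the "simple case" route, here is a minimal sketch that sidesteps the hidden vertex data altogether. It does not use 'GroupByColor'; instead it swaps in one boxchart call per group on a plain numeric x-axis, so the box positions are known by construction. Everything specific in it is an assumption for illustration: the values are made up (this is not Data_example.xlsx), the offsets and box width are hand-picked, and row i within a group is assumed to be the same subject in every session, which is what decides which points get connected.

```matlab
% Sketch only: made-up data and hand-picked layout constants.
rng(0);                                  % reproducible fake values
groups   = {'A','B'};
sessions = {'test1','test2'};
nSubj    = 4;                            % assumed number of subjects per group
offset   = [-0.2 0.2];                   % assumed shift of each group around the session tick
boxWidth = 0.3;                          % assumed; keep it below the gap between offsets
colors   = lines(numel(groups));

figure; hold on
hB = gobjects(1, numel(groups));
for g = 1:numel(groups)
    % Rows = subjects, columns = sessions.
    % ASSUMPTION: row i is the same subject in every session.
    Y = randn(nSubj, numel(sessions)) + g;
    X = repmat((1:numel(sessions)) + offset(g), nSubj, 1);  % numeric x of every point
    hB(g) = boxchart(X(:), Y(:), 'BoxWidth', boxWidth, 'BoxFaceColor', colors(g,:));
    % Each column of X.' and Y.' is one subject, so plot draws one line per subject.
    plot(X.', Y.', '-o', 'Color', [0.4 0.4 0.4], 'MarkerFaceColor', colors(g,:));
end
xticks(1:numel(sessions)); xticklabels(sessions);
xlabel('Session'); ylabel('Value');
legend(hB, groups);
```

With real data, Y would be filled from the table for each group instead of randn, and the pairing assumption would need to be checked against whatever identifies a subject in the file.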
(*) Of course, it doesn't look like all that much work once it's done; I think that's part of the problem from the TMW side in their lack of leaving things visible. The folks who already know the internals of the pertinent objects can address the pieces knowing they're there and where; the end user (even those of us with a lot of "time in grade") has to go "handle diving" and use tools like Yair Altman's spelunking tool to find the hidden properties where the necessary information may be hidden. That didn't used to be so bad; everything was built around a base axes object and one could (eventually) virtually always find the handle to it, and then all the children fell out in a row. Now, however, they've fallen in love with the idea of these composite "specialty chart objects" and the axes are totally opaque: you simply can't find the axes handle anymore, as in this case. It's easy to complete an analysis in about 15 minutes and then waste two weeks trying to get a desired presentation format.
%legend
%xlabel ('Session');
%ylabel ('Value');
