How to make sparsity pattern graph with grouping

Hi. I would like to create a graph looking like the following:
The rows correspond to periods (months), and the columns to a set of features/predictors/X's. The features have been assigned to a category, which corresponds to the different colors.
The data I would like to plot in a similar way have the following form (assuming 30 periods and 15 predictors):
%Create a Sparse Matrix M:
T=30; N=15;
M=zeros(T,N);
for i=1:T
M(i,randsample(1:N, 6, false))=1;
end
TT = array2timetable(M,'RowTimes',dateshift(datetime('today'),'end','month',-T+1:0),'VariableNames',"x"+string(1:N))
TT = 30x15 timetable
       Time        x1    x2    x3    x4    x5    x6    x7    x8    x9    x10    x11    x12    x13    x14    x15
    ___________    __    __    __    __    __    __    __    __    __    ___    ___    ___    ___    ___    ___
    30-Nov-2021    1     0     0     1     1     1     0     0     0     1      0      0      0      0      1
    31-Dec-2021    0     0     1     1     0     0     0     1     1     0      0      1      1      0      0
    31-Jan-2022    1     1     0     0     1     0     0     0     0     0      0      0      1      1      1
    28-Feb-2022    0     0     0     1     0     0     1     0     0     0      1      1      1      1      0
    31-Mar-2022    0     0     0     1     0     0     0     0     1     0      0      1      1      1      1
    30-Apr-2022    1     0     1     0     1     1     0     0     1     1      0      0      0      0      0
    31-May-2022    1     0     0     1     0     1     1     1     0     0      0      0      0      0      1
    30-Jun-2022    1     0     0     0     1     0     1     0     1     0      0      0      1      0      1
    31-Jul-2022    0     0     0     0     1     0     0     0     0     1      1      1      1      1      0
    31-Aug-2022    1     0     0     1     1     0     0     0     0     1      1      0      0      0      1
    30-Sep-2022    1     0     0     0     1     0     0     1     0     0      0      1      0      1      1
    31-Oct-2022    0     1     1     0     0     0     0     1     0     1      0      0      1      1      0
    30-Nov-2022    0     1     1     0     1     0     1     1     0     0      1      0      0      0      0
    31-Dec-2022    0     0     1     1     0     0     1     1     1     0      1      0      0      0      0
    31-Jan-2023    0     1     0     0     1     0     1     1     0     1      0      0      0      0      1
    28-Feb-2023    0     0     1     1     1     1     0     0     1     1      0      0      0      0      0
c = [repmat("Financial",[1,5]) repmat("Econ",[1,8]) repmat("Survey",[1,2])]';
c = c(randsample(1:N,N,false));
categoriez = array2table(c,'RowNames',"x"+string(1:N),"VariableNames","categ")
categoriez = 15x1 table
              categ
           ___________
    x1     "Econ"
    x2     "Survey"
    x3     "Econ"
    x4     "Financial"
    x5     "Econ"
    x6     "Survey"
    x7     "Financial"
    x8     "Econ"
    x9     "Econ"
    x10    "Econ"
    x11    "Financial"
    x12    "Financial"
    x13    "Financial"
    x14    "Econ"
    x15    "Econ"
I would like to create a graph like the one at the top, using the data in the timetable TT and the table categoriez. Any help would be appreciated!

 Accepted Answer

Using pcolor
This solution uses pcolor and converts the table of 0s and 1s into grouping values that are used to assign color to each cell.
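To make the conversion step concrete, here is a minimal sketch on a tiny made-up matrix. The 3-by-4 matrix M and the category codes in catVals are illustrative values, not the OP's data. Zeros become NaN so that pcolor leaves those cells uncolored, and each remaining 1 is replaced by the category code of its column through implicit expansion (R2016b or later).

```matlab
% Illustrative values only: 3 periods, 4 predictors
M = [1 0 1 0;
     0 1 1 0;
     1 0 0 1];
catVals = [2; 1; 3; 1];   % hypothetical category code of each column

m = M;
m(m==0) = nan;            % zeros are left uncolored
m = m .* catVals.';       % each 1 becomes the code of its column
disp(m)
```

Every column of m now holds either NaN or a single category code, which is what lets one colormap row stand for one category.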
OP's demo code
T=30; N=15;
M=zeros(T,N);
for i=1:T
M(i,randsample(1:N, 6, false))=1;
end
TT = array2timetable(M,'RowTimes',dateshift(datetime('today'),'end','month',-T+1:0),'VariableNames',"x"+string(1:N));
c = [repmat("Financial",[1,5]) repmat("Econ",[1,8]) repmat("Survey",[1,2])]';
c = c(randsample(1:N,N,false));
categoriez = array2table(c,'RowNames',"x"+string(1:N),"VariableNames","categ");
% Convert strings to categorical
categoriez.categ = categorical(categoriez.categ);
% Convert the categoricals to integers for grouping
catVals = double(categoriez.categ);
% Replace the 0s in TT with NaNs and the 1s with the grouping values.
m = TT{:,:};
m(m==0) = nan;
m = m.* catVals.';
m(:,end+1) = nan; % pad 1 column so last column of data is shown
m(end+1,:) = nan; % pad 1 row so last row of data is shown
% Plot the results.
h = pcolor(1:width(m),[TT.Properties.RowTimes;NaT],m);
% Set color
ncats = numel(unique(catVals));
colormap(jet(ncats)) % nx3 matrix where n = number of categories
% Show colorbar with centered ticks
cb = colorbar();
cb.Ticks = linspace(1,ncats,ncats+1) + (ncats-1)/(ncats*2);
cb.TickLabels = categories(categoriez.categ);
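The tick expression can look opaque, so here is a small sketch of the arithmetic. It assumes the color limits are [1, ncats], which is what automatic scaling gives when every category occurs at least once. The colorbar is then split into ncats equal bands, and each label belongs at the middle of its band. The variable names edges, centers and answerTicks are mine, not part of the answer.

```matlab
ncats = 3;                                  % e.g. Econ, Financial, Survey
edges   = linspace(1, ncats, ncats+1);      % band boundaries on the colorbar
centers = edges(1:end-1) + diff(edges)/2;   % midpoint of each band

% The expression used in the answer gives the same values plus one extra
% tick above the upper color limit, which lies outside the colorbar range.
answerTicks = linspace(1,ncats,ncats+1) + (ncats-1)/(ncats*2);
disp(centers)
disp(answerTicks(1:ncats))
```

If you prefer exactly one tick per label, cb.Ticks = centers; should work as a drop-in alternative under the same assumption.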
Using scatter
Another approach is to use scatter, which, unlike line objects, lets you control the color of each marker individually through its CData.
% Find the row and column coordinates of 1 values.
[row,col] = find(TT{:,:});
% Compute the color values (m)
% Convert strings to categorical
categoriez.categ = categorical(categoriez.categ);
% Convert the categoricals to integers for grouping
catVals = double(categoriez.categ);
% Replace the 0s in TT with NaNs and the 1s with the grouping values.
m = TT{:,:};
m(m==0) = nan;
m = m.* catVals.';
% plot the scatter with filled markers
scatter(col,TT.Properties.RowTimes(row),60,m(~isnan(m)),'filled')
% Set color
ncats = numel(unique(catVals));
colormap(jet(ncats)) % nx3 matrix where n = number of categories
% Show colorbar with centered ticks
cb = colorbar();
cb.Ticks = linspace(1,ncats,ncats+1) + (ncats-1)/(ncats*2);
cb.TickLabels = categories(categoriez.categ);
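As a side note, the color vector passed to scatter can also be built without the intermediate matrix m. Since find returns the column index of every 1, indexing catVals with col gives the category code of each marker directly. The sketch below checks on made-up data that both routes agree; M and catVals are illustrative values, and cViaMatrix and cDirect are names I chose.

```matlab
% Illustrative values only: 3 periods, 4 predictors
M = [1 0 1 0;
     0 1 1 0;
     1 0 0 1];
catVals = [2; 1; 3; 1];        % hypothetical category code of each column

[row,col] = find(M);           % coordinates of the 1s, in column-major order

% Route used in the answer: mask the matrix, then drop the NaNs
m = M;
m(m==0) = nan;
m = m .* catVals.';
cViaMatrix = m(~isnan(m));

% Shorter route: look up the code of each marker's column
cDirect = catVals(col);

assert(isequal(cViaMatrix, cDirect))
```

With the variables from the answer, this would mean passing catVals(col) as the color argument of scatter.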

4 Comments

That's very good indeed. Thank you Adam!
In the graph I provided, they sort the X's wrt the categories which results in colors being grouped together. I can adjust the solution you gave by sorting the timetable and the table:
[~,idx] = sort(categoriez.categ)
categoriez = categoriez(idx,:)
TT = TT(:,idx)
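For reference, a self-contained sketch of that reordering on made-up data. The names vars, catg and X are placeholders. The point is only that one index vector from sort is applied to both the category list and the columns of the data, so the two stay aligned.

```matlab
% Illustrative values only: 2 periods, 5 predictors
vars = "x" + string(1:5);
catg = categorical(["Econ";"Survey";"Econ";"Financial";"Survey"]);
X    = [1 0 1 0 0;
        0 1 0 1 1];

[catgSorted, idx] = sort(catg);   % idx groups equal categories together
Xsorted    = X(:,idx);            % reorder the data columns the same way
varsSorted = vars(idx);           % and the predictor names

disp(varsSorted)
disp(catgSorted.')
```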
However, it seems that the graph in the picture I provided uses a line plot with markers, probably with the line color set to white so that only the markers remain. I'll wait a bit to see if someone can come up with that solution, which seems a bit more condensed. In reality, I have 280 predictors, 220 periods, and 12 categories.
I recommend using scatter instead of line. With scatter, you can control the marker color for each marker. I updated my answer to show how that would look.
It now looks even better. Is there any alternative to the colorbar for showing the categories?
You could use a legend instead but you'll need to use a hack to create the legend objects.
Add this to your code after the scatter example. I did not test this, so check that the order of the colors and labels is correct.
colors = jet(ncats);
names = categories(categoriez.categ);
hold on
h = gobjects(1,ncats);
for i = 1:ncats
    % One invisible dummy point per category, used only for the legend.
    % NaT (not nan) is used for y because the y-axis holds datetime values.
    h(i) = plot(nan,NaT,'o',...
        'MarkerEdgeColor','none',...
        'MarkerFaceColor',colors(i,:),...
        'MarkerSize',8, ...
        'DisplayName', names{i});
end
legend(h)
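An alternative that avoids the dummy objects, sketched under the assumption that one scatter call per category is acceptable: each call gets a single color and a DisplayName, so legend can pick up the entries directly. The data below (M, t, categ) are made-up stand-ins. With the OP's variables you would use TT{:,:}, TT.Properties.RowTimes and categoriez.categ instead.

```matlab
% Illustrative values only: 4 periods, 5 predictors
M = [1 0 1 0 0;
     0 1 0 0 1;
     1 1 0 1 0;
     0 0 1 1 1];
t = dateshift(datetime(2024,1,1),'end','month',0:3).';   % month-end dates
categ = categorical(["Econ";"Survey";"Econ";"Financial";"Survey"]);

[row,col] = find(M);
names   = categories(categ);
ncats   = numel(names);
colors  = jet(ncats);
catVals = double(categ);
grp     = catVals(col);            % category code of each marker

figure
h = gobjects(1,ncats);
for k = 1:ncats
    in = grp == k;                 % markers belonging to category k
    h(k) = scatter(col(in), t(row(in)), 60, colors(k,:), 'filled', ...
        'DisplayName', names{k});
    hold on
end
hold off
legend(h,'Location','eastoutside')
```

The trade-off is one graphics object per category instead of a single scatter, which should be fine for a dozen categories.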

Release: R2020b
Asked: 17 Apr 2024
Edited: 24 Apr 2024
