How do I group excel data by a keyword?

7 views (last 30 days)
In the excel sheet attached, there is a lithology column included (column D). I need to be able to group each row based on their lithology. (Muds group together, interbedded silt(stone) and mud(stone) group together etc.). It seems to be a realitivly easy task but I'm quite new to matlab and I could not find other solutions by searching. Any help is appriciated.

Accepted Answer

Voss
Voss on 26 Mar 2022
t = readtable('Alaska_1418.xlsx')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
t = 301×8 table
Hole Depth_mbsf_ MAJLith_Prefix MAJLith_PrincipalName BulkDensity_g_cm__ Porosity_v_v_ VoidRatio waterContent _____ ___________ __________________ __________________________________________ __________________ _____________ _________ ____________ {'A'} 0.74 {'diatom bearing'} {'mud' } 1.629 0.689 2.217 0.7648 {'A'} 2.25 {'diatom bearing'} {'mud' } 1.53 0.724 2.623 0.94034 {'A'} 4.16 {0×0 char } {'mud' } 1.792 0.577 1.367 0.49249 {'A'} 6.8 {0×0 char } {'interbedded silt(stone) and mud(stone)'} 1.868 0.62 1.633 0.51502 {'A'} 8.27 {0×0 char } {'mud' } 1.874 0.619 1.624 0.51077 {'A'} 12.8 {0×0 char } {'interbedded silt(stone) and mud(stone)'} 1.831 0.556 1.254 0.45177 {'A'} 16.7 {0×0 char } {'interbedded silt(stone) and mud(stone)'} 1.885 0.528 1.118 0.40191 {'A'} 19.7 {0×0 char } {'interbedded silt(stone) and mud(stone)'} 1.842 0.538 1.165 0.42694 {'A'} 22.71 {0×0 char } {'interbedded silt(stone) and mud(stone)'} 2.141 0.289 0.406 0.16019 {'A'} 26.22 {0×0 char } {'mud' } 1.877 0.526 1.108 0.40208 {'A'} 28.83 {0×0 char } {'mud' } 1.858 0.539 1.169 0.42249 {'A'} 32 {0×0 char } {'mud' } 1.923 0.506 1.023 0.36854 {'A'} 35.74 {0×0 char } {'sand' } 2.102 0.57 1.328 0.38485 {'A'} 38.93 {0×0 char } {'mud' } 2.031 0.554 1.243 0.38767 {'A'} 41.3 {0×0 char } {'mud' } 1.959 0.571 1.329 0.4252 {'A'} 45.13 {0×0 char } {'interbedded silt(stone) and mud(stone)'} 1.949 0.486 0.945 0.34274
[G,group_ID] = findgroups(t{:,4})
G = 301×1
6 6 6 5 6 5 5 5 5 6
group_ID = 7×1 cell array
{'Mud' } {'clast-poor diamict' } {'diatom ooze' } {'interbedded mud(stone) and diamict' } {'interbedded silt(stone) and mud(stone)'} {'mud' } {'sand' }
% make a cell array of tables, one for each group:
n_groups = numel(group_ID);
new_t = cell(1,n_groups);
for ii = 1:n_groups
new_t{ii} = t(G == ii,:);
end
% look at the table for group 1:
new_t{1}
ans = 3×8 table
Hole Depth_mbsf_ MAJLith_Prefix MAJLith_PrincipalName BulkDensity_g_cm__ Porosity_v_v_ VoidRatio waterContent _____ ___________ ______________ _____________________ __________________ _____________ _________ ____________ {'C'} 210.6 {0×0 char} {'Mud'} 2.016 0.44 0.787 0.28818 {'C'} 213.6 {0×0 char} {'Mud'} 1.961 0.465 0.869 0.32043 {'C'} 216.6 {0×0 char} {'Mud'} 2.019 0.449 0.816 0.29518
% look at the table for group 2:
new_t{2}
ans = 14×8 table
Hole Depth_mbsf_ MAJLith_Prefix MAJLith_PrincipalName BulkDensity_g_cm__ Porosity_v_v_ VoidRatio waterContent _____ ___________ ______________ ______________________ __________________ _____________ _________ ____________ {'D'} 267.63 {'muddy'} {'clast-poor diamict'} 2.038 0.416 0.714 0.26456 {'D'} 270.4 {'muddy'} {'clast-poor diamict'} 2.034 0.435 0.771 0.28053 {'D'} 288.58 {'muddy'} {'clast-poor diamict'} 1.997 0.435 0.771 0.28734 {'F'} 289.74 {'muddy'} {'clast-poor diamict'} 2.029 0.439 0.782 0.2847 {'F'} 292.36 {'muddy'} {'clast-poor diamict'} 2 0.465 0.869 0.31248 {'F'} 295.66 {'muddy'} {'clast-poor diamict'} 2.029 0.451 0.822 0.29481 {'F'} 299.8 {'muddy'} {'clast-poor diamict'} 1.975 0.448 0.813 0.3029 {'F'} 302.67 {'muddy'} {'clast-poor diamict'} 2.045 0.447 0.807 0.28795 {'F'} 305.15 {'muddy'} {'clast-poor diamict'} 2.022 0.448 0.811 0.29325 {'F'} 309.47 {'muddy'} {'clast-poor diamict'} 2.02 0.448 0.811 0.29372 {'F'} 386.85 {'muddy'} {'clast-poor diamict'} 1.942 0.427 0.747 0.29095 {'F'} 387 {'muddy'} {'clast-poor diamict'} 1.99 0.46 0.853 0.31029 {'F'} 393.07 {'muddy'} {'clast-poor diamict'} 2.042 0.44 0.785 0.28281 {'F'} 678.04 {'muddy'} {'clast-poor diamict'} 2.068 0.4 0.668 0.24726
  4 Comments
Jacob Allen
Jacob Allen on 27 Mar 2022
Edited: Jacob Allen on 27 Mar 2022
So, the code works fine when I input it but why do only 3 'Mud' rows appear when in the actual excel there are much more than 3? I would need all of them listed out in a new table. Also, is there a way I could create a seperate table for each when I just run the code once?
Voss
Voss on 27 Mar 2022
'Mud' and 'mud' are diferent is why. It may be easiest to change the three 'Mud' to 'mud' in the Excel file, to be consistent with the others.
This code does already create a separate table for each group. All tables are in the cell array new_t

Sign in to comment.

More Answers (0)

Categories

Find more on Data Import from MATLAB in Help Center and File Exchange

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!