This question is closed. Reopen it to edit or answer.

Filtered variable levels in a table reappearing after calling ANOVAN

I'm seeing some odd behavior where levels from a large table are carried over to a smaller table, even though those data should have been filtered out. In my data set "ShearBondData.mat" there are several levels under 'Project' and 'Task_AsphaltType'. I filter out most of these by keeping only the rows where 'Test1'=="x" and then run an ANOVA on that subset. However, when I check the coefficient names in the stats output (t1_stats.coeffnames), I see ALL the levels that were in the ShearBondData.mat file, as if they had never been filtered out.
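load('ShearBondData.mat','-mat');
% Keep only the rows where Test1 is "x"
rows = ShearBondData.Test1 == "x";
t1_data = ShearBondData(rows,:);
% Response and grouping variables taken from the filtered table
y = t1_data.BondStrengthpsi;
g1 = t1_data.Project;
g2 = t1_data.AsphaltType;
names = {'Sample_source','AsphaltType'};
% Two-way ANOVA on the subset
[t1_p,t1_tbl,t1_stats] = anovan(y,{g1,g2},'varnames',names);
% Lists every level from the full data set, not just the filtered ones
t1_stats.coeffnames
%[t1_c,t1_m,t1_h,t1_gnames] = multcompare(t1_stats);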
I tried to replicate the same thing with the dummy data below, but it works as expected. I keep only the "Cool" vehicles, which excludes all vans. In the ANOVA output, coeffnames then shows only 'Car', 'Truck', 'Red', and 'White'. If it were behaving like my code above, 'Van' would also be listed. Any idea what's going on?
Vehicle = {'Car';'Car';'Car';'Car';...
'Truck';'Truck';'Truck';'Truck';...
'Van';'Van';'Van';'Van'};
IsCool = [1;1;1;1;1;1;1;1;0;0;0;0];
Color = {'Red';'Red';'White';'White';...
'Red';'Red';'White';'White';...
'Red';'Red';'White';'White'};
MPG = [25;35;29;30;15;11;12;10;13;9;20;15];
T = table(Vehicle,IsCool,Color,MPG);
rows = T.IsCool == 1;
t1 = T(rows,:);
y = t1.MPG;
g1 = t1.Vehicle;
g2 = t1.Color;
[p,tbl,stats]=anovan(y,{g1,g2},'varnames',{'Vehicle','Color'});
stats.coeffnames
  3 Comments
Bryan Wilson on 29 May 2018
Yes, t1_data is definitely a subset of "ShearBondData.mat". In t1_data, there are 2 levels under the 'Sample_source' field and 6 levels under 'AsphaltType'. But the ANOVA output shows 5 and 12 levels, respectively, for those fields.
Bryan Wilson on 30 May 2018
Edited: Bryan Wilson on 30 May 2018
%Convert from CATEGORICAL type to STRING type
g1 = string(t1_data.Project);
g2 = string(t1_data.AsphaltType);
This solves my problem, but it doesn't explain what was causing the odd behavior before. And I still can't replicate the problem with the dummy data.
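A possible explanation, offered as a sketch rather than a confirmed diagnosis: a categorical array keeps its full list of categories when it is indexed, so a row-filtered subset can still carry categories that no longer have any elements. The dummy data uses cell arrays of character vectors, which have no such stored list, and that may be why it does not reproduce the problem. The snippet below uses a made-up categorical Vehicle variable (the shortened data and the name g1_clean are illustrative, not from the original data set) to show the leftover category and how removecats drops it. Whether this fully accounts for the anovan output on ShearBondData.mat is an assumption, since that file is not available here.

```matlab
% Illustrative dummy data with a CATEGORICAL grouping variable
% (hypothetical; the original dummy example used cell arrays instead)
Vehicle = categorical({'Car';'Car';'Truck';'Truck';'Van';'Van'});
IsCool  = [1;1;1;1;0;0];
T = table(Vehicle,IsCool);

% Filter out the vans, as in the original dummy example
t1 = T(T.IsCool == 1,:);
g1 = t1.Vehicle;

% The subset has no 'Van' rows, but 'Van' is still a defined category
cats_before = categories(g1);

% removecats with one argument drops categories that have no elements
g1_clean = removecats(g1);
cats_after = categories(g1_clean);

disp(cats_before)
disp(cats_after)
```

If this is indeed the cause, applying removecats to t1_data.Project and t1_data.AsphaltType before calling anovan would be an alternative to converting them with string.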

Answers (0)
