Data Analysis advance filtering
2 views (last 30 days)
Show older comments
I have table which has 200 variables. 4 different types. A1,A2...A50,B1,B2...B50,C1,C2....C50,D1,D2.....D50. "THESE ARE THE COLUMN NAMES." I want to apply multiple filters to multiple columns.
Like
(( B1,B2... 50 > 3) & (A1, A2....A50 > 0.1) & (C1,C2....C50 < 1))
And I want to remove each row which does not fit the criteria from the table.
Please help.
0 Comments
Answers (2)
Bob Thompson
on 19 Feb 2019
You can do this with logic indexing.
data = ... % Your main data block
conditions = data(:,1)>0.1 & data(:,2)>3 & data(:,3)<1;
data2 = dat(conditions,:);
4 Comments
Bob Thompson
on 19 Feb 2019
Edited: Bob Thompson
on 19 Feb 2019
I'm a bit confused on what you mean by 'columns.' Do you have multiple T variables? You're T1 variable already contains multiple rows and columns which are being filtered, so I'm going to assume you're referring to different tables, rather than different columns within a table. If this is incorrect, please feel free to say so and explain where I am wrong.
If your data is in multiple variables, is it possible to put it into a larger matrix? Possibly using a cell matrix, where each cell is a variable? This will make it possible to index through the different variables, and then you are able to use a for loop to filter all of the data.
T = {T1,T3};
for i = 1:size(T,2)
T2{i} = T{i}(T{i}{:,2}>3 & T{i}{:,3}<0.1,:);
end
EDIT**
If you do actually mean 'columns' of a single variable, and all of your data is just doubles, I would strongly advise turning your cell array into a regular matrix.
T1 = cell2mat(T1);
This will make the logical operations much easier, and you will be less likely to run into issues with concatonating that may occur with cells.
I think that you will be better off keeping your logic checks as single rows, and using more checks for a single statement. This is my personal opinion, but I think it is better because while the statements themselves do get longer and more convoluted it is easier to make appropriate adjustments to columns or conditions to be met.
T2 = T1(T1(:,7)>3 & T1(:,8)>3 & T1(:,2)<0.1 & T1(:,3)<0.1,:);
Sean de Wolski
on 19 Feb 2019
Edited: Sean de Wolski
on 19 Feb 2019
Tested in R2018b. May need to convert strings to cellstrs in older releases.
%% Sample data
Data = array2table(rand(2,200));
names = ["A";"B";"C";"D"]+(1:50);
Data.Properties.VariableNames = names(:);
Thresh = array2table(rand(1,4), 'VariableNames',["A"; "B"; "C"; "D"]);
%% engine
rix = false(height(Data), 1); % Remove rows
for ii = 1:width(Thresh)
% Corresponding variables
vars = startsWith(Data.Properties.VariableNames, Thresh.Properties.VariableNames{ii});
% previously set to remove or matches criteria here.
rix = rix | any(Data{:, vars} > Data{1, ii}, 2);
end
Data(rix,:) = [] %Remove them
Here I use > for every operation. You could use an operation function handle as well if necessary. Store the operations in a cell array {@le, @ge, @lt} and then use the corresponding one on that iteration of the loop.
0 Comments
See Also
Categories
Find more on Data Preprocessing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!