How do I find rows based on 2 column variables, then only retain the previous 3 rows?

5 views (last 30 days)
After conducting google searches and looking in some manuals I have come up with the idea that this probably has something to do with indexing, but assume I am the newbie I am, and assume I prefer for-loop code like:
-
for subj=1:length(subjects) % loop through subjects
% This specifies the path where the current subject's data is located
data_path=[home_path subjects{subj} '\'];
% Load subject data file
Filename=[home_path subjects{subj} '\' subjects{subj} '_' 'attnresp_only' '.txt'];
% Visually verify table
B=readtable([home_path subjects{subj} '\' subjects{subj} '_' 'attnresp_only' '.txt'],'ReadVariableNames',true);
% Remove rows with no response
B.response=string(B.response);
B=B(~contains(B.response,'0'),:);
% Rows to retain for average response time estimates ****THIS IS WHERE I AM****
RowsToRetain=...;
% Save table
writetable(RowsToRetain,[home_path subjects{subj} '\' subjects{subj} '_' 'attnresp_only_incorrect' '.txt'],'Delimiter','\t','WriteVariableNames',true);
end
-
Assume there are anywhere from 5 to 8 column variables in my array with up to 800 rows. I need to locate rows based on 2 column varibales being true, then only keep the previous 3 rows. For example, if row n meets the criteria, I need to keep rows n-1, n-2, and n-3 for every row n. Therefore, I assume the code is something like:
-
% Rows to retain for average response time estimates
RowsToRetain=B(contains(B.condition, 'infrequent' .... and some other stuff that incorperates B.correct,'0'
-
Again, I am extremely inexperienced, and tend to write ugly and slow code that uses brute force. There might not be a way to do what I am trying to do using what has worked for me before, and I understand that. This is an issue I have been beating my head against a desk at for quite some time, and my advisor is getting less happy with me by the day. Note, this is not the only time I need to do this type of search, so generic responses with arbitrary table names or variables within a table are quite welcome. To summarize, Me=Caveman and You=SpaceMan. Can You help Me do Me things?
-
I appologize if this seems easy, but for me, it is not.

Accepted Answer

dpb
dpb on 23 Jan 2023
Edited: dpb on 24 Jan 2023
% Rows to retain for average response time estimates
iKeep=B(contains(B.condition, 'infrequent') & B.correct==0);
will be your indexing vector (logical vector) of the rows that meet the joint condition.
To keep those and the neighbors precding, it would be possible to write an indexing expression that incorporates all the desired locations as a single vector, but the brute-force way would be to iterate over the returned rows...to make that a little more convenient, convert the logical vector to the indices--here's a place where find() does have its place...
iKeep=find(B(contains(B.condition, 'infrequent') & B.correct==0)); % return indices that satisfy
tOut=[];
for i=1:numel(iKeep)
tOut=[tOut;tOut(max(1,iKeep(i)-3):iKeep(i),:)];
end
The above keeps row n as well; wasn't clear for sure whether it is supposed to be in/out of the result. If don't want it as well, then you can subtract one from array iKeep and one less from the precount selection. NB: the test to make sure don't try to go back before the beginning of the file; you can deal with that with special case if it's not allowed to have a set less than the full number.
As far as the generic case, the above uses the specific variables you mentioned; read the documentation section on addressing data in tables that explains fully how to use variables or indexing to select variables by any combination of indices, names, type, ... and you can let the user select which variables are of interest and what the conditions are to test for dynamically.
BTW, if you have data such as your condition variable above that is a list of possible conditions, seriously consider converting those from cellstr() or string() to categorical.
For such tasks as these, undoubtedly you should also be reading up on findgroups and groupsummary and friends...if your task is to provide such results from a dataset by conditions, these should be well-known traveling companions, as well as rowfun
  1 Comment
Samuel Birkholz
Samuel Birkholz on 24 Jan 2023
Thank you dpb. For some reason everyone I come across with that tag is extremely helpful. I did some quick checks after running your origional code, starting with the smaller steps and then combining them. I may have added some confusion when I said "array" instead of "table", which shows my inexperience and lack of appropriate language use. This is my result:
-
% This part will create a table with the infrequent misses and up to 3 previous trials
% Select rows to keep
iKeep=find(contains(B.condition, 'infrequent') & B.correct==0); % return indices that satisfy
tOut=[];
for i=1:numel(iKeep)
tOut=[tOut,max(1,iKeep(i)-3):iKeep(i)];
end
infreqMiss=B(tOut,:);
-
FIREWORKS! This does almost exactly what I needed. However I ended up with duplicates, which I removed using:
-
% Remove duplicates
infreqMiss=unique(infreqMiss);
-
From here I was able to create and save 2 tables per participant with EXACTLY what I need. I can now move on to the next part of my data analysis plan! You have helped this Caveman learn to use a tool more effectively and efficiently. Much appreciated!

Sign in to comment.

More Answers (0)

Categories

Find more on Particle & Nuclear Physics in Help Center and File Exchange

Products


Release

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!