How can I delete a row based on the value of the previous row?

I have a file containing one column and a large number of rows. I want to delete any row whose value is less than 2 greater than the previous row (i.e., if row 1 is 10 and row 2 is 11, then delete row 2; if row 1 is 10 and row 2 is 12.1, then don't delete it). How can I do this? I know how to delete specified rows with something like
rowstodelete = xxxxxx
A(rowstodelete,:) = [ ];
But I don't know how to figure out which rows need to be deleted (i.e., what I need to put in for xxxxxx). I was thinking maybe of using logical indexing or if-else statements, but I'm not really sure how. Open to other options too! Thanks in advance!
  2 Comments
Walter Roberson on 31 Jul 2022
Suppose you have three rows, with values 10, 11.5, 13.3 . The second row is not at least 2 greater than the first. The third row is not at least 2 greater than the second. Do I understand correctly that you would delete the second and the third row both?
Or should the deletion in a sense be carried out "immediately", so that after the 11.5 is deleted, the 13.3 would be compared to the now-previous 10 ?
Kellie Wilson on 31 Jul 2022
Good question - I want the latter. So I would want to compare the 13.3 to the now previous 10 and thus keep it.


Answers (5)

Bruno Luong on 31 Jul 2022
Edited: Bruno Luong on 31 Jul 2022
This avoids deletion, which, like growing an array, would kill performance.
A = [10, 11.5, 13.3, 13.5, 13.7, 17];
keep = false(size(A));
b = -Inf;                 % smallest value the next kept element may have
for i = 1:length(A)
    if A(i) >= b
        keep(i) = true;
        b = A(i) + 2;     % the next kept element must be at least 2 greater
    end
end
A = A(keep)
A = 1×3
10.0000 13.3000 17.0000
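For the one-column, many-rows layout described in the question, the same idea can produce the logical index that goes into A(rowstodelete,:) = []. The following is only a minimal sketch and not part of the answer above: the names threshold and lastKept are illustrative assumptions, and it assumes A is a non-empty column vector.

```matlab
A = [10; 11.5; 13.3; 13.5; 13.7; 17];   % example column vector
threshold = 2;                           % required increase (assumed name)

rowstodelete = false(size(A, 1), 1);     % preallocated logical index
lastKept = A(1);                         % the first row is always kept
for i = 2:size(A, 1)
    if A(i) - lastKept < threshold
        rowstodelete(i) = true;          % too close to the last kept row
    else
        lastKept = A(i);                 % this row becomes the new reference
    end
end

A(rowstodelete, :) = [];                 % A is now [10; 13.3; 17]
```

Each row is compared with the last row that was kept rather than with its immediate neighbour, which is the behaviour asked for in the comments on the question.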

Matt J on 31 Jul 2022
rowstodelete = diff([-inf;A])<2;   % A is a column vector; prepending -inf means the first row is never flagged
  1 Comment
Kellie Wilson on 31 Jul 2022
This works well for the first scenario that Walter laid out, where if I had 3 rows: 10, 11.5, 13.3, then it would delete 11.5 and 13.3. But I just want to delete the 11.5, and then keep 13.3 since it would be more than 2 away from the 10 that is now right before it.
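To make the difference concrete, here is a small sketch (not from the thread) that applies the pairwise diff test to those three rows. It prepends -inf so that the first row is never flagged, and it assumes A is a column vector; the variable B is an illustrative name.

```matlab
A = [10; 11.5; 13.3];

% Compare every row with its immediate predecessor in the ORIGINAL data
rowstodelete = diff([-inf; A]) < 2;   % logical [0; 1; 1]

B = A(~rowstodelete);                 % only 10 survives
```

Because 13.3 is compared with 11.5 rather than with 10, it is flagged as well. The sequential behaviour needs a running "last kept value", which a single diff call cannot express.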



Walter Roberson on 31 Jul 2022
Edited: Walter Roberson on 31 Jul 2022
Not a full algorithm, but something to think about:
rng(1234)
data = randi(99, 1, 20) %row vector
data = 1×20
19 62 44 78 78 27 28 80 95 87 36 50 68 71 37 56 50 2 77 88
G = tril(data.'-2 >= data)
G = 20×20 logical array
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 0
pos = sum(cumprod(~G,1),1)+1
pos = 1×20
2 4 4 8 8 8 8 9 21 21 12 13 14 19 16 19 19 19 20 21
Find the first 1 in the first column of G; it is at row 2. Entry 2 is the first entry in data that is at least 2 greater than entry 1. Cross-check: 62 >= 19+2. Let C = 2 (column 2). pos(1) = 2 is the row of that first 1.
Look in column C (#2) for the first 1; it is at row 4. Entry 4 is the first entry in data that is at least 2 greater than entry 2. Cross-check: 78 >= 62+2. Let C = 4 (column 4). pos(2) = 4 is the row of that first 1.
Look in column C (#4) for the first 1; it is at row 8. Entry 8 is the first entry in data that is at least 2 greater than entry 4. Cross-check: 80 >= 78+2. Let C = 8 (column 8). pos(4) = 8 is the row of that first 1.
Look in column C (#8) for the first 1; it is at row 9, so pos(8) = 9. Entry 9 is the first entry in data that is at least 2 greater than entry 8. Cross-check: 95 >= 80+2.
And so on. The G matrix holds the details; the pos vector is the condensed information, the positions you need. So you can just read the positions out of pos: pos(1) gives the next index to look at in pos, pos() at that index gives the next one, and so on.
When you get an index that is 1 greater than the number of elements in data, you have reached the end: there are no more positions that satisfy the condition.
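The chain-following step described above could be sketched as follows. This is only an illustration of how pos might be read out, not a complete or tuned algorithm: the names n, keep, k, and kept are assumptions, the data vector is typed in from the output shown earlier, and data.' - 2 >= data relies on implicit expansion (R2016b or later). Note that G is n-by-n, so memory grows quadratically with the number of rows.

```matlab
data = [19 62 44 78 78 27 28 80 95 87 36 50 68 71 37 56 50 2 77 88];

G   = tril(data.' - 2 >= data);          % G(r,c): entry r >= entry c + 2, r >= c
pos = sum(cumprod(~G, 1), 1) + 1;        % row of the first 1 in each column

n    = numel(data);
keep = false(1, n);
k    = 1;                                % the first entry is always kept
while k <= n                             % pos(k) == n+1 signals the end of the chain
    keep(k) = true;
    k = pos(k);                          % jump to the next entry to keep
end

kept = data(keep)                        % 19 62 78 80 95
```

The loop visits only the kept entries (indices 1, 2, 4, 8, 9 here), so its length is the number of surviving rows rather than the number of rows in data.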

Matt J on 31 Jul 2022
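It'll have to be done with a loop:
A = [10, 11.5, 13.3, 13.5, 13.7, 17];
i = 2;
while i <= numel(A)
    if A(i) < A(i-1) + 2
        A(i) = [];      % delete; the next element shifts into position i
    else
        i = i + 1;      % keep; move on to the next element
    end
end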
A
A = 1×3
10.0000 13.3000 17.0000

Bruno Luong on 31 Jul 2022
Edited: Bruno Luong on 31 Jul 2022
If your data is sorted you could try this
A = 1:0.1:10
A = 1×91
1.0000 1.1000 1.2000 1.3000 1.4000 1.5000 1.6000 1.7000 1.8000 1.9000 2.0000 2.1000 2.2000 2.3000 2.4000 2.5000 2.6000 2.7000 2.8000 2.9000 3.0000 3.1000 3.2000 3.3000 3.4000 3.5000 3.6000 3.7000 3.8000 3.9000
A = uniquetol(A,2*(1-eps),'DataScale',1)
A = 1×5
1 3 5 7 9

Products


Release

R2020a
