How to find the rows in a dataset directly before a sequence of NaNs?

Hey people. I have a huge dataset which looks as follows:
x y z p
2 4 1 20
2 4 2 1
2 4 3 NaN
2 4 4 NaN
2 4 5 NaN
3 5 1 5
3 5 2 1
3 5 3 52
3 5 4 22
3 5 5 NaN
6 3 1 10
6 3 2 4
6 3 3 1
6 3 4 NaN
6 3 5 NaN
(...)
As you can see, the dataset is sorted by the first three columns. I would like to extract only the rows that are directly above a NaN. I found a way to remove all the rows containing a NaN by using an index:
p(isnan(p))= -999;
index = 1:size(p,1);
index = index(p > -999);
x = x(index);
y = y(index);
z = z(index);
p = p(index);
But I don't really have a clue how to remove the remaining rows. Does anyone have an idea or hint?
So the solution for the example would be:
2 4 2 1
3 5 4 22
6 3 3 1
Thanks in advance! P

Accepted Answer

the cyclist on 5 Feb 2016
Edited: the cyclist on 5 Feb 2016
% Find rows that are not NaN themselves, but are followed by a NaN row.
% (Not sure how you wanted to handle the final row. This code keeps it, if it is not NaN.)
indexToKeep = ~isnan(p) & isnan([p(2:end); NaN]);
x = x(indexToKeep)
y = y(indexToKeep)
z = z(indexToKeep)
p = p(indexToKeep)
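
As a quick check, the same mask can be applied to the sample rows from the question. This is only a minimal sketch: the names `data`, `nextIsNaN`, and `result` are introduced here for illustration and are not part of the original answer, and it assumes the four columns are stored together in one numeric matrix.

```matlab
% Sample data from the question, columns: x y z p
data = [2 4 1 20
        2 4 2 1
        2 4 3 NaN
        2 4 4 NaN
        2 4 5 NaN
        3 5 1 5
        3 5 2 1
        3 5 3 52
        3 5 4 22
        3 5 5 NaN
        6 3 1 10
        6 3 2 4
        6 3 3 1
        6 3 4 NaN
        6 3 5 NaN];
p = data(:,4);

% nextIsNaN(i) is true when p(i+1) is NaN. The appended NaN makes the
% final row count as "followed by a NaN" (an assumed edge-case choice).
nextIsNaN = isnan([p(2:end); NaN]);

% Keep rows that hold a number and whose successor is NaN.
indexToKeep = ~isnan(p) & nextIsNaN;

result = data(indexToKeep,:);
disp(result)
```

For this sample the mask selects rows 2, 9, and 13, which are the three rows listed as the desired solution in the question.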

More Answers (1)

Joseph Cheng on 5 Feb 2016
Edited: Joseph Cheng on 5 Feb 2016
You can do something like this:
sampdata = [2 4 1 20
2 4 2 1
2 4 3 NaN
2 4 4 NaN
2 4 5 NaN
3 5 1 5
3 5 2 1
3 5 3 52
3 5 4 22
3 5 5 NaN
6 3 1 10
6 3 2 4
6 3 3 1
6 3 4 NaN
6 3 5 NaN];
pmask = isnan(sampdata(:,4));
dpmask = diff(pmask);
% dpmask ==  1 marks a number-to-NaN transition (the row just before a NaN run)
% dpmask == -1 marks a NaN-to-number transition
% dpmask ==  0 means no change between consecutive rows
outputdata = sampdata(dpmask==1,:)
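
One difference from the accepted answer is worth noting: `diff` returns one element fewer than its input, so this approach can never select the final row, whereas the padded version keeps the final row when it holds a number. The sketch below illustrates this on a short, made-up column `p` that ends in a number (the data and the names `rowsDiff` and `rowsPad` are hypothetical).

```matlab
% Hypothetical column that ends in a number rather than a NaN
p = [20; 1; NaN; NaN; 5; 7];

pmask  = isnan(p);
dpmask = diff(pmask);        % numel(p) - 1 elements: [0; 1; 0; -1; 0]

% diff-based selection: rows where a number is followed by a NaN
rowsDiff = find(dpmask == 1);

% padded selection from the accepted answer: also keeps a numeric last row
rowsPad = find(~pmask & isnan([p(2:end); NaN]));

disp(rowsDiff)
disp(rowsPad)
```

Here `rowsDiff` contains only row 2, while `rowsPad` contains rows 2 and 6. Which behavior is preferable depends on whether a trailing numeric row should count as "directly before NaNs" in your data.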
