Find locations of repeated values?

Question

Jacqueline on 15 Jul 2013


So, I have this function that takes a set of data and checks whether any value repeats for more than 300 seconds in that data set:

function FindRepetition(TruckVariableName)

setpref('Internet','SMTP_Server','lamb.corning.com');

data1 = (TruckVariableName);
x = length(TruckVariableName);
data = reshape(data1, 1, x); 
datarep = ~diff(data) & data(2:x) ~= 0; %binary data -- 1 means repeats, 0 means different, excludes repetitive zeros
%if the difference in the data at each point is zero, and if the data at
%that point isn't itself zero, return true. data(2:x) has x-1 elements, the
%same as diff(data); the dimensions must match or & cannot be used
datarepstr = num2str(datarep); %convert to string
s = regexprep(datarepstr,' ',''); %remove spaces
[startindex,runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start
l = cellfun('length',runs); %find the length of each run
y = l > 300;
if any(y) %if any run is longer than 5 minutes, display message
  %sendmail('johnsonlj2@corning.com', '2011 KENWORTH ISX15','A data fault has been detected - Prolonged data repetition');
  disp('--An error has occurred - Prolonged data repetition.');
  disp('Errors occurred at'); 
end
end

I want to find WHERE those repeated values start in that set of data. I tried disp(find(y)), but that returns positions within y (the list of runs), not positions in the original data set. Does anyone know how I can find the locations in data1 where the data repeats for more than 300 seconds?

2 Comments

Cedric on 15 Jul 2013

Edited: Cedric on 15 Jul 2013

Could you provide a sample dataset or the content of this TruckVariableName that you pass to your function?

Jacqueline on 15 Jul 2013

One of my variables is engine speed, and the data is collected for over 95,000 seconds. A chunk of the data may look like this...

1055.25000000000 777.250000000000 771.750000000000 1112.37500000000 1151.37500000000 1447 1447 1447 1447 1447 1447 1447 1447 668.625000000000 803.750000000000 850.250000000000 693.625000000000 1069.37500000000 868.500000000000 985.875000000000 1085.87500000000 1148 1065.62500000000 978.250000000000 885.750000000000 723.125000000000 638.125000000000 678.500000000000 807.500000000000 692.750000000000 814.875000000000

See how 1447 is repeated? Say that was repeating for more than 300 seconds. My script uses ~diff(data) to replace the non-repeating numbers with 0s and the repeating numbers with 1s. Then it finds where the 1s repeat for more than 300 seconds. When I use find(y), though, it finds locations, but they don't correspond to the original data set.


Answer 1

Cedric on 15 Jul 2013


Edited: Cedric on 15 Jul 2013


I think that you can use two approaches. I'll illustrate with a simple example: say we have the following data

>> data = [7 8 8 8 8 6 6 7 8 7 7 7] ;

and we want to get blocks of repeating values with at least 3 elements.

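1. Based on your REGEXP method, you would indeed look for the positions of runs of 1s that are at least a given length. Element k of rep compares data(k) with data(k+1), so a start position in rep is also the position in data where the repeated block begins.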

 >> rep = ~diff(data)                            % Add other components if needed.
 rep =
     0     1     1     1     0     1     0     0     0     1     1
 >> repStr = sprintf('%d', rep)
 repStr =
     01110100011
 >> start = regexp(repStr, '1{2,}', 'start')     % 3 similar values -> 2 
 start =                                         % repetitions.
     2    10

2. Without conversion to string and REGEXP:

 >> buffer = [true, diff(data)~=0]
 buffer =
     1     1     0     0     0     1     0     1     1     1     0     0
 >> groupStart = find(buffer)
 groupStart =
     1     2     6     8     9    10
 >> groupId = cumsum(buffer)
 groupId =
     1     2     2     2     2     3     3     4     5     6     6     6
 >> groupSize = accumarray(groupId.', ones(size(groupId))).'
 groupSize =
     1     4     2     1     1     3
 >> start = groupStart(groupSize > 2)
 start =
     2    10

EDIT: note that the 2nd method is more than 5 times faster than the 1st on large datasets.
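
As a minimal sketch, the second method can be wrapped in a function that reports where each long run starts in the original data. The name findLongRuns, the minLen argument, and the exclusion of zero-valued runs (borrowed from the data(2:x) ~= 0 test in the question) are assumptions for illustration, not part of the answer above.

```matlab
function [start, len] = findLongRuns(data, minLen)
% findLongRuns  Hypothetical helper: start indices and lengths of runs of
% identical nonzero values that contain at least minLen elements.
    data = data(:).';                          % force a row vector
    if isempty(data)
        start = [];
        len = [];
        return
    end
    isNewGroup = [true, diff(data) ~= 0];      % true where a new run begins
    groupStart = find(isNewGroup);             % index in data of each run's first element
    groupSize  = diff([groupStart, numel(data) + 1]);     % number of elements per run
    keep  = groupSize >= minLen & data(groupStart) ~= 0;  % long, nonzero runs only
    start = groupStart(keep);
    len   = groupSize(keep);
end
```

With the example above, findLongRuns(data, 3) should give start = [2 10]. For the original question, assuming one sample per second, a run of more than 300 ones in the repetition mask corresponds to at least 302 equal samples, so findLongRuns(data1, 302) would be the matching call.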

3 Comments

Cedric on 15 Jul 2013

Edited: Cedric on 15 Jul 2013


In your command window, type

doc sprintf

then, in the SPRINTF documentation, look up formatSpec, which describes all the format conversion specifiers. %d is for integer, which means that elements of rep are interpreted as integers and converted to string as such.
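
To illustrate, here is a small sketch comparing sprintf('%d', ...) with the num2str plus regexprep route used in the question. The variable names are only for illustration.

```matlab
rep = logical([0 1 1 1 0 1]);                      % example logical vector
viaSprintf = sprintf('%d', rep);                   % the format is reused for every element
viaNum2str = regexprep(num2str(rep), ' ', '');     % num2str pads with spaces, so strip them
disp(viaSprintf)                                   % string of 0s and 1s without spaces
disp(isequal(viaSprintf, viaNum2str))              % both routes should agree
```

Because sprintf applies the format to each element in turn, no separate step is needed to remove spaces.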

Jacqueline on 15 Jul 2013

Thank you!


Answer 2

Muthu Annamalai on 15 Jul 2013


Judging from the code and its comments, the variable you are looking for is startindex:

[startindex,runs] = regexp(s,'1+','start','match'); %find all runs and the point where they start

So just add this to your return value from the function, and you should be all set.
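
A sketch of how that could look, reusing the question's variable names on toy data. It assumes that position i in s compares data(i) with data(i+1), so a run that starts at position i in s begins at data(i); minRepeats and longStart are names introduced here, and the threshold is lowered from 300 to 2 for illustration.

```matlab
data = [3 5 5 5 5 2 9 9 4];                    % toy data instead of the truck signal
minRepeats = 2;                                % the question uses 300
datarep = ~diff(data) & data(2:end) ~= 0;      % 1 where data(i+1) repeats a nonzero data(i)
s = sprintf('%d', datarep);                    % string of 0s and 1s, no spaces
[startindex, runs] = regexp(s, '1+', 'start', 'match');
l = cellfun('length', runs);                   % length of each run of 1s
longStart = startindex(l > minRepeats);        % indices into data where long runs begin
disp(longStart)
```

Filtering startindex with the same logical mask that is passed to any(...) keeps only the runs that exceed the threshold.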

1 Comment

Jacqueline on 15 Jul 2013

That finds the starting points of runs of one or more 1s in a string of 1s and 0s. The length of that string is different from the length of my original data, which is where I need to find the locations of the repeating values.

