# Setting up Quality Control on dataset

Sara on 14 Feb 2011
Hi,
I am trying to write a quality control program for a large dataset. I need to check for three conditions: 1) values outside a certain limit (e.g. 0-1000), 2) too large a jump between consecutive values (e.g. >250), and 3) no change in value within a certain time frame (e.g. 4 hours; each row will be timestamped).
I have absolutely no idea how to write this, except for a vague idea of writing a loop with a list of nested if statements to create a second array of 0 (no errors), 1, 2 or 3. But I don't know how to compare subsequent values or check blocks of data for any change.
I'm not asking for someone to write my code for me, but does anybody have any ideas?
Thanks,
S.

Walter Roberson on 14 Feb 2011
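% V is assumed to be a row vector of the measured values
out_of_range = V < 0 | V > 1000;
if any(out_of_range)
....
end
diffs = [diff(V) 0];   % padded so that diffs has the same length as V
bigjumps = abs(diffs) > 250;
if any(bigjumps)
...
end
% mark zero differences, padded with 0 so every run has a start and an end
ldiff = [0, diff(V) == 0, 0];
runstarts = strfind(ldiff, [0 1]);   % index in V of the first sample of each run
runends = strfind(ldiff, [1 0]);     % index in V of the last sample of each run
runlengths = runends - runstarts;    % number of unchanged steps in each run
if any(runlengths > ...)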
Note that the code for detecting runs of identical values would have to be adjusted if the sampling is not at even intervals, because the number of samples in a run would then not translate directly into the duration of the run.
You also need to take into account the possibility of noise in the sampling system, so you might want to count a difference smaller than some tolerance as being "the same value".
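
Both adjustments can be sketched together. This is only a rough sketch under assumptions that are not in the question: the timestamps are held in a row vector `t` measured in hours, the values in a row vector `V`, and the names `tol`, `maxFlat` and `stuck`, as well as the sample data, are made up for illustration.

```matlab
% Hypothetical example data: t in hours (unevenly spaced), V the measurements
t = [0 1 2 2.5 4 6.5 7 8];
V = [10 10.2 9.9 10.1 10 10.3 50 80];
tol = 0.5;       % assumed noise tolerance: smaller steps count as "no change"
maxFlat = 4;     % assumed limit in hours (the example figure from the question)

same = abs(diff(V)) <= tol;             % same(i): step from V(i) to V(i+1) is within tol
edges = diff([0, same, 0]);             % +1 where a flat run starts, -1 where it ends
runstarts = find(edges == 1);           % index in V of the first sample of each run
runends = find(edges == -1);            % index in V of the last sample of each run
durations = t(runends) - t(runstarts);  % elapsed time of each run, not sample count

stuck = false(size(V));                 % true for samples inside a too-long flat run
for k = find(durations >= maxFlat)
    stuck(runstarts(k):runends(k)) = true;
end
```

Because the tolerance is applied to consecutive differences, a slow drift made of many small steps would still count as "no change"; comparing each sample against the first value of its run would be a stricter alternative.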

Paulo Silva on 14 Feb 2011
doc findpeaks
doc diff
So you can find values above certain limits with findpeaks, then build a second array with the diff function and use findpeaks on that one as well.
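
For the second array of codes described in the question, logical masks built with diff can stand in for the loop and nested if statements. The following is a minimal sketch, assuming `V` is a row vector and using made-up sample data; the name `flag` and the choice to let code 1 take priority over code 2 are assumptions, not something stated in the thread.

```matlab
% Hypothetical example data; V is a row vector of measurements
V = [100 120 1500 130 135 600 610 -5];

flag = zeros(size(V));                 % 0 = no error
jump = [false, abs(diff(V)) > 250];    % jump(i): step from V(i-1) to V(i) is too large
flag(jump) = 2;                        % 2 = jump larger than 250
flag(V < 0 | V > 1000) = 1;            % 1 = outside 0-1000 (overwrites a 2)
```

Note that the sample following a spike is also marked with code 2, since the return to normal is itself a large jump. Code 3 could be assigned in the same way from a logical mask of samples that belong to an overly long run of unchanged values.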
