Setting up Quality Control on dataset
I am trying to write a quality control program for a large dataset. I need to check for three conditions: 1) whether values fall outside a certain range (e.g. 0-1000), 2) whether there is too large a jump between consecutive values (e.g. >250), and 3) whether the values do not change within a certain time frame (e.g. 4 hours; each row will be timestamped).
I have absolutely no idea how to write this, except for a vague idea of writing a loop with a list of nested if statements to create a second array containing 0 (no errors), 1, 2 or 3. But I don't know how to compare consecutive values or how to check blocks of data for any change.
I'm not asking for someone to write my code for me, but does anybody have any ideas?
Walter Roberson on 14 Feb 2011
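out_of_range = V < 0 | V > 1000;        % condition 1: value outside the limits
diffs = [diff(V) 0];                    % V is assumed to be a row vector; padded to the length of V
bigjumps = abs(diffs) > 250;            % condition 2: flags the sample just before a large jump
ldiff = [0, diff(V) == 0, 0];           % 1 where a sample equals the previous one, zero-padded at both ends
runstarts = strfind(ldiff, [0 1]);      % index in V of the first sample of each run of identical values
runends = strfind(ldiff, [1 0]);        % index in V of the last sample of each run
runlengths = runends - runstarts + 1;   % number of samples in each run
% condition 3: compare against the number of samples that spans the allowed time frame
if any(runlengths > ...)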
Note that the code for detecting runs of identical values would have to be adjusted if the samples are not evenly spaced, because the number of samples in a run would then no longer translate directly into the duration of the run.
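For unevenly spaced samples, one possible adjustment is to measure each run by its timestamps instead of its sample count. The sketch below is only an illustration: the timestamp vector t (in hours), the limit max_hours, and the sample data are all made-up assumptions, and it locates the runs with diff and find rather than strfind, which gives the same run boundaries.

```matlab
% Hypothetical example data: t holds timestamps in hours (unevenly spaced),
% V holds the readings. Both are assumed to be row vectors of equal length.
t = [0 1 2 2.5 4 7 8 9];
V = [5 5 5 5   5 5 6 6];
max_hours = 4;                          % assumed limit on how long a value may stay unchanged

same  = [0, diff(V) == 0, 0];           % 1 where a sample equals the previous one
edges = diff(same);                     % +1 at the start of a run, -1 just past its end
runstarts = find(edges == 1);           % index in V of the first sample of each run
runends   = find(edges == -1);          % index in V of the last sample of each run
durations = t(runends) - t(runstarts);  % elapsed time covered by each run

stuck = false(size(V));
for k = find(durations >= max_hours)
    stuck(runstarts(k):runends(k)) = true;   % flag every sample in an over-long run
end
```

Here stuck is a logical vector the same size as V, so it can be combined with the other two checks when building a single array of error codes.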
You also need to take into account the possibility of noise in the sampling system, so you might want to count a difference smaller than some tolerance as being "the same value".
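A minimal sketch of the tolerance idea, assuming a row vector V and a hypothetical noise tolerance tol (both values below are invented for illustration). Consecutive samples are treated as unchanged when they differ by no more than tol, and the runs are then located with diff and find:

```matlab
% Hypothetical noisy readings; tol is an assumed noise tolerance.
V   = [10.0 10.1 9.9 10.05 15.0 15.0 20.0];
tol = 0.5;

% Treat consecutive samples as "the same" if they differ by no more than tol.
same  = [0, abs(diff(V)) <= tol, 0];    % zero-padded so runs at either end are closed
edges = diff(same);                     % +1 at the start of a run, -1 just past its end
runstarts  = find(edges == 1);          % index in V of the first sample of each run
runends    = find(edges == -1);         % index in V of the last sample of each run
runlengths = runends - runstarts + 1;   % number of samples in each run
```

Because this compares each sample only with its immediate predecessor, a slow drift in which every step is smaller than tol would still count as one run. If that matters for your data, comparing each sample with the first value of the run may be a better choice.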