Problem 42485. Eliminate Outliers Using Interquartile Range
Given a vector with your "data", find the outliers and remove them.
To determine whether data contains an outlier:
- Identify the point furthest from the mean of the data.
- Determine whether that point is further than 1.5*IQR away from the mean.
- If so, that point is an outlier and should be eliminated from the data, resulting in a new set of data.
- Repeat these steps on the new dataset until it no longer contains an outlier.
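IQR: The interquartile range is the difference between the median of the upper half and the median of the lower half of the sorted data: http://www.wikihow.com/Find-the-IQR
When the dataset has an odd number of points, the overall median is excluded from both halves, as in the second pass of the worked example below.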
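Because the comments on this problem mention that the toolbox function iqr may not be available on Cody, here is a minimal sketch of the IQR as defined above, using only base MATLAB. The function name iqr_by_halves is hypothetical, chosen to avoid shadowing the toolbox iqr, which may use a different quartile convention. The handling of odd lengths (dropping the middle element) is inferred from the worked example rather than stated by the problem author.

```matlab
function r = iqr_by_halves(x)
% IQR_BY_HALVES  Median of the upper half minus median of the lower half.
% Hypothetical helper name; base MATLAB only (sort, median, floor).
s = sort(x(:));            % sorted column vector
n = numel(s);
half = floor(n / 2);       % size of each half; for odd n the middle
                           % element is left out of both halves
lower_half = s(1:half);
upper_half = s(n - half + 1:end);
r = median(upper_half) - median(lower_half);
end
```

For the data used in the worked example, iqr_by_halves([53 55 51 50 60 52]) gives 4 and iqr_by_halves([53 55 51 50 52]) gives 3.5.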
To find an outlier by hand:
Data: [ 53 55 51 50 60 52 ] we will check for outliers.
Sorted: [ 50 51 52 53 55 60 ] where the mean is 53.5 and 60 is the furthest away (60-53.5 > 53.5-50).
1.5 * IQR = 1.5 * (55-51) = 6
Since 60-53.5 = 6.5 > 6, 60 is an outlier.
New Data: [ 53 55 51 50 52 ] we will check for outliers.
New Data Sorted: [ 50 51 52 53 55 ] where the mean is 52.2 and 55 is the furthest away.
1.5 * IQR = 1.5 * (54-50.5) = 5.25
Since 55-52.2 = 2.8 < 5.25, 55 is NOT an outlier.
Our original data had one outlier, which was 60.
Example:
Input data = [53 55 51 50 60 52]
Output new_data = [53 55 51 50 52]
since 60 is an outlier, it is removed
*Note: A number that is an outlier may appear more than once in the dataset. Do not remove all instances at once. Remove only the first instance, then check the new dataset to determine whether this number is still an outlier (see 5th test suite).*
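The procedure described above, including the first-instance rule in the note, can be sketched as follows. This is one possible reading of the problem statement and not a reference solution. The names remove_outliers and iqr_by_halves are hypothetical, and the tie-breaking behaviour (when two points are equally far from the mean, the one that appears first is tested) is an assumption that the problem does not address.

```matlab
function new_data = remove_outliers(data)
% REMOVE_OUTLIERS  Repeatedly drop the point furthest from the mean while
% it lies more than 1.5*IQR away from the mean. Hypothetical name.
new_data = data;
while numel(new_data) > 1
    % max returns the first index on ties, so a repeated value is
    % removed one instance at a time and then re-checked.
    [dev, idx] = max(abs(new_data - mean(new_data)));
    if dev > 1.5 * iqr_by_halves(new_data)
        new_data(idx) = [];   % remove this single instance only
    else
        break;                % furthest point is not an outlier: stop
    end
end
end

function r = iqr_by_halves(x)
% Median of upper half minus median of lower half; for odd lengths the
% middle element is excluded from both halves (assumed from the example).
s = sort(x(:));
n = numel(s);
half = floor(n / 2);
r = median(s(n - half + 1:end)) - median(s(1:half));
end
```

With the example input, remove_outliers([53 55 51 50 60 52]) returns [53 55 51 50 52]: the first pass removes 60 (6.5 > 6), and the second pass stops because 2.8 is not greater than 5.25.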
Problem Comments
Fixed the typos, thanks for noticing. I'm not sure why your code may not be working. I'd suggest checking you aren't using something like iqr which is only in the Statistics (I think?) Toolbox.
That was it. I wish that Cody would give a warning that an unsupported function was being used. Better yet, I wish the Cody computer had all the toolboxes activated.
But why, in your example, do you say that 1.5 * (54-50.5) = 4.5? Shouldn't this be 5.25?