Fix Python – Is there a numpy builtin to reject outliers from a list

Question

Asked By – aaren

Is there a numpy builtin to do something like the following? That is, take a list d and return a list filtered_d with any outlying elements removed based on some assumed distribution of the points in d.

import numpy as np

def reject_outliers(data):
    m = 2
    u = np.mean(data)
    s = np.std(data)
    filtered = [e for e in data if (u - 2 * s < e < u + 2 * s)]
    return filtered

>>> d = [2,4,5,1,6,5,40]
>>> filtered_d = reject_outliers(d)
>>> print filtered_d
[2,4,5,1,6,5]

I say ‘something like’ because the function might allow for varying distributions (poisson, gaussian, etc.) and varying outlier thresholds within those distributions (like the m I’ve used here).

Now we will see solution for issue: Is there a numpy builtin to reject outliers from a list


Answer

This method is almost identical to yours, just more numpyst (also working on numpy arrays only):

def reject_outliers(data, m=2):
    return data[abs(data - np.mean(data)) < m * np.std(data)]

This question is answered By – eumiro

This answer is collected from stackoverflow and reviewed by FixPython community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0