Fix Python – Detect if a NumPy array contains at least one non-numeric value?

Question

Asked By – Salim Fadhley

I need to write a function which will detect if the input contains at least one value which is non-numeric. If a non-numeric value is found I will raise an error (because the calculation should only return a numeric value). The number of dimensions of the input array is not known in advance – the function should give the correct value regardless of ndim. As an extra complication the input could be a single float or numpy.float64 or even something oddball like a zero-dimensional array.

The obvious way to solve this is to write a recursive function which iterates over every iterable object in the array until it finds a non-iterabe. It will apply the numpy.isnan() function over every non-iterable object. If at least one non-numeric value is found then the function will return False immediately. Otherwise if all the values in the iterable are numeric it will eventually return True.

That works just fine, but it’s pretty slow and I expect that NumPy has a much better way to do it. What is an alternative that is faster and more numpyish?

Here’s my mockup:

def contains_nan( myarray ):
    """
    @param myarray : An n-dimensional array or a single float
    @type myarray : numpy.ndarray, numpy.array, float
    @returns: bool
    Returns true if myarray is numeric or only contains numeric values.
    Returns false if at least one non-numeric value exists
    Not-A-Number is given by the numpy.isnan() function.
    """
    return True

Now we will see solution for issue: Detect if a NumPy array contains at least one non-numeric value?


Answer

This should be faster than iterating and will work regardless of shape.

numpy.isnan(myarray).any()

Edit: 30x faster:

import timeit
s = 'import numpy;a = numpy.arange(10000.).reshape((100,100));a[10,10]=numpy.nan'
ms = [
    'numpy.isnan(a).any()',
    'any(numpy.isnan(x) for x in a.flatten())']
for m in ms:
    print "  %.2f s" % timeit.Timer(m, s).timeit(1000), m

Results:

  0.11 s numpy.isnan(a).any()
  3.75 s any(numpy.isnan(x) for x in a.flatten())

Bonus: it works fine for non-array NumPy types:

>>> a = numpy.float64(42.)
>>> numpy.isnan(a).any()
False
>>> a = numpy.float64(numpy.nan)
>>> numpy.isnan(a).any()
True

This question is answered By – Paul

This answer is collected from stackoverflow and reviewed by FixPython community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0