Fix Python – Most efficient way of making an if-elif-elif-else statement when the else is done the most?

I’ve got an if-elif-elif-else statement in which 99% of the time, the else statement is executed:
if something == 'this': pass
elif something == 'that': pass
elif something == 'there': pass
else: pass  # executed 99% of the time

This construct is used a lot, but since it evaluates every condition before it reaches the else, I have….
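One way to see whether the chain matters in practice is to time it against an alternative. The sketch below is only an illustration under assumptions: the function names `chain` and `lookup`, the integer return values, and the input `'other'` are hypothetical, and the dictionary lookup is a swapped-in technique that does not come from the question itself.

```python
from timeit import timeit

def chain(something):
    # Every comparison runs before the else branch is reached.
    if something == 'this':
        return 1
    elif something == 'that':
        return 2
    elif something == 'there':
        return 3
    else:
        return 0

# Hypothetical alternative: one hash lookup with a default for the common case.
HANDLERS = {'this': 1, 'that': 2, 'there': 3}

def lookup(something):
    return HANDLERS.get(something, 0)

for fn in (chain, lookup):
    # 'other' stands in for the 99% case that falls through to the else.
    print(fn.__name__, timeit(lambda: fn('other'), number=100_000))
```

The timings depend on the interpreter version and the machine, so they are worth measuring rather than assuming.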

Fix Python – Why is it slower to iterate over a small string than a small list?

I was playing around with timeit and noticed that a simple list comprehension over a small string took longer than the same operation on a list of small single-character strings. Any explanation? It takes almost 1.35 times as long.
>>> from timeit import timeit
>>> timeit("[x for x in 'abc']")
>>> timeit("[x for x i….
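A comparison of this kind can be reproduced with a short script. This is a sketch under an assumption: the list `['a', 'b', 'c']` is a guess at what a "list of small single-character strings" looks like, since the second statement in the excerpt is cut off. The `number` argument is also chosen here only to keep the run short.

```python
from timeit import timeit

# Assumed inputs: the same three characters as a str and as a list of str.
over_string = timeit("[x for x in 'abc']", number=200_000)
over_list = timeit("[x for x in ['a', 'b', 'c']]", number=200_000)

print(f"string: {over_string:.4f}s")
print(f"list:   {over_list:.4f}s")
print(f"ratio:  {over_string / over_list:.2f}")
```

The ratio may differ from the figure quoted in the question, because it depends on the Python version and the hardware.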

Fix Python – Performance of Pandas apply vs np.vectorize to create new column from existing columns

I am using Pandas dataframes and want to create a new column as a function of existing columns. I have not seen a good discussion of the speed difference between df.apply() and np.vectorize(), so I thought I would ask here.
The Pandas apply() function is slow. From what I measured (shown below in some experiments), using np.vectorize() is 25x fas….
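The two approaches being compared can be sketched as follows. The function `combine`, the column names `A` and `B`, and the sample values are hypothetical and only serve to show the two call styles side by side. No timing is claimed here.

```python
import numpy as np
import pandas as pd

def combine(a, b):
    # Hypothetical scalar function of two column values.
    return a / b if b != 0 else 0.0

df = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [2.0, 0.0, 4.0]})

# Row-wise apply: combine is called once per row, through a Series object.
df['via_apply'] = df.apply(lambda row: combine(row['A'], row['B']), axis=1)

# np.vectorize: combine is called once per element pair of the two columns.
df['via_vectorize'] = np.vectorize(combine)(df['A'], df['B'])

print(df)
```

Both columns hold the same values. Note that the NumPy documentation describes `np.vectorize` as a convenience wrapper around a loop, so any speed difference comes from per-row overhead rather than from true array-level vectorization.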

Fix Python – Does pandas iterrows have performance issues?

I have noticed very poor performance when using iterrows from pandas.
Is it specific to iterrows and should this function be avoided for data of a certain size (I’m working with 2-3 million rows)?
This discussion on GitHub led me to believe it is caused when mixing dtypes in the dataframe, however the simple example below shows it is there even wh….
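For context, a row loop of the kind described might look like the sketch below. The frame, its column names `a` and `b`, and the running total are hypothetical, and `itertuples` is shown only as a commonly used alternative, not as something proposed in the question.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [0.5, 1.5, 2.5]})

# iterrows yields (index, Series) pairs; a new Series is built for every row.
total_rows = 0.0
for index, row in df.iterrows():
    total_rows += row['a'] * row['b']

# itertuples yields namedtuples, which avoids creating a Series per row.
total_tuples = 0.0
for row in df.itertuples(index=False):
    total_tuples += row.a * row.b

print(total_rows, total_tuples)
```

Both loops compute the same total; how much they differ in speed on millions of rows is something to measure on the actual data.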

Fix Python – Cost of exception handlers in Python

In another question, the accepted answer suggested replacing a (very cheap) if statement in Python code with a try/except block to improve performance.
Coding style issues aside, and assuming that the exception is never triggered, how much difference does it make (performance-wise) to have an exception handler, versus not having one, versus having….
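The three variants in question can be timed directly. This is a minimal sketch, assuming a dictionary lookup as the guarded operation; the names `data`, `with_if`, `with_try`, and `plain` are hypothetical and not taken from the question being referred to.

```python
from timeit import timeit

data = {'key': 1}

def with_if():
    # Explicit check before the access.
    if 'key' in data:
        return data['key']
    return None

def with_try():
    # A handler is present, but the exception is never raised here.
    try:
        return data['key']
    except KeyError:
        return None

def plain():
    # No check and no handler.
    return data['key']

for fn in (with_if, with_try, plain):
    print(fn.__name__, timeit(fn, number=200_000))
```

The results vary between Python versions, so the numbers are best treated as a measurement for one setup rather than a general rule.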

Fix Python – Fast check for NaN in NumPy

I’m looking for the fastest way to check for the occurrence of NaN (np.nan) in a NumPy array X. np.isnan(X) is out of the question, since it builds a boolean array of shape X.shape, which is potentially gigantic.
I tried np.nan in X, but that seems not to work because np.nan != np.nan. Is there a fast and memory-efficient way to do this at all?
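One commonly suggested approach is to reduce the array to a single scalar first and test only that scalar. The sketch below assumes a non-empty floating-point array and relies on `np.min` propagating NaN; the helper name `has_nan` is hypothetical. Whether this is the fastest option for a given array is not established here.

```python
import numpy as np

def has_nan(x):
    # np.min propagates NaN, so the scalar result is NaN if any element is.
    # Only that scalar is passed to np.isnan, not an array of shape x.shape.
    return bool(np.isnan(np.min(x)))

X = np.array([[1.0, 2.0], [np.nan, 4.0]])
print(has_nan(X))           # array that contains a NaN
print(has_nan(np.ones(5)))  # array without any NaN
```

The membership test `np.nan in X` fails for the reason given above: it relies on equality, and NaN never compares equal to itself.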