Fix Python – pandas loc vs. iloc vs. at vs. iat?

I recently began branching out from my safe place (R) into Python and am a bit confused by cell localization/selection in pandas. I’ve read the documentation, but I’m struggling to understand the practical implications of the various localization/selection options.
Is there a reason why I should ever use .loc or .iloc over .at and .iat, or vice….
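
To make the distinction concrete, here is a minimal sketch on a hypothetical toy DataFrame (the column names and index labels are made up for illustration). It assumes the usual behavior: .loc and .iloc select by label and by integer position respectively and can return scalars, Series, or DataFrames, while .at and .iat only read or write a single scalar.

```python
import pandas as pd

# Hypothetical toy frame; names and values are for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])

# Label-based vs. position-based access to the same cell.
by_label = df.loc["y", "b"]
by_position = df.iloc[1, 1]

# Scalar-only accessors: one row/column pair, nothing else.
scalar_label = df.at["y", "b"]
scalar_position = df.iat[1, 1]

# .loc and .iloc also accept lists and slices, which .at and .iat do not.
rows = df.loc[["x", "z"], "a"]   # Series with two elements
block = df.iloc[0:2, :]          # DataFrame with the first two rows
```

In practice, .at and .iat tend to be worth reaching for only when a single value is read or written inside a loop; for anything involving several rows or columns, .loc and .iloc are the general-purpose tools.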

Fix Python – Ignoring NaNs with str.contains

I want to find rows that contain a string, like so:

However, this fails because some elements are NaN:

ValueError: cannot index with vector containing NA / NaN values

So I resort to the obfuscated

Is there a better way?
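
One option worth trying is the `na` parameter of `Series.str.contains`, which sets the value used for missing entries in the resulting mask. The sketch below uses a hypothetical DataFrame with a column named `col` and the search string `foo`; neither comes from the question, whose own code is not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column name and values are assumptions.
df = pd.DataFrame({"col": ["foo bar", np.nan, "baz", "food"]})

# na=False treats missing values as "no match", so the mask is purely boolean.
mask = df["col"].str.contains("foo", na=False)
matches = df[mask]
```

Because the mask no longer contains NaN, it can be used for boolean indexing directly.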

Fix Python – Efficient way to apply multiple filters to pandas DataFrame or Series

I have a scenario where a user wants to apply several filters to a Pandas DataFrame or Series object. Essentially, I want to efficiently chain a bunch of filtering (comparison operations) together that are specified at run-time by the user.

The filters should be additive (that is, each one applied should narrow the results).
I’m currently using reindex()….
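
One possible approach, sketched under assumptions: represent each run-time filter as a (column, operator, value) tuple, build one boolean mask per filter, and combine the masks with a logical AND before indexing once. The helper name `apply_filters`, the tuple layout, and the sample columns are all hypothetical.

```python
import operator
from functools import reduce

import pandas as pd


def apply_filters(df, filters):
    """Return the rows of df that satisfy every (column, op, value) filter."""
    if not filters:
        return df
    # One boolean mask per filter, e.g. operator.ge(df["col1"], 1).
    masks = [op(df[column], value) for column, op, value in filters]
    # AND the masks together so each filter narrows the result further.
    return df[reduce(operator.and_, masks)]


# Hypothetical data and filters chosen at run time.
df = pd.DataFrame({"col1": [0, 1, 2, 3], "col2": [10, 11, 12, 13]})
filters = [("col1", operator.ge, 1), ("col2", operator.lt, 13)]
result = apply_filters(df, filters)
```

Combining the masks first means the DataFrame is indexed only once, rather than creating an intermediate copy after every filter.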

Fix Python – python pandas: apply a function with arguments to a series

I want to apply a function with arguments to a series in python pandas:
x = my_series.apply(my_function, more_arguments_1)
y = my_series.apply(my_function, more_arguments_2)

The documentation describes support for an apply method, but it doesn’t accept any arguments. Is there a different method that accepts arguments? Alternatively, am I mi….
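
In current pandas versions, `Series.apply` does accept extra positional arguments through its `args` parameter and forwards additional keyword arguments to the function. A minimal sketch, where `scale_and_shift` and its parameters are hypothetical stand-ins for `my_function` and its arguments:

```python
import pandas as pd


def scale_and_shift(value, factor, offset=0):
    """Hypothetical function taking extra arguments besides the element."""
    return value * factor + offset


my_series = pd.Series([1, 2, 3])

# Extra positional arguments go in the args tuple.
x = my_series.apply(scale_and_shift, args=(10,))

# Extra keyword arguments are passed through to the function.
y = my_series.apply(scale_and_shift, args=(10,), offset=1)

# A lambda (or functools.partial) works regardless of pandas version.
z = my_series.apply(lambda v: scale_and_shift(v, 2, offset=5))
```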

Fix Python – How to add hovering annotations to a plot

I am using matplotlib to make scatter plots. Each point on the scatter plot is associated with a named object. I would like to be able to see the name of an object when I hover my cursor over the point on the scatter plot associated with that object. In particular, it would be nice to be able to quickly see the names of the points that are outlier….
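
One way this could be approached, as a sketch rather than a definitive recipe: keep a single hidden annotation and, on every mouse-move event, ask the scatter collection which points lie under the cursor. The helper names `label_for` and `add_hover_labels`, and the sample points and names, are hypothetical.

```python
import matplotlib.pyplot as plt


def label_for(indices, names):
    """Join the names of all points currently under the cursor."""
    return ", ".join(names[i] for i in indices)


def add_hover_labels(fig, ax, scatter, names):
    """Show the point name(s) in an annotation while the cursor hovers."""
    annot = ax.annotate(
        "", xy=(0, 0), xytext=(15, 15), textcoords="offset points",
        bbox=dict(boxstyle="round", fc="w"),
        arrowprops=dict(arrowstyle="->"),
    )
    annot.set_visible(False)

    def on_move(event):
        if event.inaxes is not ax:
            return
        hit, details = scatter.contains(event)
        if hit:
            indices = details["ind"]
            # Anchor the annotation at the first point that was hit.
            annot.xy = scatter.get_offsets()[indices[0]]
            annot.set_text(label_for(indices, names))
            annot.set_visible(True)
        elif annot.get_visible():
            annot.set_visible(False)
        else:
            return  # nothing changed, no redraw needed
        fig.canvas.draw_idle()

    fig.canvas.mpl_connect("motion_notify_event", on_move)
    return annot


# Hypothetical data: three clustered points and one outlier.
names = ["alpha", "beta", "gamma", "outlier"]
fig, ax = plt.subplots()
scatter = ax.scatter([1, 2, 3, 10], [1, 2, 3, 10])
annot = add_hover_labels(fig, ax, scatter, names)
```

Calling `plt.show()` afterwards with an interactive backend opens the window; moving the cursor over a point should then display its name.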

Fix Python – Counting unique values in a column in pandas dataframe like in Qlik?

If I have a table like this:
import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

I can do count(distinct hID) in Qlik to ….
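
The question is cut off here, so the exact aggregation wanted is unknown. Assuming the goal is a distinct count comparable to Qlik's count(distinct hID), `nunique` is the closest pandas counterpart. The sketch below rebuilds the table from the question; the grouped variant by `dID` is an added assumption, included only to show the per-group form.

```python
import pandas as pd

df = pd.DataFrame({
    "hID": [101, 102, 103, 101, 102, 104, 105, 101],
    "dID": [10, 11, 12, 10, 11, 10, 12, 10],
    "uID": ["James", "Henry", "Abe", "James", "Henry", "Brian", "Claude", "James"],
    "mID": ["A", "B", "A", "B", "A", "A", "A", "C"],
})

# Number of distinct hID values in the whole column.
distinct_hids = df["hID"].nunique()

# Distinct hID count within each dID group (assumed use case).
distinct_per_did = df.groupby("dID")["hID"].nunique()
```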