Question
Asked By – London guy
I have a very large data frame in Python and I want to drop all rows that have a particular string inside a particular column.
For example, I want to drop all rows which have the string “XYZ” as a substring in the column C of the data frame.
Can this be implemented in an efficient way using the .drop() method?
Now we will see the solution for the issue: How to drop rows from a pandas data frame that contain a particular string in a particular column? [duplicate]
Answer
pandas has vectorized string operations, so you can just filter out the rows that contain the string you don’t want:
In [90]: import pandas as pd
In [91]: df = pd.DataFrame(dict(A=[5,3,5,6], C=["foo","bar","fooXYZbar", "bat"]))
In [92]: df
Out[92]:
   A          C
0  5        foo
1  3        bar
2  5  fooXYZbar
3  6        bat
In [93]: df[~df.C.str.contains("XYZ")]
Out[93]:
   A    C
0  5  foo
1  3  bar
3  6  bat
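Two practical details are worth a sketch: by default `str.contains` treats the pattern as a regular expression, and it returns a missing value for rows where the column itself is missing, which generally cannot be used as a boolean mask. The example below is only an illustration under assumed data (the extra row with `None` is hypothetical and not part of the original question). It passes `regex=False` for a literal substring match and `na=False` so missing values count as "does not contain", and it also shows what the equivalent `.drop()` call could look like.

```python
import pandas as pd

# Hypothetical sample data; the None entry stands in for a missing value.
df = pd.DataFrame({
    "A": [5, 3, 5, 6, 7],
    "C": ["foo", "bar", "fooXYZbar", "bat", None],
})

# regex=False: match "XYZ" as a literal substring, not as a regular expression.
# na=False: rows with a missing value in C are treated as not containing "XYZ".
mask = df["C"].str.contains("XYZ", regex=False, na=False)

# Boolean indexing keeps every row where the mask is False.
kept = df[~mask]

# Same result via .drop(), passing the index labels of the matching rows.
dropped = df.drop(df[mask].index)

print(kept)
```

Both approaches build the same boolean mask first, so `.drop()` does not avoid the string scan; the boolean-indexing form simply skips the extra step of looking up index labels.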
Answered By – Brian from QuantRocket
This answer is collected from Stack Overflow and reviewed by FixPython community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.