Asked By – London guy
I have a very large data frame in Python, and I want to drop all rows that contain a particular string in a particular column.
For example, I want to drop all rows that have the string “XYZ” as a substring in column C of the data frame.
Can this be implemented efficiently using the .drop() method?
Solution: How to drop rows from a pandas data frame that contain a particular string in a particular column?
pandas has vectorized string operations, so you can just filter out the rows that contain the string you don’t want:
    In : import pandas as pd

    In : df = pd.DataFrame(dict(A=[5,3,5,6], C=["foo","bar","fooXYZbar", "bat"]))

    In : df
    Out:
       A          C
    0  5        foo
    1  3        bar
    2  5  fooXYZbar
    3  6        bat

    In : df[~df.C.str.contains("XYZ")]
    Out:
       A    C
    0  5  foo
    1  3  bar
    3  6  bat
This question is answered By – Brian from QuantRocket
This answer is collected from Stack Overflow and reviewed by FixPython community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.