Fix Python – Get pandas.read_csv to read empty values as empty string instead of nan

Question

Asked By – BrenBarn

I’m using the pandas library to read in some CSV data. In my data, certain columns contain strings. The string "nan" is a possible value, as is an empty string. I managed to get pandas to read “nan” as a string, but I can’t figure out how to get it not to read an empty value as NaN. Here’s sample data and output

One,Two,Three
a,1,one
b,2,two
,3,three
d,4,nan
e,5,five
nan,6,
g,7,seven

>>> pandas.read_csv('test.csv', na_values={'One': [], "Three": []})
    One  Two  Three
0    a    1    one
1    b    2    two
2  NaN    3  three
3    d    4    nan
4    e    5   five
5  nan    6    NaN
6    g    7  seven

It correctly reads “nan” as the string “nan’, but still reads the empty cells as NaN. I tried passing in str in the converters argument to read_csv (with converters={'One': str})), but it still reads the empty cells as NaN.

I realize I can fill the values after reading, with fillna, but is there really no way to tell pandas that an empty cell in a particular CSV column should be read as an empty string instead of NaN?

Now we will see solution for issue: Get pandas.read_csv to read empty values as empty string instead of nan


Answer

I added a ticket to add an option of some sort here:

https://github.com/pydata/pandas/issues/1450

In the meantime, result.fillna('') should do what you want

EDIT: in the development version (to be 0.8.0 final) if you specify an empty list of na_values, empty strings will stay empty strings in the result

This question is answered By – Wes McKinney

This answer is collected from stackoverflow and reviewed by FixPython community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0