Fix Python – What is the difference between size and count in pandas?

Question

Asked By – Donovan Thomson

That is the difference between groupby("x").count and groupby("x").size in pandas ?

Does size just exclude nil ?

Now we will see solution for issue: What is the difference between size and count in pandas?


Answer

size includes NaN values, count does not:

In [46]:
df = pd.DataFrame({'a':[0,0,1,2,2,2], 'b':[1,2,3,4,np.NaN,4], 'c':np.random.randn(6)})
df

Out[46]:
   a   b         c
0  0   1  1.067627
1  0   2  0.554691
2  1   3  0.458084
3  2   4  0.426635
4  2 NaN -2.238091
5  2   4  1.256943

In [48]:
print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())

a
0    2
1    1
2    2
Name: b, dtype: int64

a
0    2
1    1
2    3
dtype: int64 

This question is answered By – EdChum

This answer is collected from stackoverflow and reviewed by FixPython community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0