Question
Asked By – saroele
Why does pandas make a distinction between a Series and a single-column DataFrame?
In other words: what is the reason for the existence of the Series class?
I’m mainly using time series with a datetime index; maybe that helps set the context.
Answer
Quoting the pandas docs:
pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
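So the Series is the data structure for a single column of a DataFrame, not only conceptually but in practice: selecting one column from a DataFrame returns a Series, and a DataFrame behaves like a collection of Series that share one index. (Internally, pandas may consolidate columns of the same dtype into shared blocks, so this describes the data model rather than the exact memory layout.)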
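A minimal sketch of that relationship, assuming a small made-up daily time series (the column name "temperature" and the values are purely illustrative): selecting with a single label yields a Series, while selecting with a list of labels yields a single-column DataFrame.

```python
import pandas as pd

# Hypothetical example data: three daily readings on a datetime index.
index = pd.date_range("2024-01-01", periods=3, freq="D")
df = pd.DataFrame({"temperature": [1.5, 2.0, 2.5]}, index=index)

col = df["temperature"]      # single label -> Series (one-dimensional)
frame = df[["temperature"]]  # list of labels -> DataFrame (two-dimensional)

print(type(col).__name__, col.shape)      # Series (3,)
print(type(frame).__name__, frame.shape)  # DataFrame (3, 1)

# The column label travels with the Series as its name,
# so the Series can be turned back into a one-column DataFrame.
print(col.name)
roundtrip = col.to_frame()
print(type(roundtrip).__name__, list(roundtrip.columns))
```

The shapes are the practical difference: the Series has only an index, while the single-column DataFrame still carries a column axis.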
Analogously: we need both lists and matrices, because matrices are built from lists. Single-row matrices, while equivalent to lists in functionality, still cannot exist without the list(s) they’re composed of.
They both have extremely similar APIs, but you’ll find that DataFrame methods always cater to the possibility that you have more than one column. And, of course, you can always add another Series (or equivalent object) to a DataFrame as a new column, while placing a Series alongside another Series as a second column involves creating a DataFrame.
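As a rough illustration of that last point, here is a sketch with two made-up Series named "a" and "b" on a shared datetime index (the names and values are assumptions for the example only). Assigning a Series to a DataFrame keeps it a DataFrame, while putting two Series side by side with pd.concat along axis=1 produces a new DataFrame.

```python
import pandas as pd

# Hypothetical example data: two Series sharing one datetime index.
index = pd.date_range("2024-01-01", periods=3, freq="D")
a = pd.Series([1.0, 2.0, 3.0], index=index, name="a")
b = pd.Series([10.0, 20.0, 30.0], index=index, name="b")

# Adding a Series to a DataFrame: the result is still a DataFrame.
df = pd.DataFrame({"a": a})
df["b"] = b
print(type(df).__name__, df.shape)  # DataFrame (3, 2)

# Putting two Series side by side: the result has to be a DataFrame,
# because a Series can hold only one column of values.
combined = pd.concat([a, b], axis=1)
print(type(combined).__name__, list(combined.columns))  # DataFrame ['a', 'b']

# Concatenating along the index instead stays one-dimensional.
stacked = pd.concat([a, b])
print(type(stacked).__name__, stacked.shape)  # Series (6,)
```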
This question is answered By – PythonNut
This answer is collected from stackoverflow and reviewed by FixPython community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0