Pandas
pd.Series
    import pandas as pd

    # constructor
    pd.Series([1, 2, 3, 4, 5])
    pd.Series([30, 35, 40], index=['2015 Sales', '2016 Sales', '2017 Sales'], name='Product A')
boolean operations
    # s1, s2 are Series with dtype: bool
    s1 | s2  # element-wise or
    s1 & s2  # element-wise and
pd.Series.describe()
describe() is type-aware: its output depends on the dtype of the input. For numeric data it reports count, mean, std, min, the quartiles, and max; for string (object) data it reports count, unique, top, and freq.
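A minimal sketch of that type-awareness, using two made-up Series (the values are illustrative only):

```python
import pandas as pd

numeric = pd.Series([1, 2, 3, 4, 5])
text = pd.Series(['a', 'b', 'a', 'c', 'a'])

# Numeric input: summary statistics.
numeric_summary = numeric.describe()
print(list(numeric_summary.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']

# String input: counts and the most frequent value.
text_summary = text.describe()
print(list(text_summary.index))
# ['count', 'unique', 'top', 'freq']
```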
other operations
    s.mean()          # arithmetic mean
    s.unique()        # distinct values
    s.value_counts()  # how often each value occurs
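As a small illustration of these three calls, here is a sketch on a made-up Series `s` (the data is an assumption for the example):

```python
import pandas as pd

s = pd.Series([1, 2, 2, 3, 3, 3])

mean_value = s.mean()      # sum of values divided by their count
distinct = s.unique()      # NumPy array of distinct values, in order of appearance
counts = s.value_counts()  # Series indexed by value, sorted by frequency (descending)

print(mean_value)
print(list(distinct))      # [1, 2, 3]
print(counts)
```

`value_counts()` sorts by frequency, so the most common value comes first.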
pd.DataFrame
    # constructor
    pd.DataFrame({'my-column-1': [50, 21], 'my-column-2': [131, 2]})
    pd.DataFrame({'my-column-1': [50, 21], 'my-column-2': [131, 2]}, index=['my-index-1', 'my-index-2'])

    # count rows
    len(df)
get column
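    # access a column; all three return the same Series
    # (attribute access needs a name that is a valid Python identifier
    # and does not clash with an existing DataFrame attribute or method)
    df.column_name
    df['column_name']
    df.loc[:, 'column_name']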
set column
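    # assign a fixed value to a column
    # (bracket notation also creates the column if it is missing;
    # df.column_name = ... only updates a column that already exists)
    df['column_name'] = 'fixed_value'

    # assign any iterable to a column
    # length of the iterable must match the length of `df`
    df['column_name'] = range(len(df), 0, -1)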
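A runnable sketch of both assignments, assuming a small example frame. The column names `label`, `countdown`, and `too_long` are hypothetical. Bracket notation is used because it creates the column when it does not exist yet, whereas attribute assignment only updates an existing column.

```python
import pandas as pd

df = pd.DataFrame({'my-column-1': [50, 21], 'my-column-2': [131, 2]})

# A scalar is broadcast to every row.
df['label'] = 'fixed_value'

# An iterable must have exactly len(df) elements.
df['countdown'] = range(len(df), 0, -1)

# A length mismatch raises ValueError instead of silently truncating.
mismatch_raised = False
try:
    df['too_long'] = [1, 2, 3]
except ValueError:
    mismatch_raised = True

print(df)
```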
get pd.Series with dtype: bool
    # True where the value is null / not null, respectively
    df.column_name.isnull()
    df.column_name.notnull()

    # True where the value is among the collection
    df.column_name.isin(['value_1', 'value_8', 'value_3'])
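Boolean Series like these are typically used as row masks. The sketch below assumes a made-up frame with hypothetical `country` and `points` columns, and combines masks with `&` and `|` before passing them to `df.loc`:

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['Italy', 'France', 'Spain', None],
    'points': [87, 92, 90, 85],
})

in_list = df['country'].isin(['Italy', 'France'])  # True, True, False, False
has_country = df['country'].notnull()              # True, True, True, False
high = df['points'] >= 90                          # False, True, True, False

# Rows where both conditions hold.
selected = df.loc[in_list & high]

# Rows where at least one condition holds.
either = df.loc[in_list | high]

print(selected)
print(either)
```

When a comparison is written inline, wrap it in parentheses, for example `df.loc[in_list & (df['points'] >= 90)]`, because `&` and `|` bind more tightly than comparison operators.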
other operations
    df.shape       # (number of rows, number of columns)
    df.head()      # first 5 rows
    df.describe()  # summary statistics per column

    pd.set_option('display.max_rows', 500)
    pd.set_option('display.max_columns', 500)
    pd.set_option('display.width', 1000)

    df.set_index('column_name')  # returns a new DataFrame with the column as index
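Note that `set_index` does not modify the frame in place by default. A short sketch with hypothetical `name` and `value` columns:

```python
import pandas as pd

df = pd.DataFrame({'name': ['a', 'b'], 'value': [1, 2]})

# set_index returns a new DataFrame; the original keeps its columns.
indexed = df.set_index('name')

print(df.shape)                   # (2, 2)
print(indexed.shape)              # (2, 1)
print(indexed.loc['b', 'value'])  # 2
```

Reassign the result (`df = df.set_index('name')`) to keep the new index.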
Others
    df = pd.read_csv("../data/data.csv")
    df = pd.read_csv("../data/data.csv", index_col=0)  # if the index is stored as the first CSV column
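To see what `index_col=0` changes without needing a file on disk, the sketch below reads the same CSV text twice from an in-memory buffer. The CSV content is invented for the example.

```python
import io

import pandas as pd

# The first column has an empty header, as written by DataFrame.to_csv() by default.
csv_text = ",a,b\nrow-1,50,131\nrow-2,21,2\n"

plain = pd.read_csv(io.StringIO(csv_text))
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(plain.columns))    # ['Unnamed: 0', 'a', 'b']
print(list(indexed.columns))  # ['a', 'b']
print(list(indexed.index))    # ['row-1', 'row-2']
```

Without `index_col`, the saved index comes back as an ordinary column named `Unnamed: 0`.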