

Pandas is one of the most important Python libraries for data analysis and data science. I have consolidated the basic pandas tasks that every data analyst and data scientist should know.
Task Description | Task Steps |
Check for duplicates in a data frame | df.duplicated() |
Rename columns | df.rename(columns=column_dictionary) |
Lower column names | df.columns = df.columns.str.lower() |
Change the date format of a date column | df['date_column_name'] = df['date_column_name'].dt.strftime('%y-%m-%d') |
Separate a date into year, month, day, and year_month | df['year'] = df['date_column_name'].dt.year; df['month'] = df['date_column_name'].dt.month; df['day'] = df['date_column_name'].dt.day; df['year_month'] = df['date_column_name'].dt.to_period('M') |
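To see how the tasks in the table fit together, here is a minimal sketch that runs them in sequence on a tiny data frame. The sample data and the names Order_ID, Order_Date, duplicate_mask, and date_text are made up for illustration. The pd.to_datetime and drop_duplicates calls are extra steps I am assuming you will need; they are not part of the table. Note that the date parts are extracted before formatting, because strftime turns the column into plain strings and the .dt accessor only works on datetime values.

```python
import pandas as pd

# Hypothetical sample data; the column names are invented for this example.
df = pd.DataFrame({
    "Order_ID": [1, 2, 2, 3],
    "Order_Date": ["2023-01-15", "2023-02-20", "2023-02-20", "2024-03-05"],
})

# Check for duplicates: True marks a row that repeats an earlier row.
duplicate_mask = df.duplicated()

# Extra step (assumption, not in the table): drop the repeated rows.
df = df.drop_duplicates().reset_index(drop=True)

# Rename columns with a dictionary of {old_name: new_name}.
column_dictionary = {"Order_Date": "Date_Column_Name"}
df = df.rename(columns=column_dictionary)

# Lowercase all column names.
df.columns = df.columns.str.lower()

# Extra step (assumption): convert the strings to datetime so .dt works.
df["date_column_name"] = pd.to_datetime(df["date_column_name"])

# Separate the date into parts while the column is still datetime.
df["year"] = df["date_column_name"].dt.year
df["month"] = df["date_column_name"].dt.month
df["day"] = df["date_column_name"].dt.day
df["year_month"] = df["date_column_name"].dt.to_period("M")

# Change the date format last; %y is the two-digit year, %Y the four-digit one.
# The result is stored in a new, hypothetical column to keep the original intact.
df["date_text"] = df["date_column_name"].dt.strftime("%y-%m-%d")

print(df)
```

Writing the formatted text to a separate column is a design choice for this sketch; overwriting the date column, as in the table, works too as long as you no longer need datetime operations on it.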
Sources