Python Pandas Data Analysis Cheat Sheet

A quick reference for Pandas Series and DataFrames, covering data loading, cleaning, selection, grouping, and aggregation.

Loading Data & Initial Inspection

Operations to load datasets and inspect their structure.

Method | Syntax | Description
pd.DataFrame() | pd.DataFrame(data, columns=...) | Constructs a two-dimensional labeled data structure with columns.
pd.read_csv() | pd.read_csv(filepath) | Loads a CSV file into a Pandas DataFrame.
head() / tail() | df.head(n=5) / df.tail(n=5) | Returns the first n or last n rows of the DataFrame.
info() | df.info() | Prints a concise summary of the DataFrame columns, non-null counts, and memory usage.
describe() | df.describe() | Generates descriptive statistics summarizing central tendency, dispersion, and shape.
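
The calls in the table above can be tried together in a short sketch. The CSV content, column names, and values here are made up for illustration, and io.StringIO stands in for a real file path so the snippet runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to read_csv.
csv_text = """name,city,score
Ana,Lisbon,82
Ben,Oslo,75
Cy,Lisbon,91
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))     # first two rows
print(df.tail(1))     # last row
df.info()             # prints column dtypes, non-null counts, and memory usage
print(df.describe())  # summary statistics for the numeric column(s)

# A DataFrame can also be built directly from in-memory data.
df2 = pd.DataFrame({"name": ["Ana", "Ben"], "score": [82, 75]},
                   columns=["name", "score"])
```

Note that df.info() prints its summary itself and returns None, so it is called without print().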

Data Selection, Filtering & Cleaning

Methods to select rows and columns, aggregate data, and handle missing values.

Method | Syntax | Description
loc[] | df.loc[row_label, col_label] | Accesses a group of rows and columns by labels.
iloc[] | df.iloc[row_pos, col_pos] | Accesses a group of rows and columns by integer positions.
groupby() | df.groupby(col).mean() | Groups rows that share a value in col and computes the mean of each group. If the DataFrame has non-numeric columns, select the numeric columns first or pass numeric_only=True to mean().
value_counts() | df[col].value_counts() | Returns a Series containing counts of unique values in col.
dropna() | df.dropna(axis=0) | Removes rows (axis=0, the default) or columns (axis=1) that contain missing values.
fillna() | df.fillna(value) | Replaces missing values with a specified value, either a single scalar or a dict giving one value per column.
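
A minimal sketch of the grouping and counting rows above, using a small made-up DataFrame (the column names and values are illustrative only). The numeric column is selected before calling mean() so that the text column is not part of the aggregation.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "city": ["Lisbon", "Oslo", "Lisbon", "Oslo", "Lisbon"],
    "name": ["Ana", "Ben", "Cy", "Dee", "Eli"],
    "score": [82, 75, 91, 65, 70],
})

# Group by city, then average only the numeric "score" column.
mean_by_city = df.groupby("city")["score"].mean()
print(mean_by_city)

# Count how many rows fall under each city.
city_counts = df["city"].value_counts()
print(city_counts)
```

Both results are Series indexed by the distinct city values, so individual entries can be read with, for example, mean_by_city["Oslo"].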

Frequently Asked Questions

What is the difference between loc and iloc?

loc is label-based indexing (selects data by row/column names), whereas iloc is integer-based indexing (selects data by index position).
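
The difference is easiest to see on a DataFrame whose index is not the default 0, 1, 2. The sketch below uses made-up string labels. It also shows one practical consequence: a loc slice includes its end label, while an iloc slice excludes its end position.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame({"score": [82, 75, 91]}, index=["ana", "ben", "cy"])

by_label = df.loc["ben", "score"]   # row labeled "ben", column labeled "score"
by_position = df.iloc[1, 0]         # second row, first column

label_slice = df.loc["ana":"ben"]   # label slices include the end label
position_slice = df.iloc[0:1]       # position slices exclude the end position

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```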

How do you handle missing values in Pandas?

Use dropna() to remove rows/columns with null values, or use fillna(value) to replace null values with a specific default value.
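
As a rough illustration, the sketch below applies both approaches to a small made-up DataFrame. The column names and the fill values (0.0 and "unknown") are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: columns "a" and "b" each contain one missing value.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", "y", None],
    "c": [10, 20, 30],
})

rows_dropped = df.dropna()        # drop every row that has any missing value
cols_dropped = df.dropna(axis=1)  # drop every column that has any missing value

# Fill with a different default per column by passing a dict.
filled = df.fillna({"a": 0.0, "b": "unknown"})

print(rows_dropped)
print(cols_dropped)
print(filled)
```

Both methods return a new DataFrame by default, so df itself still contains its missing values after these calls.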
