Pandas is a Python library for data manipulation and analysis, built around the DataFrame, a labeled two-dimensional table. This cheatsheet is a quick reference for commonly used Pandas commands, grouped by what they do: loading data, exploring it, and accessing rows and columns. The entries assume Pandas is imported as pd and that df is an existing DataFrame.
Loading Data in Pandas
These commands are used to load various types of data into a Pandas DataFrame, enabling further analysis and manipulation.
| Pandas Function | Description |
|---|---|
| pd.read_csv('file.csv') | Load data from a CSV file |
| pd.read_json('file.json') | Load data from a JSON file |
| pd.read_excel('file.xlsx') | Load data from an Excel file |
| pd.read_sql(query, connection) | Load the result of a SQL query |
| pd.read_html('file.html') | Load every HTML table in the file (returns a list of DataFrames) |
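
As a minimal sketch of the loaders above, the example below reads the same small dataset from CSV text, JSON text, and a SQL query. The sample data, the `scores` table, and the in-memory SQLite connection are assumptions made for illustration; in practice you would pass a file path and a real database connection.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical in-memory data standing in for 'file.csv' and 'file.json'.
csv_text = "name,score\nAda,90\nLin,85\n"
json_text = '[{"name": "Ada", "score": 90}, {"name": "Lin", "score": 85}]'

# read_csv and read_json accept file-like objects as well as paths.
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

# An in-memory SQLite database standing in for a real connection.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
connection.executemany(
    "INSERT INTO scores VALUES (?, ?)", [("Ada", 90), ("Lin", 85)]
)
connection.commit()

# read_sql runs the query and returns the result set as a DataFrame.
df_sql = pd.read_sql("SELECT name, score FROM scores", connection)
connection.close()

print(df_csv.shape, df_json.shape, df_sql.shape)
```

pd.read_excel and pd.read_html are left out of the sketch because they depend on optional packages (for example an Excel engine or an HTML parser) that may not be installed.
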
Data Exploration
These commands provide a quick overview of the dataset, including its structure, summary statistics, and general information.
| Pandas Function | Description |
|---|---|
| df.head() | Display the first 5 rows |
| df.info() | Show column data types, non-null counts, and memory usage |
| df.describe() | Get summary statistics for numeric columns |
| df.shape | Get the dimensions of the DataFrame as (rows, columns) |
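
The exploration commands above can be tried on a small DataFrame. This is only a sketch: the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; any DataFrame works the same way.
df = pd.DataFrame(
    {
        "name": ["Ada", "Lin", "Sam", "Kai", "Noor", "Eve"],
        "score": [90, 85, 70, 60, 95, 80],
    }
)

first_rows = df.head()   # first 5 rows by default; df.head(n) returns n rows
summary = df.describe()  # count, mean, std, min, quartiles, max for numeric columns
dimensions = df.shape    # an attribute, not a method: (rows, columns)

print(first_rows)
df.info()                # prints directly and returns None
print(summary)
print(dimensions)
```

Note that df.shape has no parentheses, and that df.info() prints its report itself, so wrapping it in print() would only add a stray None.
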
Accessing Data
These commands are useful for selecting specific columns or rows from the dataset based on labels or positions.
| Pandas Function | Description |
|---|---|
| df['column'] | Select a single column |
| df[['col1', 'col2']] | Select multiple columns |
| df.loc[row_label] | Select rows by label |
| df.iloc[row_index] | Select rows by position |
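
To make the difference between label-based and position-based selection concrete, here is a short sketch. The DataFrame, its column names, and the string index labels are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data with string row labels so that labels and positions differ.
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [7, 8, 9]},
    index=["a", "b", "c"],
)

single = df["col1"]              # one column, returned as a Series
multiple = df[["col1", "col2"]]  # a list of names returns a DataFrame
by_label = df.loc["b"]           # the row whose index label is "b"
by_position = df.iloc[0]         # the first row, whatever its label

print(single)
print(multiple)
print(by_label)
print(by_position)
```

With the default integer index, df.loc[0] and df.iloc[0] happen to return the same row, which is why the sketch uses string labels to keep the two apart.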