Advanced Pandas: Mastering groupby() and Pivot Tables

By Ahmed Nabil | February 18, 2026 | February 2, 2026


Loading data is easy. Summarizing it is where the value lies, and that’s where Pandas groupby can make a big difference. If you have a sales dataset, you don’t want to see every individual sale; you want to see “Total Sales Per Month” or “Average Sales Per Product.”

1. The `groupby()` Method

`groupby()` is the core Pandas tool for summarization. It follows the “Split-Apply-Combine” pattern:

Split the data into groups (e.g., by “Category”).
Apply a function to each group (e.g., sum, mean, count).
Combine the results back together.
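To make the three steps concrete, here is a minimal sketch that performs them by hand and then compares the result with the built-in one-liner. The `Category` data and the variable names are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
df = pd.DataFrame({
    "Category": ["Toys", "Books", "Toys", "Books"],
    "Sales": [10, 20, 30, 40],
})

# Split: iterating over a GroupBy yields (key, sub-DataFrame) pairs
# Apply: sum the 'Sales' column of each sub-DataFrame
manual = {key: group["Sales"].sum() for key, group in df.groupby("Category")}

# Combine: collect the per-group results into a single Series
manual_result = pd.Series(manual, name="Sales")

# The one-liner performs all three steps internally
builtin_result = df.groupby("Category")["Sales"].sum()

print(manual_result)
print(builtin_result)
```

Both Series hold the same totals per category. In practice you would write the one-liner; the loop is only there to show what happens under the hood.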

import pandas as pd
data = {
    'Product': ['A', 'B', 'A', 'B', 'C'],
    'Sales': [100, 200, 150, 250, 300],
    'Region': ['East', 'East', 'West', 'West', 'East']
}
df = pd.DataFrame(data)

# Group by 'Product' and sum their 'Sales'
print(df.groupby('Product')['Sales'].sum())
# Output:
# Product
# A    250
# B    450
# C    300
# Name: Sales, dtype: int64
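A single `sum()` is often not enough. The sketch below, which rebuilds the same sample data so it runs on its own, shows two common extensions: computing several statistics at once with `agg()`, and grouping by more than one column. The output column names (`total_sales`, `average_sales`, `num_sales`) are made up for this example.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "Product": ["A", "B", "A", "B", "C"],
    "Sales": [100, 200, 150, 250, 300],
    "Region": ["East", "East", "West", "West", "East"],
})

# Several statistics per product in one pass (named aggregation).
# The keyword names on the left are hypothetical output column names.
summary = df.groupby("Product").agg(
    total_sales=("Sales", "sum"),
    average_sales=("Sales", "mean"),
    num_sales=("Sales", "count"),
)
print(summary)

# Grouping by two columns gives one result per (Region, Product) pair,
# indexed by a MultiIndex
by_region_product = df.groupby(["Region", "Product"])["Sales"].sum()
print(by_region_product)
```

Only combinations that actually occur in the data appear in `by_region_product`, so there is no row for product C in the West.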

2. Pivot Tables (`pivot_table`)

If you come from Excel, you already know pivot tables. Pandas builds them with `pivot_table`, and because the result is an ordinary DataFrame, you can keep filtering, sorting, or plotting it in code.

# Create a table with Products as rows, Regions as columns, showing average sales
pivot = df.pivot_table(values='Sales', index='Product', columns='Region', aggfunc='mean')
print(pivot)

This creates a readable grid with one row per product and one column per region, where each cell holds the average sale. Combinations with no data, such as product C in the West, appear as NaN.
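A pivot table is closely related to `groupby()`. The sketch below, which again rebuilds the sample data so it is self-contained, shows one way to get the same grid from `groupby()` plus `unstack()`, and then uses the `fill_value` and `margins` parameters of `pivot_table` to fill empty cells and add totals. The variable names are illustrative.

```python
import pandas as pd

# Same sample data as in the groupby example
df = pd.DataFrame({
    "Product": ["A", "B", "A", "B", "C"],
    "Sales": [100, 200, 150, 250, 300],
    "Region": ["East", "East", "West", "West", "East"],
})

# The pivot table from the article: average sales per product and region
pivot = df.pivot_table(values="Sales", index="Product",
                       columns="Region", aggfunc="mean")

# The same grid via groupby: aggregate first, then move 'Region'
# from the row index into the columns
via_groupby = (
    df.groupby(["Product", "Region"])["Sales"].mean().unstack("Region")
)

# Product C has no West sales, so that cell is NaN in both versions
print(pivot)
print(via_groupby)

# fill_value replaces missing cells; margins=True adds an 'All'
# row and column holding the totals
totals = df.pivot_table(values="Sales", index="Product", columns="Region",
                        aggfunc="sum", fill_value=0, margins=True)
print(totals)
```

Which form to use is mostly a matter of taste: `pivot_table` reads more naturally when the goal is a two-dimensional report, while `groupby()` is more convenient when the result feeds further computation.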


