
Real-world data is rarely in one single file. You might have sales data in one CSV and customer info in another. You need to combine them, and that’s where the concept of Pandas Merging DataFrames comes in handy.
Pandas gives you two main ways to do this: Concatenating (stacking) and Merging (joining).
1. Concatenating (pd.concat)
Think of this as stacking DataFrames on top of each other. It’s useful when you have two files with the same columns (e.g., “January Sales” and “February Sales”).
import pandas as pd
df1 = pd.DataFrame({'A': ['a1', 'a2'], 'B': ['b1', 'b2']})
df2 = pd.DataFrame({'A': ['a3', 'a4'], 'B': ['b3', 'b4']})
# Stack them vertically (axis=0 is default)
result = pd.concat([df1, df2])
print(result)
# Output will have 4 rows: a1, a2, a3, a42. Merging (pd.merge)
Think of this like a SQL JOIN. It connects data horizontally based on a shared “key” column (like a User ID).
users = pd.DataFrame({
'user_id': [1, 2, 3],
'name': ['Alice', 'Bob', 'Charlie']
})
orders = pd.DataFrame({
'order_id': [101, 102],
'user_id': [1, 2], # Charlie (user 3) has no orders
'amount': [50, 100]
})
# Merge them on the 'user_id' column
# 'inner' join keeps ONLY rows that match in BOTH frames
merged_df = pd.merge(users, orders, on='user_id', how='inner')
print(merged_df)
# Result: Alice and Bob only. Charlie is dropped because he didn't match.Try changing how='inner' to how='outer' to keep Charlie!
Key Takeaways
- Real-world data often exists in multiple files, requiring integration.
- Pandas offers two main methods for combining data: Concatenating and Merging.
- Concatenating (pd.concat) stacks DataFrames with the same columns vertically.
- Merging (pd.merge) connects data horizontally using a shared key column, similar to a SQL JOIN.
- Adjusting the merge type changes the output, such as using ‘outer’ to include all data.





