
In Pandas, you use pd.merge() to combine datasets. In Polars, you use the DataFrame join() method, which is built for speed. This guide covers the essentials of joining DataFrames in Polars.
Just like a SQL JOIN, join() lets you combine two tables based on a shared “key” column.
The Setup
Let’s create two DataFrames: one for users and one for their orders.
import polars as pl

users = pl.DataFrame({
    "user_id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
})

orders = pl.DataFrame({
    "user_id": [1, 1, 2],
    "product": ["Keyboard", "Mouse", "Monitor"],
})

1. The “Inner” Join (Default)
An inner join finds only the rows that have a match in both DataFrames.
# Join 'users' (left) with 'orders' (right) on the "user_id" column
joined_df = users.join(orders, on="user_id")
print(joined_df)
Output:
shape: (3, 3)
┌─────────┬───────┬──────────┐
│ user_id ┆ name  ┆ product  │
│ ---     ┆ ---   ┆ ---      │
│ i64     ┆ str   ┆ str      │
╞═════════╪═══════╪══════════╡
│ 1       ┆ Alice ┆ Keyboard │
│ 1       ┆ Alice ┆ Mouse    │
│ 2       ┆ Bob   ┆ Monitor  │
└─────────┴───────┴──────────┘
Notice Charlie (user_id 3) is gone because he had no orders.
2. The “Left” Join
A left join keeps everything from the left DataFrame (users) and only brings in matches from the right (orders).
left_join_df = users.join(orders, on="user_id", how="left")
print(left_join_df)
Output:
shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ user_id ┆ name    ┆ product  │
│ ---     ┆ ---     ┆ ---      │
│ i64     ┆ str     ┆ str      │
╞═════════╪═════════╪══════════╡
│ 1       ┆ Alice   ┆ Keyboard │
│ 1       ┆ Alice   ┆ Mouse    │
│ 2       ┆ Bob     ┆ Monitor  │
│ 3       ┆ Charlie ┆ null     │
└─────────┴─────────┴──────────┘
Now Charlie is included, but his product is null, Polars’ marker for a missing value (the role NaN usually plays in Pandas).





