Conditional Logic in Polars: The when/then API (SQL Case When)

Figure: 3D visualization of trains being diverted to different tracks based on color, representing Polars when/then conditional logic.

One of the most common data tasks is creating a new column based on a condition. In this tutorial, we’ll use Polars when/then expressions to do exactly that. For example, to assign a salary tier:

  • If salary > 100000, set tier to “High”.
  • If salary > 50000, set tier to “Mid”.
  • Otherwise, set tier to “Low”.

In SQL, you would write this with CASE WHEN. In Polars, the equivalent is the pl.when().then().otherwise() expression, which is evaluated over the whole column at once rather than row by row.

The Syntax

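The basic chain is pl.when(CONDITION).then(VALUE).otherwise(DEFAULT). Rows where the condition is true get VALUE, and all other rows get DEFAULT. Wrapping a string in pl.lit() marks it as a literal value rather than a column name.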

import polars as pl
df = pl.DataFrame({"sales": [50, 200, 150, 40]})

# Create a new column "performance"
df_new = df.with_columns(
    pl.when(pl.col("sales") > 100)
      .then(pl.lit("Good"))
      .otherwise(pl.lit("Bad"))
      .alias("performance")
)
print(df_new)

Output:

shape: (4, 2)
┌───────┬─────────────┐
│ sales ┆ performance │
│ ---   ┆ ---         │
│ i64   ┆ str         │
╞═══════╪═════════════╡
│ 50    ┆ Bad         │
│ 200   ┆ Good        │
│ 150   ┆ Good        │
│ 40    ┆ Bad         │
└───────┴─────────────┘

Chaining Multiple Conditions

You can chain multiple when/then calls, just like if/elif/else. Conditions are checked in order, and each row takes the value of the first condition it matches.

df_new = df.with_columns(
    pl.when(pl.col("sales") > 150)
      .then(pl.lit("Excellent"))
    .when(pl.col("sales") > 75)
      .then(pl.lit("Good"))
    .otherwise(pl.lit("Needs Improvement"))
      .alias("performance")
)
print(df_new)

Output:

shape: (4, 2)
┌───────┬───────────────────┐
│ sales ┆ performance       │
│ ---   ┆ ---               │
│ i64   ┆ str               │
╞═══════╪═══════════════════╡
│ 50    ┆ Needs Improvement │
│ 200   ┆ Excellent         │
│ 150   ┆ Good              │
│ 40    ┆ Needs Improvement │
└───────┴───────────────────┘

Key Takeaways

  • Creating a new column based on conditions is a common task in data processing.
  • Polars when/then expressions are the equivalent of SQL’s CASE WHEN.
  • A typical use is a ‘tier’ column based on salary: ‘High’ for over 100,000, ‘Mid’ for over 50,000, and ‘Low’ otherwise.
  • The basic syntax is pl.when(CONDITION).then(VALUE).otherwise(DEFAULT).
  • You can chain multiple when/then calls, similar to if/elif/else statements; the first matching condition wins.
