
Where Pandas uses NaN, Polars uses null to represent missing or empty data. (Polars also has NaN, but only as an ordinary floating-point value that is not counted as missing.) Before you can analyze a dataset, you need a strategy for dealing with these null values.
Step 1: Finding and Counting Nulls
First, let’s see how much data is actually missing.
import polars as pl
df = pl.DataFrame({
    "A": [1, 2, None, 4, 5],
    "B": [None, "x", "y", "z", None],
    "C": [100, 200, 300, 400, 500],
})
# Count nulls in every column
print(df.null_count())
Output:
shape: (1, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ u32 ┆ u32 ┆ u32 │
╞═════╪═════╪═════╡
│ 1   ┆ 2   ┆ 0   │
└─────┴─────┴─────┘
Step 2: Option A – Dropping Nulls
If a row is useless without the data, just drop it.
# Drop any row that contains at least one null value
df_dropped = df.drop_nulls()
print(df_dropped)
Output:
shape: (2, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ i64 │
╞═════╪═════╪═════╡
│ 2   ┆ x   ┆ 200 │
│ 4   ┆ z   ┆ 400 │
└─────┴─────┴─────┘
Step 3: Option B – Filling Nulls
Dropping data is often too extreme. It’s usually better to fill the missing values.
Polars uses the Expression API for this, making it fast and powerful.
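# 1. Fill with a static value (e.g., 0)
# Only columns whose type matches the value are filled; the string column B is left unchanged.
df_filled_zero = df.fill_null(0)
print(df_filled_zero)

# 2. Fill with a "strategy" (e.g., the average)
# A mean only exists for numeric columns, so apply it to column A rather than to the whole frame.
df_filled_mean = df.with_columns(pl.col("A").fill_null(strategy="mean"))
print(df_filled_mean)

# 3. Fill "forward" (use the last valid value)
df_filled_forward = df.fill_null(strategy="forward")
print(df_filled_forward)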
Advanced: Chaining (The Polars Way)
For more control, call fill_null on individual column expressions inside with_columns, so each column gets its own fill value.
# Fill column A with 0, but fill column B with the word "UNKNOWN"
df_selective_fill = df.with_columns([
    pl.col("A").fill_null(0),
    pl.col("B").fill_null(pl.lit("UNKNOWN")),
])
print(df_selective_fill)
Key Takeaways
- Polars uses null to signify missing or empty data, similar to how Pandas uses NaN.
- The first step in handling missing data in Polars is finding and counting the nulls in your dataset.
- You can either drop rows with nulls if they’re unusable, or fill in missing values instead.
- Filling nulls is often preferred, and Polars enables this via its fast and powerful Expression API.
- For advanced control, you can chain operations using with_columns in Polars.





