polars

A data manipulation library implemented in Rust. It is a faster and more memory-efficient alternative to pandas, in part because it uses rayon to distribute the workload across all CPU cores.

It exposes a dataframe API and, to a lesser extent, also supports SQL.

I’ve seen two orders of magnitude speedup with this tool (granted on a poorly implemented baseline).

API Flavours

Polars comes with a few flavours of API:

An API that mimics pandas more closely. This allows for a quick import statement swap.
An expression API that is closer to Spark¹. This allows transformation logic to be composed independently. This API has two flavours, eager execution and lazy execution: the former produces a polars.DataFrame, while the latter produces a polars.LazyFrame.
A SQL API.

It is highly recommended to use the lazy API, as that allows for plenty of optimizations, including caching, predicate pushdown and projection pushdown. See the Polars user guide for the list of all optimizations.

Warning

Unlike pandas, polars intentionally does not support the concept of indices, as its developers believe a query’s semantics should not be affected by the state of an index.

API Flavour	Eager/Lazy	Optimizations	Streaming
`pandas`-like	Eager	Some	❌
Spark-like (aka expression-based); eager	Eager	Similar to above(?)	❌
Spark-like (aka expression-based); lazy	Lazy	A lot	✅

todo Improve table

Streaming Support

With the lazy API also comes streaming functionality. This allows you to operate on out-of-core workloads, i.e. data that does not fit in memory. This can be done simply by passing streaming=True when collecting: .collect(streaming=True).

¹ duckdb also supports this API to a lesser extent.

