StateTrace
Visual Quant & Low-Latency Systems Lab

Polars Architecture

data engineering · L3 · system pattern · stub
Replaces the API view of Polars (Stage 2's polars-lazy-frame).

The architecture view: Arrow buffers in `polars-arrow`, the expression IR in `polars-plan`, the query optimiser in `polars-plan/lp`, the parallel executor in `polars-pipe`, and a streaming engine for out-of-core workloads. The crates are independently usable; the API surface (`polars`) is the user-facing facade over them. Reading this codebase is one of the more direct ways to learn query engine design, because each textbook stage (IR, optimisation, execution) maps to code you can read and run.
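The layering above can be sketched in a few dozen lines. This is a toy model in Python, not Polars code: the class names (`Col`, `Lit`, `Gt`, `Scan`, `Filter`, `Select`) and the functions `eval_expr` and `execute` are hypothetical, and plain Python lists stand in for Arrow buffers. It only illustrates the separation between columnar data, an expression IR, a logical plan, and an executor that walks the plan.

```python
from dataclasses import dataclass

# --- Expression IR: a tree that describes a computation, not its result ---
@dataclass
class Col:
    name: str

@dataclass
class Lit:
    value: float

@dataclass
class Gt:
    left: object
    right: object

# --- Logical plan: each node wraps an input plan plus expressions ---
@dataclass
class Scan:
    table: dict  # column name -> list of values (stand-in for Arrow buffers)

@dataclass
class Filter:
    input: object
    predicate: object

@dataclass
class Select:
    input: object
    columns: tuple

# --- Executor: evaluates expressions column-at-a-time ---
def eval_expr(expr, table, n_rows):
    if isinstance(expr, Col):
        return table[expr.name]
    if isinstance(expr, Lit):
        return [expr.value] * n_rows  # broadcast the literal to column length
    if isinstance(expr, Gt):
        left = eval_expr(expr.left, table, n_rows)
        right = eval_expr(expr.right, table, n_rows)
        return [a > b for a, b in zip(left, right)]
    raise TypeError(f"unknown expression: {expr!r}")

def execute(plan):
    if isinstance(plan, Scan):
        return plan.table
    if isinstance(plan, Filter):
        table = execute(plan.input)
        n_rows = len(next(iter(table.values()), []))
        mask = eval_expr(plan.predicate, table, n_rows)
        return {name: [v for v, keep in zip(col, mask) if keep]
                for name, col in table.items()}
    if isinstance(plan, Select):
        table = execute(plan.input)
        return {name: table[name] for name in plan.columns}
    raise TypeError(f"unknown plan node: {plan!r}")

# Building the plan does no work; only execute() touches the data.
trades = {
    "symbol": ["A", "B", "A", "C"],
    "price": [10.0, 25.0, 30.0, 5.0],
    "qty": [1, 2, 3, 4],
}
plan = Select(Filter(Scan(trades), Gt(Col("price"), Lit(9.5))), ("symbol", "price"))
result = execute(plan)
```

Keeping the plan as plain data is what makes an optimiser possible: a separate pass can rewrite the tree before `execute` ever runs, which is the role the source assigns to the optimiser crate.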

Unlocks
Bridges
  • duckdb-vs-polars-internals · shared measurement
    DuckDB and Polars solve the same problem (an in-process columnar query engine) with different choices: DuckDB is vectorised C++ with morsel-driven parallelism; Polars is Rust with explicit task graphs. The h2oai/db-benchmark results put them within 2× of each other on most workloads, which points to convergent design.
  • query-optimizer-architecture · model to implementation
    Polars's optimiser is rule-based (heuristic transforms applied in a fixed order); DuckDB's is cost-based and relies on statistics. Production engines (Snowflake, BigQuery) layer both. The trade-off: cost-based optimisation handles unusual queries better but requires accurate statistics; rule-based optimisation is deterministic and easier to debug.
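The rule-based versus cost-based distinction in the second bridge can be made concrete with a toy sketch. Nothing here is taken from the Polars or DuckDB source: the tuple plan encoding, the two rewrite rules, and `choose_build_side` are all hypothetical, chosen only to show why one style is deterministic and the other depends on statistics.

```python
# Plans are nested tuples; the child plan is always the last element:
#   ("scan", table_name)
#   ("filter", predicate_text, child)
#   ("project", column_names, child)

# --- Rule-based: fixed rewrites applied in a fixed order, no statistics ---
def push_filter_below_project(plan):
    # filter(project(x)) -> project(filter(x)), so rows are dropped earlier
    if plan[0] == "filter" and plan[2][0] == "project":
        _, pred, (_, cols, child) = plan
        return ("project", cols, ("filter", pred, child))
    return plan

def drop_redundant_project(plan):
    # project(project(x)) -> project(x), keeping only the outer column list
    if plan[0] == "project" and plan[2][0] == "project":
        return ("project", plan[1], plan[2][2])
    return plan

RULES = [push_filter_below_project, drop_redundant_project]

def rewrite(plan):
    if plan[0] == "scan":
        return plan
    # Rewrite the child first, then try each rule in order at this node.
    plan = plan[:-1] + (rewrite(plan[-1]),)
    for rule in RULES:
        new_plan = rule(plan)
        if new_plan != plan:
            return rewrite(new_plan)  # a rule fired: re-run on the new shape
    return plan

raw = ("filter", "price > 10",
       ("project", ("symbol", "price"),
        ("project", ("symbol", "price", "qty"),
         ("scan", "trades"))))
optimised = rewrite(raw)

# --- Cost-based: the decision depends on estimated row counts ---
def choose_build_side(left, right, est_rows):
    # A hash join usually builds its table on the smaller input.
    return left if est_rows[left] <= est_rows[right] else right

accurate = {"trades": 1_000_000, "symbols": 500}
stale = {"trades": 1_000_000, "symbols": 5_000_000}  # out-of-date estimate
good_choice = choose_build_side("trades", "symbols", accurate)
bad_choice = choose_build_side("trades", "symbols", stale)
```

The same input plan always rewrites to the same output under the rules, which is what makes that style easy to debug. The cost-based choice flips when the estimate for `symbols` is stale, which is the dependence on accurate statistics that the bridge text describes.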
Status

This concept is a node in the curriculum DAG. The full lab (page blocks, done state, references) has not been authored yet. The relations above describe where it sits in the graph.

Author at: content/concepts/polars-architecture/card.ts