A hands-on course that takes apart a real dataframe engine, pyfloe, and teaches you the architecture behind Polars, Spark, and other modern query engines.
This isn't a course you read — it's a course you build with. Every code example is editable and runnable. Change things. Break things. That's the point.
pf.col("x") + pf.col("y") builds an inspectable tree instead of running a computation.
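To make that idea concrete before the modules begin, here is a minimal sketch of how an expression like the one above can build a tree instead of computing a value. This is not pyfloe's actual code: the names `Expr`, `Col`, `BinaryExpr`, `evaluate`, and the standalone `col` helper are hypothetical stand-ins, and the only assumption is that operator overloading is what turns `+` into a node constructor.

```python
from dataclasses import dataclass
import operator


class Expr:
    """Hypothetical base class: operators return tree nodes, not results."""

    def __add__(self, other):
        return BinaryExpr("+", self, other)

    def __mul__(self, other):
        return BinaryExpr("*", self, other)


@dataclass(frozen=True)
class Col(Expr):
    """Leaf node: a reference to a column by name."""
    name: str

    def evaluate(self, row):
        # Only here, at evaluation time, is any data touched.
        return row[self.name]


@dataclass(frozen=True)
class BinaryExpr(Expr):
    """Interior node: an operator applied to two sub-expressions."""
    op: str
    left: Expr
    right: Expr

    def evaluate(self, row):
        fn = {"+": operator.add, "*": operator.mul}[self.op]
        return fn(self.left.evaluate(row), self.right.evaluate(row))


def col(name):
    """Hypothetical stand-in for a col() helper; returns a leaf node."""
    return Col(name)


# Building the expression runs no computation; it only creates nodes.
expr = col("x") + col("y")
print(expr)

# The tree is evaluated later, against an actual row of data.
result = expr.evaluate({"x": 2, "y": 3})
print(result)
```

Because the expression is plain data, you can print it, compare it, or walk it before anything runs. Try changing the `+` to `*`, or nesting a third column, and see how the printed tree changes.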
Core path: Introduction → Modules 1–5 (with the Interlude between 4 and 5). This is the complete arc.
Optional deep dives: The “Why” interlude can be read at any point. The Streaming deep dive assumes Module 3. The Epilogue assumes Module 5.
This course is built around pyfloe — a lazy dataframe library written entirely in pure Python. No C extensions, no Rust bindings, no dependencies. The same architectural patterns as the big libraries, in a codebase small enough to read in an afternoon.