Flowfile Python API Reference
This section documents Flowfile's Python API, focusing on extensions and differences from Polars. For standard Polars operations, see the Polars documentation.
Core API
Data Input/Output
- Reading Data - File formats and cloud storage
- Writing Data - Saving results
- Data Types - Supported data types
Transformations
- FlowFrame Operations - Filter, select, sort
- Aggregations - Group by and summarize
- Joins - Combining datasets
Flowfile-Specific Features
- Cloud Storage - S3 integration
- Visualize Pipelines - Working with the visual editor
Key Extensions to Polars
Description Parameter
Every operation accepts a description parameter for visual documentation:
df = df.filter(ff.col("active") == True, description="Keep active records")
Flowfile Formula Syntax
Alternative bracket-based syntax for expressions:
df.filter(flowfile_formula="[price] > 100 AND [quantity] >= 10")
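To make the bracket syntax concrete, the sketch below rewrites a formula string into the equivalent expression source text. It is an illustration only, not Flowfile's actual parser: the helper name formula_to_expr_source is hypothetical, and it handles only bracketed column names joined by AND / OR.

```python
import re

# Hypothetical helper for illustration; Flowfile's real formula parser is not shown here.
_COLUMN = re.compile(r"\[([^\[\]]+)\]")
_LOGICAL = {"AND": "&", "OR": "|"}


def formula_to_expr_source(formula: str) -> str:
    """Rewrite a bracket-style formula as expression source text."""
    # Split on the logical keywords; the capturing group keeps them in the result.
    parts = re.split(r"\s+(AND|OR)\s+", formula.strip())
    pieces = []
    for part in parts:
        if part in _LOGICAL:
            pieces.append(_LOGICAL[part])
        else:
            # Parenthesize each comparison, because & and | bind tighter
            # than > or >= in Python.
            pieces.append("(" + _COLUMN.sub(r'ff.col("\1")', part) + ")")
    return " ".join(pieces)


print(formula_to_expr_source("[price] > 100 AND [quantity] >= 10"))
# (ff.col("price") > 100) & (ff.col("quantity") >= 10)
```

The sketch does not protect string literals that contain " AND " or " OR ", and it passes everything else through unchanged, so treat it as a reading aid for the syntax and not as a specification.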
Automatic Node Types
Operations map to UI nodes when possible and otherwise fall back to a polars_code node:
# Simple → UI node
df.group_by("category").agg(ff.col("value").sum())
# Complex → polars_code node
df.group_by([ff.col("category").str.to_uppercase()]).agg(ff.col("value").sum())
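One way to picture the fallback rule is as a check on the group-by keys: plain column names fit a dedicated node, anything else is kept as code. The sketch below is a toy model under that assumption and does not reflect Flowfile's internal logic. ExprStub, choose_node_type, and the "group_by" node name are hypothetical; only polars_code comes from this reference.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class ExprStub:
    """Stand-in for an expression such as ff.col("category").str.to_uppercase()."""

    source: str


def choose_node_type(keys: Sequence[Union[str, ExprStub]]) -> str:
    """Return the (hypothetical) node type for a group-by with the given keys."""
    # Assumption: only plain column names can be shown in a dedicated UI node.
    if all(isinstance(key, str) for key in keys):
        return "group_by"
    # Anything involving an expression is preserved as code instead.
    return "polars_code"


simple = choose_node_type(["category"])
complex_ = choose_node_type([ExprStub('ff.col("category").str.to_uppercase()')])
print(simple, complex_)
```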
Graph Access
Inspect and visualize the pipeline DAG:
ff.open_graph_in_editor(df.flow_graph)
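For readers new to the term, a pipeline DAG records which steps depend on which. The sketch below models a small, made-up pipeline with the standard library's graphlib (Python 3.9+) and derives a valid execution order. The step names are hypothetical, and this is not how Flowfile stores its graph.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of
# steps it depends on.
pipeline = {
    "read_orders": set(),
    "read_customers": set(),
    "filter_active": {"read_customers"},
    "join": {"read_orders", "filter_active"},
    "aggregate": {"join"},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

Several orders can be valid for the same graph (the two read steps are independent), so only the relative position of dependent steps is guaranteed.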
Architecture Deep Dives
For understanding how Flowfile works internally:
- Core Architecture - FlowGraph, FlowNode, and FlowDataEngine internals
- Design Philosophy - The dual interface approach
Getting Help
- Not finding a method? Check the Polars documentation - most methods work identically
- Need examples? See our tutorials
- Understanding concepts? Read about FlowFrame and FlowGraph
This reference covers Flowfile-specific features. For standard Polars operations, see the Polars API Reference.