Building code with flows
Flowfile's Code Generator allows you to export your visually designed data pipelines as clean, executable Python code. This feature is designed to empower users who wish to inspect the underlying data transformation logic, integrate Flowfile pipelines into existing Python projects, or extend their workflows with custom scripts.
For pure transformation flows (filter, join, group by, etc.), the generated code is standalone Polars, with no Flowfile dependency required. Flows that include I/O nodes (database, catalog, cloud storage, Kafka) generate code that uses `import flowfile as ff` for connection-aware operations. The transformation logic remains Polars in both cases.

Key Characteristics of the Generated Code
When you generate code from your Flowfile graph, you can expect the output to be:
- Mostly Standalone: Transformation logic uses only Polars. I/O operations (database, catalog, cloud storage, Kafka) use `flowfile` for managed connections.
- Readable: The structure mirrors your visual flow, making it easy to understand the sequence of operations.
- Direct Translation: Transformation nodes translate to Polars operations. I/O nodes translate to FlowFrame API calls (`ff.read_database()`, `ff.read_catalog_table()`, etc.).
- Ready for Integration: Pure transformation flows can be embedded anywhere. Flows with I/O nodes require `pip install Flowfile`.
Examples of Generated Code
Here are some simplified examples illustrating what the generated Polars code looks like for common Flowfile operations. These examples show how your visual workflow translates directly into Python.
Example 1: Reading a CSV and Selecting Columns
This example shows how a pipeline that reads a CSV file and then selects/renames specific columns translates into Polars code.
Flowfile Pipeline:
- Read CSV (e.g., `customers.csv`)
- Select (e.g., keep `name` as `customer_name`, `age`)
Generated Polars Code
```python
# Example 1: Reading a CSV and Selecting Columns
import polars as pl


def run_etl_pipeline():
    """
    ETL Pipeline: Example CSV Read and Select
    Generated from Flowfile
    """
    df_1 = pl.scan_csv("/path/to/your/customers.csv")
    df_2 = df_1.select(
        pl.col("name").alias("customer_name"),
        pl.col("age"),
    )
    return df_2


if __name__ == "__main__":
    pipeline_output = run_etl_pipeline()
```
Example 2: Grouping and Aggregating Data
This example demonstrates the code generated for a pipeline that processes a dataset and performs a group by operation with aggregations.
Flowfile Pipeline:
- Manual Input (sample sales data with `product` and `revenue`)
- Group By (e.g., group by `product`, sum `revenue` as `total_revenue`)
Generated Polars Code
```python
# Example 2: Grouping and Aggregating Data
import polars as pl


def run_etl_pipeline():
    """
    ETL Pipeline: Example Grouping and Aggregating
    Generated from Flowfile
    """
    # Simplified manual input example
    df_1 = pl.LazyFrame(
        {
            "product": ["A", "B", "A", "B", "C"],
            "revenue": [100.0, 200.0, 100.0, 200.0, 150.0],
        }
    )
    df_2 = df_1.group_by(["product"]).agg([
        pl.col("revenue").sum().alias("total_revenue"),
    ])
    return df_2


if __name__ == "__main__":
    pipeline_output = run_etl_pipeline()
```
Example 3: Custom Polars Code Execution
For advanced users, Flowfile offers a "Polars Code" node where you can write custom Polars expressions. Here's how that custom code is integrated into the generated script.
Flowfile Pipeline:
- Manual Input (a basic DataFrame)
- Polars Code (a node containing custom Polars logic, e.g., adding a new column)
Generated Polars Code
```python
# Example 3: Custom Polars Code Execution
import polars as pl


def run_etl_pipeline():
    """
    ETL Pipeline: Custom Polars Code Example
    Generated from Flowfile
    """
    df_1 = pl.LazyFrame({"value": [1, 2, 3]})

    # Custom Polars code as defined in the Flowfile node
    def _custom_code_node_name(input_df: pl.LazyFrame):
        return input_df.with_columns((pl.col("value") * 10).alias("scaled_value"))

    df_2 = _custom_code_node_name(df_1)
    return df_2


if __name__ == "__main__":
    pipeline_output = run_etl_pipeline()
```
Example 4: Reading from the Catalog
When a flow includes I/O nodes like a catalog reader, the generated code imports `flowfile` and uses the FlowFrame API for connection-aware operations.
Flowfile Pipeline:
- Catalog Reader (e.g., `sales_data` table from namespace 3)
- Filter (e.g., keep rows where `amount > 100`)
Generated Code
```python
# Example 4: Catalog Read with Filter
import flowfile as ff
import polars as pl


def run_etl_pipeline():
    """
    ETL Pipeline: Catalog Read and Filter
    Generated from Flowfile
    """
    df_1 = ff.read_catalog_table("sales_data", namespace_id=3).data
    df_2 = df_1.filter(pl.col("amount") > 100)
    return df_2


if __name__ == "__main__":
    pipeline_output = run_etl_pipeline()
```
The `.data` accessor
The generated code calls `.data` on FlowFrame results to extract the underlying Polars LazyFrame. This keeps the rest of the pipeline as standard Polars operations.
These examples show the kind of readable, executable Python code produced by Flowfile's Code Generator.
Future Direction
In a future release, generated code will use the FlowFrame API directly for all operations, enabling round-trip editing between code and canvas.