Closed
Labels
enhancement: New feature or request
Description
Following the discussion in #816, there would be benefits to allowing materializer nodes (both DataLoader and DataSaver) to be defined statically at the Driver level:
- The nodes can be called directly via .execute().
- Materializers appear in HamiltonGraph and visualizations even if they aren't executed.
- The DAG, including the materializers, can be validated before execution.
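To make the third benefit concrete, here is a toy sketch of what pre-execution validation could check once materializers are known statically. It is not Hamilton's implementation: `MaterializerSpec` and `validate_materializers` are hypothetical names invented for illustration, and the graph is reduced to a set of node names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MaterializerSpec:
    """Hypothetical stand-in for a statically declared materializer."""

    id: str
    dependencies: tuple[str, ...]


def validate_materializers(
    node_names: set[str], materializers: list[MaterializerSpec]
) -> list[str]:
    """Return a list of problems; an empty list means everything resolves."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for spec in materializers:
        # A materializer becomes a node itself, so its id must be unique.
        if spec.id in node_names or spec.id in seen_ids:
            problems.append(f"materializer id '{spec.id}' collides with an existing node")
        seen_ids.add(spec.id)
        # Every dependency must already exist in the dataflow.
        for dep in spec.dependencies:
            if dep not in node_names:
                problems.append(
                    f"materializer '{spec.id}' depends on unknown node '{dep}'"
                )
    return problems


# Node names as they might come from the dataflow modules (made up for this sketch).
dag_nodes = {"raw_df", "features_df"}
specs = [
    MaterializerSpec("features_duckdb", ("features_df",)),
    MaterializerSpec("model_pickle", ("trained_model",)),  # deliberately broken
]
problems = validate_materializers(dag_nodes, specs)
```

Because the specs are available at build time, a misspelled dependency such as `trained_model` surfaces before any node runs, rather than partway through an execution.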
Solution 1
from hamilton import driver
from hamilton.io.materialization import to

dr = (
    driver.Builder()
    .with_modules(...)
    .with_materializers(
        to.dlt(
            id="features_duckdb",
            dependencies=["features_df"],
            destination=duckdb_dest(...),
        )
    )
    .build()
)

Solution 2
An alternative would be to allow materializers to be imported and added via .with_modules(). For example, production_materializers.py contains:
# production_materializers.py
from hamilton.io.materialization import to

to.dlt(
    id="features__duckdb",
    dependencies=["features_df"],
    destination=duckdb_dest(...),
)

# in the script that builds the Driver
from hamilton import driver
import dataflow
import production_materializers

dr = driver.Builder().with_modules(dataflow, production_materializers).build()

For basic to.parquet() usage, it might be more efficient to store simple Python functions that call pandas' DataFrame.to_parquet() in a module to enable this pattern. More powerful materializers (e.g., dlt) would benefit from this approach, though.
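The plain-function alternative for simple Parquet output could look roughly like the sketch below. The function name `features_parquet` and the `features_path` input are hypothetical, chosen only for illustration. Writing Parquet with pandas also assumes an engine such as pyarrow is installed.

```python
# simple_savers.py (hypothetical module name)
import pandas as pd


def features_parquet(features_df: pd.DataFrame, features_path: str) -> str:
    """Write features_df to Parquet and return the path that was written.

    As a regular function node, it depends on `features_df` and on a
    `features_path` value supplied as an input at execution time.
    """
    features_df.to_parquet(features_path)
    return features_path
```

Added through .with_modules() alongside the dataflow, a function like this is an ordinary node. It would therefore show up in visualizations and could be requested by name in .execute(), without any materializer-specific support.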