Add Kedro comparison section with working examples #1282
Open
yarikoptic wants to merge 2 commits into datalad-handbook:main
Add comprehensive comparison between DataLad/YODA and Kedro, a popular Python framework for data engineering pipelines. The section covers:

- Philosophy and focus differences
- Project setup comparison
- Data versioning approaches (Kedro timestamp-based vs git-annex)
- Modularity patterns (modular pipelines vs subdatasets)
- Pipeline execution and provenance tracking
- Configuration management
- When to use which tool
- How to use them together (walkthrough example)

Key improvements applied to make examples work with current versions:

- Update `kedro_init_version` from 0.19.0 to 1.2.0 (required for compatibility)
- Add required `catalog.yml` file for Kedro 1.x
- Enhance demo pipeline to write `output.txt` for visible provenance tracking
- Add `.gitignore` for Python cache files (`__pycache__/`, `*.pyc`)
- Include `KEDRO_DISABLE_TELEMETRY` option for cleaner output
- Add test script (`kedro-examples-test.sh`) validating all examples

All examples tested and working with datalad 1.3.1 and kedro 1.2.0.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
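As a rough illustration of two of the smaller items above (the `.gitignore` entries and the telemetry switch), a minimal sketch could look like the following. The scratch directory is a hypothetical stand-in for the project root and is not taken from the PR; only the ignore patterns and the `KEDRO_DISABLE_TELEMETRY` variable come from the description.

```shell
# Hypothetical scratch directory standing in for the project root (assumption).
project_dir="$(mktemp -d)"

# Ignore Python bytecode caches so they are not saved into the dataset.
printf '%s\n' '__pycache__/' '*.pyc' > "$project_dir/.gitignore"

# Optional: disable Kedro telemetry for this shell session for cleaner output.
export KEDRO_DISABLE_TELEMETRY=true

# Show the resulting ignore file.
cat "$project_dir/.gitignore"
```

In a real walkthrough the `.gitignore` would be written inside the Kedro project before the first `datalad save`, so that cache files never enter the history.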
yarikoptic commented on Feb 10, 2026
    .. code-block:: console

       ### Optional: Disable telemetry for cleaner output
       $ export KEDRO_DISABLE_TELEMETRY=true
analysis/comparison by copilot: https://github.com/con/duct/pull/396/changes#diff-8c10ce32dc94f2db66216789f910187e6547904879a33626447b97e45fb32434
- Replace manual file creation with `kedro new` in the integration walkthrough, per Kedro team recommendation
- Add admonition clarifying DataLad handles versioning (don't use Kedro's `versioned: true` alongside it)
- Reorder steps: `kedro new` -> `datalad create --force` -> pipeline -> run
- Drop standalone Kedro test (TEST 4) that only tested manual setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
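The reordered steps might be sketched roughly as below. This is an assumption-laden outline, not the walkthrough from the PR: the project name `kedro-demo`, the `step` helper, the `EXECUTE` switch, and the commit messages are all hypothetical, and the exact `kedro new` options should be checked against the installed Kedro version. By default the script only prints the commands; nothing is executed unless `EXECUTE=1` is set.

```shell
# Hypothetical helper: print each command, and run it only when EXECUTE=1.
step() {
    printf '+ %s\n' "$*"
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    fi
}

project="kedro-demo"   # hypothetical project name (assumption)

# 1. Let Kedro generate the project skeleton instead of creating files by hand.
step kedro new --name "$project" --tools none --example n

# 2. Turn the existing, non-empty directory into a DataLad dataset.
step datalad create --force "$project"

# 3. Save the skeleton (the pipeline code would be added before this step).
step datalad save -d "$project" -m "Add Kedro project skeleton"

# 4. Run the pipeline under datalad run so provenance is recorded.
step datalad run -d "$project" -m "Run Kedro pipeline" "kedro run"
```

Keeping the dry-run default makes it possible to review the command order before touching any files.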
Rendered version shortcut: https://datalad-handbook--1282.org.readthedocs.build/en/1282/beyond_basics/101-185-kedro.html
This was born primarily from my repeatedly running into Kedro, so I wanted to see a comparison similar to the one we have for DVC. It could potentially be cut down (e.g., the trailing section) or abandoned altogether.