Important: This package is under active development and is not yet ready for production use.
The Kedro-Dagster plugin enables seamless integration between Kedro, a framework for creating reproducible and maintainable data science code, and Dagster, a data orchestrator for machine learning and data pipelines. This plugin makes use of Dagster's orchestration capabilities to automate and monitor Kedro pipelines effectively.
- Dataset Translation: Converts Kedro datasets into Dagster assets and IO managers, facilitating smooth data handling between the two frameworks.
- Pipeline Translation: Transforms Kedro pipelines into Dagster jobs, enabling their execution and scheduling.
- Configuration-Driven Execution and Automation: Utilizes Kedro's configuration to specify job executors and define schedules, allowing for flexible and dynamic pipeline management.
- Hook Support: Preserves Kedro hooks within the Dagster context, ensuring that custom behaviors and plugins are maintained during pipeline execution.
- Logger Integration: Integrates Kedro's logging with Dagster's logging system, providing unified and comprehensive logging across both platforms.
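To make the translation idea concrete, here is a toy sketch in plain Python. It is not the plugin's implementation and imports neither Kedro nor Dagster; the `Node` class and the `to_asset_graph` function are hypothetical names invented for this illustration. It assumes, for simplicity, that every dataset produced by a node becomes one asset whose upstream dependencies are that node's inputs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Kedro node: a named step with dataset inputs and outputs."""

    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def to_asset_graph(nodes: list[Node]) -> dict[str, set[str]]:
    """Map each output dataset (treated as an asset) to the datasets it depends on."""
    graph: dict[str, set[str]] = {}
    for node in nodes:
        for output in node.outputs:
            # An asset depends on every input of the node that produces it.
            graph.setdefault(output, set()).update(node.inputs)
    return graph


# A two-step example pipeline (names are illustrative only).
pipeline = [
    Node("preprocess", inputs=("raw_data",), outputs=("clean_data",)),
    Node("train", inputs=("clean_data", "params"), outputs=("model",)),
]
asset_graph = to_asset_graph(pipeline)
```

Here `asset_graph` records that `clean_data` depends on `raw_data`, and that `model` depends on `clean_data` and `params`. An orchestrator needs this kind of dependency information to order and monitor runs.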
Install the Kedro-Dagster plugin using pip:
pip install kedro-dagster
- Initialize the Plugin in Your Kedro Project:
  Navigate to your Kedro project directory and install the plugin:

      pip install kedro-dagster
- Generate Dagster Definitions and Configuration:
  Use the following command to generate a definitions.py file, where all translated Kedro objects are available as Dagster objects, and a dagster.yml configuration file:

      kedro dagster init --env <ENV_NAME>
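If you want to script this step, for example to initialize several environments in turn, a thin wrapper around the command might look like the sketch below. The helper names `build_init_command` and `run_init` are hypothetical and not part of the plugin; the only thing taken from this README is the command line itself.

```python
import subprocess


def build_init_command(env: str) -> list[str]:
    """Return the argument list for `kedro dagster init --env <env>`."""
    if not env:
        raise ValueError("env must be a non-empty environment name")
    return ["kedro", "dagster", "init", "--env", env]


def run_init(env: str, project_dir: str = ".") -> None:
    """Run the init command from inside the Kedro project directory (hypothetical helper)."""
    # check=True raises CalledProcessError if the command exits with a non-zero status.
    subprocess.run(build_init_command(env), cwd=project_dir, check=True)


# Build (but do not run) the command for the "base" environment.
command = build_init_command("base")
```

Calling `run_init("base")` from the project root would then execute the same command shown above.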
- Configure Jobs, Executors, and Schedules:
  Define your job executors and schedules in the dagster.yml configuration file located in your Kedro project's conf/<ENV_NAME> directory. This file allows you to filter Kedro pipelines and assign specific executors and schedules to them.

      # conf/base/dagster.yml
      schedules:
        my_job_schedule:
          cron_schedule: "0 0 * * *"

      executors:
        my_executor:
          retries: 3

      jobs:
        my_job:
          pipeline:
            pipeline_name: __default__
          executor: my_executor
          schedule: my_job_schedule
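In the example configuration, the job refers to an executor and a schedule by name, and the cron expression "0 0 * * *" means every day at midnight. The sketch below shows one way to sanity-check those cross-references before launching anything. It is an illustration only: `check_dagster_config` is a hypothetical helper, not something the plugin provides, and it assumes that the `executor` and `schedule` keys sit directly under each job. The five-field cron check is also a simplification; Dagster may accept other forms.

```python
def check_dagster_config(config: dict) -> list[str]:
    """Return a list of problems found in a dagster.yml-style mapping (hypothetical helper)."""
    problems = []
    schedules = config.get("schedules", {})
    executors = config.get("executors", {})

    for name, schedule in schedules.items():
        cron = schedule.get("cron_schedule", "")
        # A standard cron expression has five fields:
        # minute, hour, day of month, month, day of week.
        if len(cron.split()) != 5:
            problems.append(f"schedule {name!r}: expected 5 cron fields, got {cron!r}")

    for name, job in config.get("jobs", {}).items():
        executor = job.get("executor")
        if executor is not None and executor not in executors:
            problems.append(f"job {name!r}: unknown executor {executor!r}")
        schedule = job.get("schedule")
        if schedule is not None and schedule not in schedules:
            problems.append(f"job {name!r}: unknown schedule {schedule!r}")
    return problems


# The example configuration, written as the mapping a YAML loader would produce.
config = {
    "schedules": {"my_job_schedule": {"cron_schedule": "0 0 * * *"}},
    "executors": {"my_executor": {"retries": 3}},
    "jobs": {
        "my_job": {
            "pipeline": {"pipeline_name": "__default__"},
            "executor": "my_executor",
            "schedule": "my_job_schedule",
        }
    },
}
problems = check_dagster_config(config)
```

For the example configuration `problems` is empty; renaming `my_executor` in only one place would produce an "unknown executor" entry.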
- Launch the Dagster UI:
  Start the Dagster UI to monitor and manage your pipelines using the following command:

      kedro dagster dev
The Kedro-Dagster documentation will be available soon. Stay tuned!
We welcome all kinds of contributions. Check out our [guide to contributing](https://github.com/kedro-org/kedro/wiki/Contribute-to-Kedro).
There is a growing community around the Kedro project and we encourage you to become part of it. You can ask and answer technical questions on the Kedro Slack and bookmark the Linen archive of past discussions. For questions related specifically to Kedro-Dagster, you can also open a discussion.
This project is licensed under the terms of the Apache 2.0 License.
This plugin is inspired by existing Kedro plugins such as the official Kedro plugins, kedro-kubeflow, and kedro-mlflow.