Deployment support/Kedro deployer #2058

merelcht · 2022-11-23T15:54:35Z

Description

We want to offer users better support with deploying Kedro.

Production deployment is important to drive value; we're at least a couple years past when people were happy just running things locally
A large percentage of "hard" questions we get are related to deployment
- TODO: Consider quantifying this percentage and providing some examples
The development team is also suffering
- The existing deployment guides have a lot of challenges and are hard to maintain
- Tribal knowledge about the different platforms and integrations (a few people wrote the deployment guides and plugins years ago)
- Have to know too much (nobody can be an expert at deploying to so many platforms)

Implementation ideas

Short-term
- Support at least 1 "production" deployment archetype/stack out of the box
- Create a single set of best practices for deployment
  - There should be a clear mapping from "best-practice Kedro pipeline" to "best-practice Kedro deployment"
Medium-term
- Integrate with a universal deployment backend that opens the gateway to standardized deployment to more platforms (e.g. https://github.com/couler-proj/couler)

Questions

datajoely · 2022-11-24T10:30:53Z

So I love this and think we should really emphasise the power of modular pipelines here as a standout feature of Kedro. Choosing the granularity of what to translate is critical here.

In most cases a Kedro Node should not (necessarily) == a 'task' in an orchestrator.
I think our concept of a modular pipeline, a.k.a. the 'super node' in Viz, is in most cases what should be translated to a deployable task.
The user should be able to define this level of granularity too, because they may want to decide on a case by case basis.
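As a rough illustration of the granularity point above, here is a minimal sketch that collapses nodes into orchestrator tasks by namespace, with a `depth` knob the user can tune. The `Node` class, `task_key`, and `group_into_tasks` are hypothetical stand-ins written for this example, not Kedro's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Kedro node (not kedro.pipeline.node.Node)."""
    name: str
    namespace: Optional[str]
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def task_key(node, depth=1):
    """Name of the task a node belongs to.

    depth=1 groups by top-level namespace, depth=2 by the first two
    levels, and so on. A node without a namespace becomes its own task.
    """
    if not node.namespace:
        return node.name
    return ".".join(node.namespace.split(".")[:depth])


def group_into_tasks(nodes, depth=1):
    """Return (task -> node names, task -> set of upstream tasks)."""
    members = defaultdict(list)
    producer = {}  # dataset name -> task that produces it
    for node in nodes:
        key = task_key(node, depth)
        members[key].append(node.name)
        for out in node.outputs:
            producer[out] = key

    upstream = {key: set() for key in members}
    for node in nodes:
        key = task_key(node, depth)
        for inp in node.inputs:
            src = producer.get(inp)
            if src is not None and src != key:
                upstream[key].add(src)
    return dict(members), upstream


nodes = [
    Node("clean", "preprocessing", ("raw",), ("cleaned",)),
    Node("featurise", "preprocessing", ("cleaned",), ("features",)),
    Node("train", "modelling", ("features",), ("model",)),
    Node("report", None, ("model",), ("summary",)),
]
members, upstream = group_into_tasks(nodes)
print(members)
print(upstream)
```

Here the two `preprocessing` nodes become one task, and the task-level dependencies are derived from the datasets that cross task boundaries.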

Couler looks v cool btw.

noklam · 2023-06-21T13:28:45Z

https://www.linen.dev/s/kedro/t/12647848/hi-kedro-community-slightly-smiling-face-i-have-a-question-r#45fa4774-5ebd-4826-9c10-47fd95d1bebd

This has already been done in some plugins. We should come up with a flexible way for Kedro core to create the mapping for different targets.

astrojuanlu · 2023-06-21T14:38:59Z

I guess #143 is tangentially related?

noklam · 2023-06-22T09:52:25Z

It is! I am thinking more about deployment to platforms/orchestrators here, but there are definitely cases of users deploying a Kedro pipeline to an endpoint.

Usage of Kedro pipeline with web services & Deployment #1846 is the ticket to research and document these use cases. Currently, deployment to platforms/orchestrators has a higher priority.

noklam · 2023-10-25T13:37:35Z

Not sure where this should go, so I'm just putting it here as an amateur thought. Inspired by deepyaman, it seems fairly easy to convert a Kedro pipeline to Metaflow: it's just variable assignment between steps (nodes, in Kedro's terms).

The benefit of using Metaflow is that it lets you abstract infrastructure with a decorator like @kubernetes(memory=64000). (Obviously the tricky part is getting the cluster set up properly, but from a data scientist's perspective that doesn't matter, as they simply want more compute for a specific task.) This could potentially be integrated smoothly with Kedro's tags to denote the required infrastructure.

This is not to say that Kedro is going to integrate with Metaflow, only to show that it is possible.

from metaflow import FlowSpec, step

class LinearFlow(FlowSpec):

    @step
    def start(self):
        self.my_var = 'hello world'
        self.next(self.a)

    @step
    def a(self):
        print('the data artifact is: %s' % self.my_var)
        self.next(self.end)

    @step
    def end(self):
        print('the data artifact is still: %s' % self.my_var)

if __name__ == '__main__':
    LinearFlow()
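To make the tags idea above a little more concrete, here is a minimal sketch of reading infrastructure requirements from node tags. The `infra:<key>=<value>` tag convention and the `resources_from_tags` helper are assumptions made up for this example; neither Kedro nor Metaflow defines them.

```python
def resources_from_tags(tags):
    """Parse tags of the (hypothetical) form 'infra:<key>=<value>' into a dict.

    Tags without the 'infra:' prefix are ignored, so ordinary Kedro tags
    can coexist with infrastructure hints.
    """
    prefix = "infra:"
    resources = {}
    for tag in sorted(tags):
        if not tag.startswith(prefix):
            continue
        key, sep, value = tag[len(prefix):].partition("=")
        if not sep or not key:
            raise ValueError(f"malformed infra tag: {tag!r}")
        # Keep plain digits as integers, everything else as strings.
        resources[key] = int(value) if value.isdigit() else value
    return resources


node_tags = {"training", "infra:memory=64000", "infra:cpu=8"}
print(resources_from_tags(node_tags))
```

A converter could then forward the resulting dictionary as keyword arguments to a decorator such as the @kubernetes(memory=64000) example above.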

astrojuanlu · 2023-11-24T14:52:22Z

Related: #3094

That issue contains a research synthesis; we can use this issue to collect the plan.

astrojuanlu · 2024-05-16T13:49:48Z

Found in #770 from @idanov:

kedro deploy airflow

(found after listening to @ankatiyar's Tech Design on kedro-airflow and reading kedro-org/kedro-plugins#25 (comment))

astrojuanlu · 2024-05-21T09:19:57Z

As we learn more about how to deploy to Airflow (see @DimedS's #3860) it becomes more apparent that the list of steps can become quite involved. This is not new, as the Databricks workflows also require some care.

We have plenty of evidence that this is an important pain point that affects lots of users.

The main questions are:

- Is there anything that can be done in general to make these processes easier, simpler, leaner? (So that we don't rush to automate something that should have been easier in the first place)
- Is there a way these processes can be partly automated for specific platforms of choice? (Assuming that there's no general solution for all platforms, given the low traction of couler)
  - And which platforms should we choose, and by which criteria?

noklam · 2024-05-21T11:40:42Z

We should really treat Kedro as a pipeline DSL. Most orchestrators work with a DAG, so these are all generic features. For example:

- Grouped memory nodes (as implemented for Airflow) are not unique to Airflow

So there is definitely a theme around DAG manipulation. Is the current Pipeline API flexible enough? We had to implement a separate DFS to support the grouped memory nodes feature for Airflow. We could also extend the grouped memory nodes feature to, e.g., running specific nodes on a GPU machine, a Spark cluster, a different Python environment, or a different machine type.

Where would be the best place to hold this metadata? Maybe tags?

There is some early vision in #770 and #3094, done by @datajoely last year.
It would also be really cool to have some kind of DryRunner where one can orchestrate multiple Kedro sessions in one go. This would allow one to catch errors like a "memory dataset" that doesn't exist in an "every Kedro node as an Airflow node" situation.
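For illustration, the grouped memory nodes idea can be expressed as a generic graph operation: nodes that exchange data through a dataset that is not persisted have to run in the same task. The sketch below uses union-find rather than the DFS mentioned above, and it works on plain dictionaries rather than Kedro's Pipeline API. It assumes that any dataset missing from `persisted` lives only in memory.

```python
def group_memory_nodes(nodes, persisted):
    """Group nodes that are connected through in-memory datasets.

    nodes:     dict of node name -> (inputs, outputs)
    persisted: set of dataset names saved to durable storage
    Returns a sorted list of groups, each a sorted list of node names.
    """
    parent = {name: name for name in nodes}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # in-memory dataset -> first node seen touching it
    for name, (inputs, outputs) in nodes.items():
        for dataset in (*inputs, *outputs):
            if dataset in persisted:
                continue
            if dataset in first_seen:
                union(name, first_seen[dataset])
            else:
                first_seen[dataset] = name

    groups = {}
    for name in nodes:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


nodes = {
    "clean": (["raw"], ["cleaned"]),
    "featurise": (["cleaned"], ["features"]),
    "train": (["features"], ["model"]),
    "evaluate": (["model", "features"], ["metrics"]),
}
persisted = {"raw", "features", "model", "metrics"}
print(group_memory_nodes(nodes, persisted))
```

Only `cleaned` is kept in memory here, so `clean` and `featurise` end up in one group while the other nodes stay separate. Nothing in the function is specific to Airflow.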

datajoely · 2024-05-21T13:11:00Z

Yeah - my push here is not to focus too much on Airflow and really address the fundamental problems which make Kedro slightly awkward in these situations

astrojuanlu · 2024-06-12T08:11:56Z

Potentially interesting https://github.com/run-house/runhouse (via https://www.run.house/blog/lean-data-automation-a-principal-components-approach)

astrojuanlu · 2024-06-23T18:55:54Z

Possible inspiration: Apache Beam https://beam.apache.org/get-started/beam-overview/

Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Flink, Apache Spark, and Google Cloud Dataflow.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_type=LOOPBACK"
])
with beam.Pipeline(options=options) as p:
    ...

There's a difference, though, between deploying and running; this is more similar to our Dask runner https://docs.kedro.org/en/stable/deployment/dask.html

But maybe it means that we need a clearer mental model too. In my head, when we deploy a Kedro project, Kedro is no longer responsible for the execution, whereas writing a runner implies that Kedro is in fact driving the execution and acting as a poor-man's orchestrator.
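One way to picture the "deploy" side of that mental model is a compile step: the pipeline is turned into a static, serialisable task description, and some external system takes over from there. The sketch below is only an illustration under that assumption. `compile_to_spec` and the spec layout are invented for this example and do not correspond to any Kedro or orchestrator API.

```python
import json
from graphlib import TopologicalSorter


def compile_to_spec(nodes):
    """Turn a node graph into a static task spec, without running anything.

    nodes: dict of node name -> (inputs, outputs)
    """
    producer = {
        out: name
        for name, (_, outputs) in nodes.items()
        for out in outputs
    }
    # For each node, the nodes whose outputs it consumes.
    depends_on = {
        name: sorted({producer[i] for i in inputs if i in producer})
        for name, (inputs, _) in nodes.items()
    }
    # static_order() yields every node after all of its predecessors.
    order = list(TopologicalSorter(depends_on).static_order())
    return {"tasks": [{"name": n, "depends_on": depends_on[n]} for n in order]}


nodes = {
    "train": (["cleaned"], ["model"]),
    "clean": (["raw"], ["cleaned"]),
    "report": (["model"], ["summary"]),
}
spec = compile_to_spec(nodes)
print(json.dumps(spec, indent=2))
```

A runner, by contrast, would walk the same order and call each node's function itself, which is what makes it behave like an orchestrator.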

datajoely · 2024-06-24T10:33:11Z

I guess there is a philosophical question (very much related to granularity) on how you express which chunks of pipeline(s) get executed in different distributed systems.

The execution plan of a Kedro DAG is resolved at runtime, and we do not have a standardised way of:

- Declaring a group of nodes which must be run together
- Saying what infrastructure they should run on
- Validating if / ensuring data is passed between groups through persistent storage.

I'm not against the context manager approach for doing this in principle - but I think it speaks to the more fundamental problem of how some implicit elements of Kedro (which increase development velocity) leave a fair amount of unmanaged complexity when it gets to this point in the process.
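To illustrate what a standardised declaration covering the three points above might look like, here is a minimal sketch. `DeploymentGroup` and `validate_groups` are hypothetical names invented for this example, and the `infrastructure` field is just a free-form label. Nothing here is an existing Kedro feature.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DeploymentGroup:
    """Hypothetical declaration: which nodes run together, and where."""
    name: str
    nodes: Tuple[str, ...]
    infrastructure: str  # free-form label such as "cpu", "gpu" or "spark"


def validate_groups(groups, node_io, persisted):
    """Return a list of problems; an empty list means the plan is consistent.

    node_io:   dict of node name -> (inputs, outputs)
    persisted: set of dataset names saved to durable storage
    """
    problems = []
    owner = {}  # node name -> group name
    for group in groups:
        for node in group.nodes:
            if node not in node_io:
                problems.append(f"{group.name!r} lists unknown node {node!r}")
            elif node in owner:
                problems.append(
                    f"{node!r} is in both {owner[node]!r} and {group.name!r}"
                )
            else:
                owner[node] = group.name
    for node in node_io:
        if node not in owner:
            problems.append(f"{node!r} is not assigned to any group")

    producer = {out: n for n, (_, outs) in node_io.items() for out in outs}
    for node, (inputs, _) in node_io.items():
        for dataset in inputs:
            src = producer.get(dataset)
            if src is None or src not in owner or node not in owner:
                continue
            # Data that crosses a group boundary must survive the hand-off.
            if owner[src] != owner[node] and dataset not in persisted:
                problems.append(
                    f"{dataset!r} crosses from {owner[src]!r} to "
                    f"{owner[node]!r} but is not persisted"
                )
    return problems


node_io = {
    "clean": (["raw"], ["cleaned"]),
    "train": (["cleaned"], ["model"]),
}
groups = [
    DeploymentGroup("prep", ("clean",), "cpu"),
    DeploymentGroup("fit", ("train",), "gpu"),
]
print(validate_groups(groups, node_io, persisted={"raw", "model"}))
print(validate_groups(groups, node_io, persisted={"raw", "cleaned", "model"}))
```

Because the check runs before anything is deployed, it makes explicit the kind of implicit hand-off through memory that otherwise only fails once the groups run on separate machines.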

noklam mentioned this issue Jun 21, 2023

Create docs for best practices in Kedro pipeline deployment #2712

Open

takikadiri mentioned this issue Sep 18, 2023

Allow injecting data into a KedroSession run #2169

Open

astrojuanlu mentioned this issue Apr 15, 2024

kedro-airflow: revise Airflow deployment manual kedro-org/kedro-plugins#605

Closed

astrojuanlu mentioned this issue May 21, 2024

Update Airflow AWS MWAA deployment docs #3860

Merged
