Add an additional Jinja2-Template #27
Comments
Hi @fdroessler, sorry for the late reply. I see there are a couple of issues mentioned here.
I think this could be useful, but I am not sure what's the best practice with the latest Airflow. For example, I found that they release a |
Hi @noklam apologies from my side as well ;) However, is there any argument against providing templates using different Airflow "APIs"? From what I have seen, while TaskFlowAPI is probably the way-to-go it is not necessarily common yet. Also I have seen a variety of pre-TaskFlowAPI deployments which would not be able to support that workflow but would be able to support |
@fdroessler I think for this we really need the input from the community, I am definitely not an Airflow expert and I don't understand the advantage of all the APIs. As long as we state clearly what are the use cases/advantages of the different operators, I think it's reasonable to have more API Operator supported. Does the The one thing that I think we need more consideration is that |
Description
The current kedro-airflow plugin requires the kedro project to be installed on the Airflow worker. This becomes a challenge when changes to the kedro project need to be rolled out to the Airflow worker. An alternative to the current approach, which extends BaseOperator to produce a KedroOperator, would be to use the PythonVirtualenvOperator. In that setup, the worker could install kedro project updates from a local/official PyPI server whenever an update is released. The downside is the additional overhead of creating the virtualenv.
Context
I have run into issues with the current setup: I am getting OOM errors when running kedro DAGs that use the KedroOperator (the investigation into why is still ongoing). In addition, and probably due to my inexperience, I struggled to roll out kedro project updates to the Airflow workers without manually updating each worker or going through what seems like a lot of hassle.
Possible Implementation
An alternative Jinja2 template can be provided that closely follows the logic of the existing one but replaces the KedroOperator with a PythonVirtualenvOperator setup.
Basic elements of the setup can be found below:
Kedro function to be executed in the virtual environment:
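A minimal sketch of what such a function could look like, assuming a kedro version that exposes `configure_project` and `KedroSession` (the exact `KedroSession.create` arguments differ between kedro releases). The function name `run_kedro_node` and its parameters are hypothetical and chosen for illustration only:

```python
def run_kedro_node(package_name: str, pipeline_name: str, node_name: str,
                   project_path: str, env: str) -> None:
    """Run a single kedro node; intended to execute inside the task's virtualenv."""
    # The imports sit inside the function body because PythonVirtualenvOperator
    # runs the callable in a separate interpreter, so module-level imports
    # from the DAG file are not available there.
    from kedro.framework.project import configure_project
    from kedro.framework.session import KedroSession

    # Assumption: the kedro project package is installed in the virtualenv.
    configure_project(package_name)
    with KedroSession.create(project_path=project_path, env=env) as session:
        session.run(pipeline_name, node_names=[node_name])
```

Keeping every argument a plain string makes the values easy to pass through the operator's `op_kwargs`.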
Changes to the task generating loop:
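One way the task-generating loop might change is sketched below, under the assumption that the project is published as an installable package. The helpers `to_task_id`, `virtualenv_task_kwargs` and `build_tasks` are hypothetical names, not part of kedro-airflow. The first two only assemble arguments, and `build_tasks` passes them to Airflow's `PythonVirtualenvOperator`:

```python
import re
from typing import Any, Callable, Dict, Iterable, Optional


def to_task_id(node_name: str) -> str:
    """Replace characters Airflow does not accept in a task_id with dashes."""
    return re.sub(r"[^0-9A-Za-z_.-]+", "-", node_name).strip("-")


def virtualenv_task_kwargs(node_name: str, python_callable: Callable[..., Any],
                           package_name: str, project_path: str,
                           pipeline_name: str = "__default__", env: str = "local",
                           requirement: Optional[str] = None) -> Dict[str, Any]:
    """Build the keyword arguments for one PythonVirtualenvOperator task."""
    return {
        "task_id": to_task_id(node_name),
        "python_callable": python_callable,
        # Assumption: the package (optionally pinned, e.g. "my_project==0.2.0")
        # can be installed with pip from a local or official PyPI server.
        "requirements": [requirement or package_name],
        "system_site_packages": False,
        "op_kwargs": {
            "package_name": package_name,
            "pipeline_name": pipeline_name,
            "node_name": node_name,
            "project_path": project_path,
            "env": env,
        },
    }


def build_tasks(dag: Any, node_names: Iterable[str],
                python_callable: Callable[..., Any], **project: Any) -> Dict[str, Any]:
    """Create one virtualenv task per kedro node (requires Airflow 2.x)."""
    from airflow.operators.python import PythonVirtualenvOperator

    return {
        name: PythonVirtualenvOperator(
            dag=dag, **virtualenv_task_kwargs(name, python_callable, **project)
        )
        for name in node_names
    }
```

Task dependencies would still be wired up from the pipeline's node dependencies as in the existing template; only the operator construction differs.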
Would be happy to receive feedback on this and if it is deemed useful I am happy to create a PR.