Thanks a lot for reporting this issue, @StephenOTT! Would you be willing to contribute a fix? Once this is fixed, we should add an example DAG that exercises this case so we can avoid regressions.
I implemented a fix in #307. It has asymptotic complexity O(N + D), where N is the total number of tasks and D is the total number of dependencies, which seems acceptable.
The issue described affected any YAML files that declared upstream tasks after downstream tasks, regardless of using dynamic task mapping.
Any YAML files that declare upstream tasks after downstream tasks,
regardless of using dynamic task mapping, would fail.
Example of a DAG that would fail:
```
test_expand:
  default_args:
    owner: "custom_owner"
    start_date: 2 days
  description: "test expand"
  schedule_interval: "0 3 * * *"
  default_view: "graph"
  tasks:
    process:
      operator: airflow.operators.python_operator.PythonOperator
      python_callable_name: expand_task
      python_callable_file: $CONFIG_ROOT_DIR/expand_tasks.py
      partial:
        op_kwargs:
          test_id: "test"
      expand:
        op_args:
          request.output
      dependencies: [request]
    request:
      operator: airflow.operators.python.PythonOperator
      python_callable_name: example_task_mapping
      python_callable_file: $CONFIG_ROOT_DIR/expand_tasks.py
```
In this example, the upstream (parent) task "request" is defined after
the downstream (child) task "process". Before this change, this DAG
would fail.
I implemented a solution that uses Kahn's algorithm to sort the tasks
topologically before they are created:
https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm
It has asymptotic complexity O(N + D), where N is the total number of
tasks, and D is the total number of dependencies. This complexity seems
acceptable.
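For illustration, here is a minimal sketch of Kahn's algorithm applied to a
task dictionary shaped like the `tasks` section of the YAML. This is not the
code from the PR: the function name `topological_sort` and the error messages
are assumptions made for this example. Only the `dependencies` key comes from
the YAML format.

```python
from collections import deque


def topological_sort(tasks):
    """Return task names ordered so each task follows its dependencies.

    `tasks` maps a task name to its config dict. The optional
    "dependencies" key lists the names of upstream tasks.
    (Hypothetical helper, not the dag-factory implementation.)
    """
    # Number of upstream tasks each task is still waiting on.
    in_degree = {name: 0 for name in tasks}
    # For each task, the tasks that list it as a dependency.
    downstream = {name: [] for name in tasks}

    for name, conf in tasks.items():
        for upstream in conf.get("dependencies", []):
            if upstream not in tasks:
                raise ValueError(f"{name!r} depends on unknown task {upstream!r}")
            downstream[upstream].append(name)
            in_degree[name] += 1

    # Start with the tasks that have no upstream dependencies.
    ready = deque(name for name, degree in in_degree.items() if degree == 0)
    ordered = []
    while ready:
        current = ready.popleft()
        ordered.append(current)
        # Removing `current` may unblock its downstream tasks.
        for child in downstream[current]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                ready.append(child)

    # Any task left unprocessed is part of a dependency cycle.
    if len(ordered) != len(tasks):
        raise ValueError("Cycle detected in task dependencies")
    return ordered


# "process" is declared before its upstream task "request".
tasks = {
    "process": {"dependencies": ["request"]},
    "request": {},
}
print(topological_sort(tasks))  # ['request', 'process']
```

Each task enters the queue once and each dependency edge is visited once,
which is where the O(N + D) bound comes from.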
An alternative to the current approach would be to create all the tasks
without dependencies as a starting point and add the dependencies once
all tasks were made - similar to what we did in
https://github.com/astronomer/astronomer-cosmos. However, this approach
would require a bigger refactor of the DAG factory and may have issues
with dynamic task mapping.
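The alternative can be sketched as two passes over the same task dictionary.
The `Task` class below is a hypothetical stand-in written for this example,
not an Airflow or Cosmos class, and `build_tasks` is likewise an assumed name.
The sketch does not model dynamic task mapping, which is the part flagged
above as a potential complication.

```python
class Task:
    """Hypothetical stand-in for an operator; not an Airflow class."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream_task_ids = set()

    def set_upstream(self, other):
        # Record that `other` must run before this task.
        self.upstream_task_ids.add(other.task_id)


def build_tasks(tasks_conf):
    """Create all tasks first, then wire dependencies in a second pass."""
    # Pass 1: create every task without dependencies, so declaration
    # order in the config does not matter.
    created = {name: Task(name) for name in tasks_conf}

    # Pass 2: every task now exists, so each dependency can be resolved.
    for name, conf in tasks_conf.items():
        for upstream in conf.get("dependencies", []):
            created[name].set_upstream(created[upstream])
    return created


built = build_tasks({
    "process": {"dependencies": ["request"]},
    "request": {},
})
print(built["process"].upstream_task_ids)  # {'request'}
```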
Closes: #225
If you have something like
Then when dagbuilder parses the task/dict, it does not properly change the expand arg to a referenced XCom, and it remains as a str.