-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support DAGS folder being in different location on scheduler and runners #16860
Support DAGS folder being in different location on scheduler and runners #16860
Conversation
mark_success=False, | ||
ignore_all_deps=False, | ||
ignore_depends_on_past=False, | ||
ignore_task_deps=False, | ||
ignore_ti_state=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the defaults, we don't need to pass them again
airflow/models/dag.py
Outdated
@@ -2379,6 +2413,8 @@ class DagModel(Base): | |||
Index('idx_next_dagrun_create_after', next_dagrun_create_after, unique=False), | |||
) | |||
|
|||
parent_dag = relationship("DagModel", remote_side=[dag_id]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relationship was added for "Duck-type compatibility" between a DAG and a DagModel in the case of subdags -- both now have a "parent_dag" attribute.
|
||
@property | ||
def relative_fileloc(self) -> str: | ||
"""File location of the importable dag 'file' relative to the configured DAGs folder.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will this field behave for example DAGs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case it would be /usr/lib/python3.7/.../airflow/example_dags/example_bash_operator.py
-- so not relative at all.
I'll update/expand the docstring to say how it deals with DAGs outside of the dags folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we are also able to handle examples DAGs by adding a suffix? This will allow us to install workers on shared machines without any problems
For samples:
[exaample_dags_folder]/example_bash_operator.py
=> /usr/lib/python3.7/.../airflow/example_dags/example_bash_operator.py
[dags_folder]/example_bash_operator.py
=> ~/home/airflow/dags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change would be better done along with something like https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-20+DAG+manifest, so I've not done this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we shouldn’t call this relative fileloc, but just fileloc…? Not sure about this.
a72eb83
to
6a7d7ab
Compare
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
'template_searchpath', | ||
'last_loaded', | ||
} | ||
|
||
__serialized_fields: Optional[FrozenSet[str]] = None | ||
|
||
fileloc: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me I was wondering whether fileloc
is guaranteed to be absolute or not and had to trace a lot of code. Maybe it’s worthwhile to add this to the docstring (and the one on DagModel).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good to me in general
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@ashb - can you rebase please ? I think most of the issues with CI stability are solved |
be08c89
to
af12ddd
Compare
There has been some vestigial support for this concept in Airflow for a while (all the CLI command already turn the literal `DAGS_FOLDER` in to the real value of the DAGS folder when loading dags), but sometime around 1.10.1-1.10.3 it got fully broken and the scheduler only ever passed full paths to DAG files. This PR brings back this behaviour
ca640d3
to
0da2b8c
Compare
Looks like transient MSQQL errors :(
|
Linking #16025 (comment) |
There has been some vestigial support for this concept in Airflow for a
while (all the CLI command already turn the literal
DAGS_FOLDER
in tothe real value of the DAGS folder when loading dags), but sometime
around 1.10.1-1.10.3 it got fully broken and the scheduler only ever
passed full paths to DAG files.
This PR brings back this behaviour
Closes #8061
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.