-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "run_id" k8s and elasticsearch compatibility with Airflow 2.1 #22385
Conversation
I need to re-release the providers due to bug in install_requires found (#22380 ) but before that, I would like to also get that one fixed (it was raised by one of the users in https://apache-airflow.slack.com/archives/CCV3FV9KL/p1647519576375459 |
Looks like it works :) |
080ff01
to
5ecb889
Compare
764c8f7
to
413cd7b
Compare
400c696
to
bcf7dab
Compare
The execution_date -> run_id change (apache#21960) attempted to make it Airflow 2.1 backwards-compatible, but the problem is that in Airflo2 2.1 retrieving `run_id` attribute of TaskInstance throws AttributeError rather than returns None. It turns out that when you have a field defined in an ORM model, it will never throw AtributeError (even if you delete the attribute it will return None. Accesising `run_id` with getattr raises AttributeError in Airflow 2.1 (because there TaskInstance has no run_id defined). This PR adds automated pre-commit to check if other providers have not suffered (and will not suffer) the same problem.
bcf7dab
to
89a991d
Compare
I also removed the unit test (it's not really needed now and it was pretty "convoluted" anyway). |
All green now. |
@@ -129,7 +129,7 @@ def _render_log_id(self, ti: TaskInstance, try_number: int) -> str: | |||
return self.log_id_template.format( | |||
dag_id=ti.dag_id, | |||
task_id=ti.task_id, | |||
run_id=ti.run_id, | |||
run_id=getattr(ti, "run_id", ""), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work with an empty string as a default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only formatting hte log entry so I "guess" it produces proper log entry. But good point - I do not know if the es logs will be properly parsed by ES engine then (though having unparseable logs is still infintely better than crashing airflow in this case :D)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jedcunningham @uranusjr 🙏 - I'd love to re-release providers asap as the ones we have install gitpython and wheel due to my sloppines :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @jedcunningham @uranusjr @kaxil -> I'd really love to merge it and re-release the providers soon. The "extra packages" in the last list need to go away and the cncf.kuberbnetes starts to be a problem for users of Airflow 2.1 (likely this one https://apache-airflow.slack.com/archives/CCV3FV9KL/p1647936437392429)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved the PR but I still don't know about this and haven't dug deep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well. It certainly won't be worse than crashing when you try to send an elasticsearch log :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what 2.1 users would experience if they run this provider version.
Anyone ? |
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
…ion_date As part of PR apache/airflow#22385 released in Airflow 2.3.0, TaskInstance now mandatorily needs to have either of run_id or execution_date. Add the same to the failing test to make it working.
The execution_date -> run_id change (#21960) attempted to make it
Airflow 2.1 backwards-compatible, but the problem is that in
Airflo2 2.1 retrieving
run_id
attribute of TaskInstance throwsAttributeError rather than returns None. It turns out that when
you have a field defined in an ORM model, it will never throw
AtributeError (even if you delete the attribute it will return
None.
Accesising
run_id
with getattr raisesAttributeError in Airflow 2.1 (because there TaskInstance has no
run_id defined).
This PR adds automated pre-commit to check if other providers
have not suffered (and will not suffer) the same problem.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.