-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert DagFileProcessor.execute_callbacks to Internal API #28900
Conversation
…e_context is not yet converted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general approach looks good.
beed957
to
2b9014b
Compare
airflow/models/taskinstance.py
Outdated
""" | ||
Stop non-teardown tasks in dag. | ||
|
||
:meta private: | ||
""" | ||
tis = self.dag_run.get_task_instances(session=session) | ||
assert task_instance.dag_run is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only use asserts in in special circumstances (TYPE_CHECKING for example) - I suggest to change it to
if not task_instance_.dag_run:
raise ValueError("task_instance must have dag_run set")
or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trusting that the code extracted to methods has been unchanged (hard to check).
Congrats on getting this one merged @vincbeck, it was a heavy lift! 🥳 |
@provide_session | ||
def handle_callback(self, dagrun, success=True, reason=None, session=NEW_SESSION): | ||
def fetch_callback( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincbeck @potiuk @mhenc @uranusjr just curious what your thoughts are on backward compatibility on this one. technically it (handle_callback
) was part of a public class and not marked :meta private:
in the docstring so.... technically it was probably public and therefore subject to backcompat.
that said, considering it as part of the public API also seems absurd. do you think we should put in our "public API" some "cover your ass" type of language that sort of expresses that .... methods which are clearly not for public use, even if not marked internal, are internal. maybe we could add some language that explains what that means. like methods not related to the dag authoring interface etc -- not sure.
but in any case, we should mark fetch_callback
as private by either prefixing with underscore or adding :meta private:
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks backward compatible to me, Github makes it confusing but handle_callback
is not removed, look at line 1423
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree it's not "disappearing" now so not an issue.
But I also would like to have a bit of philosophical rant here.
I thnk we are approaching backwards compatibility here as "0 / 1". And I keep on repeating it's totally wrong. Hyrum's right is totallly right here: https://www.hyrumslaw.com/ - in sufficiantly complex system ANY change is braaking. You would have to stop changing things to stop things from breaking. There is no way around it. And we cannot describe super precise rules about it upfront. We cannot really say:
- any time we change this and that, we are technically breaking things
That would make us slow and very shortly we would lose any flexibliity. I think we will never achieve 100% correctness and sets of rules that will clearly say for each change "breaking/not breaking" in automated way. We can approach it and get closer to it by adding more and more rules and description - but we will at most asymptotically get closer to the certainty - never achieving it and at some point in time, adding more and more rules will make it more not less confusing and contradicting. So we should strive for 'good enough" and "pretty correct" set of rules - but also accept the fact that there will be exceptions and room for interpretation and even for arbitrary decisions that others (including some of our users) might not agree with.
As I see it (and what I think SemVer also explains) Backwards Compatiility ad Semver is NOT about following certain rules "adding a parameter is breaking, renamig any method no marked as private is breaking". IMHO this is about three things:
-
what is the INTENTION we had when we created the code - were we INTENDING to make it relied on? Was described and explained that users were supposed to rely on it ? Or was their reliance on certain methods and fields accidental and the fact that method was there was just "assumed" they can rely on it?
-
How likely it is tha many of our users made such assumptions if it was not clearly documented, and explained - or even if they could take the impression it was, how likely it is we are breaking something sersious.
-
How difficult it is to recover for our users. If the system is failing immediately and what the user needs to do is flipping the flag to bring back the old behaviour - is it breaking or not? If the system is not failing but the change in behaviour is not persistent nor dangerous and the user might bring it back with a flip of a flag - is it breaking or not?
And yes - it means we will sometimes have to make arbitrary decisions based on gut feelings not data nor precise rules followed. And yes - it means that sometimes there will be individual angry users who will tells us "but you promised backwards compatibiliity - bring it back NOW", and there will be cases where we disageree between ourselves - maintainers - what is backwards compatible and what is. not and we will have to vote on it eventually. And yes - sometimes it will mean we will take a wrong decisiion and break too many workflows of too many users and we will have to quickly release a bugfix that will revert it.
All this. And more. And we will remains humans making sometimes flawed and imperfect decisions based on our insticts and intentions and gut feelings not data and strict rules - rather than robots following precise rules and prescribed algorithms. I think this is why we - as maintainers are still needed in the project - to make such decisions.
Sorry If I've gotten a bit too philosophical, but I do think we are quite too often trying to make things crystal clear and be free of making the decisions so that we don't have to well, make decisions.
It's needed in many cases - that's why I am also adding a lot of rules on how we approach things - for example provider's maintenance lifecycale. But I treat it more as communication tool and write down our intentions and where possible leave enough room for interpretation and decision making.
Where we can - yes we should make clear rule. But when we can't we should state our intentions, communicate general principles, and simply try - as best as we can - to fulfill those stated intentions (but we should attempt to communicate those intentions so that our users are aware of them).
if task_instance.next_method: | ||
if task_instance.next_method: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hello, coming long after the fact, but I think that during the move, this check was duplicated
# then we need to pick the right method to come back to, otherwise | ||
# we go for the default execute | ||
execute_callable_kwargs = {} | ||
if self.next_method: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is where the code from my comment above is coming from, there is only one check here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. No harm but would be great to fix it. Care for opening PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fundamentally what's going on here is we need a TaskInstance object instead of a Row object when sending over the wire in RPC call. But the full story on this one is actually somewhat complicated. It was back in 2.2.0 in apache#25312 when we converted to query with the column attrs instead of the TI object (apache#28900 only refactored this logic into a function). The reason was to avoid locking the dag_run table since TI newly had a dag_run relationship attr. Now, this causes a problem with AIP-44 because the RPC api does not know how to serialize a Row object. This PR switches back to querying a TaskInstance object, but avoids locking dag_run by using lazy_load option. Meanwhile, since try_number is a horrible attribute (which gives you a different answer depending on the state), we have to switch it back to look at the underlying private attr instead of the public accesor.
Fundamentally what's going on here is we need a TaskInstance object instead of a Row object when sending over the wire in RPC call. But the full story on this one is actually somewhat complicated. It was back in 2.2.0 in #25312 when we converted to query with the column attrs instead of the TI object (#28900 only refactored this logic into a function). The reason was to avoid locking the dag_run table since TI newly had a dag_run relationship attr. Now, this causes a problem with AIP-44 because the RPC api does not know how to serialize a Row object. This PR switches back to querying a TaskInstance object, but avoids locking dag_run by using lazy_load option. Meanwhile, since try_number is a horrible attribute (which gives you a different answer depending on the state), we have to switch it back to look at the underlying private attr instead of the public accesor.
Closes #28269
Closes #28784
Add the following methods to the internal API component: