-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIP-51] Executors vending CLI commands #29055
[AIP-51] Executors vending CLI commands #29055
Conversation
31e4b9a
to
689cd36
Compare
airflow/cli/cli_parser.py
Outdated
|
||
airflow_commands = core_commands | ||
for executor in ExecutorLoader.import_all_executor_classes(silent=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implication of moving the executor cli commands to the executor modules is that if the executor module fails to load, then its cli commands won't be included in Help output. I think this is actually a fair behaviour for third party executors and even the core executors that are built into airflow. However, it is not the current behavior we have today, which is that the executor cli commands will still show up in help if they are not importable, but you'll get either warnings or stacktraces when you actually try to use them (since at that point you'll hit the importing issues). We try to catch those traces and log in this DefaultHelpParser._check_value
method, but it only looks for the celery dependency (and there could be many other ways that the module fails to load).
I'm not yet personally decided on what the right approach is here. I'd appreciate input from other folks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to mitigate this, in a generalized way, would be to catch import errors for the executors as we loop through them looking for CLI commands, and then log to the user that they could not be imported, which may mean some cli commands are not available for those executors and the user should first fix the issues if they wish to use those commands.
@potiuk @pierrejeambrun @eladkal any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine. We already have precedent which is similar but even worse. When we have no CeleryExecutor configured, running "celery" command group yields this:
[jarek:~/code/airflow] [airflow-3.7] fix-infinite-nofiles-from-containerd+ 2s 130 ± airflow celery
Usage: airflow [-h] GROUP_OR_COMMAND ...
Positional Arguments:
GROUP_OR_COMMAND
Groups:
celery Celery components
config View configuration
.....
Optional Arguments:
-h, --help show this help message and exit
airflow command error: argument GROUP_OR_COMMAND: celery subcommand works only with CeleryExecutor, CeleryKubernetesExecutor and executors derived from them, your current executor: SequentialExecutor, subclassed from: BaseExecutor, see help above.
I found it annoying when testing some stuff but I never got to a good solution. And when I think about it - I am more annoyed by the fact that I see the command group but I cannot actually see what subcommands it has than anything else. And this is more of a maintainer problem than user problem.
And I even thik in this case your proposal combined with moving the celery executors to providers - we could get rid of the above warning (or rather move it to each of the subcommands) and that would solve the "annoyance" I had. This way you would only see celery command when you have "celery provider" installed (once we move the executor to celery) and not when you have it configured (and I'd love it). This is IMHO much more consistent approach
I think this is also a nice way to actually finally decouple executors - this is a nice thing that the built-in executor CLIs will add new CLI commands.
We still have the documenttion generated for all the built-in executors, so even if they are missing in the main CLI that's more than enough IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really appreciate the time to read the thread and send a thoughtful reply @potiuk 🙏
Unless anyone else disagrees I'll go ahead with this approach.
There will also need to be a followup PR which removes some of the error handling we attempt to do (basically the thing that produces that message you ran into). Core airflow should not be attempting to do this, I think we agree on that, but correct me if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Core airflow should not be attempting to do this, I think we agree on that, but correct me if not.
Yes. we agree.
c47b170
to
18937e4
Compare
18937e4
to
7c6ab0d
Compare
(sorry for looong delay). Yes. All that is cool - and I think we should simply split it into three PRs to do a final review :) |
Thanks for checking in on this one @potiuk. And that's a fair callout. I usually try to keep PRs small and tactical, but this one felt cohesive. BUT, I think I can at least separate the first comment from the second two and they would still make sense to merge that way. BTW, did you have a read through the comment thread above about the CLI Help output? I'm curious to hear your perspective on that. |
BTW. It's ok to merge it as a single squashed PR I think |
airflow/cli/cli_parser.py
Outdated
except Exception: | ||
executor_name = ExecutorLoader.get_default_executor_name() | ||
log.exception("Failed to load CLI commands from executor: %s", executor_name) | ||
log.error("Ensure all dependencies are met and try again") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to at least print the mesage of the exception that caused it. Otherwise the user will be completely unaware what might be the reason and when the user raises issue "my airflow does not work, what should I do?" - not even the user but also whoever looks at it in GitHub issues will have no clue either, because the actual exception will be swallowed and we will have no clue how to help the user.
Ideally we should also separately handle ImportError specifically, separately. Having an ImportError is a clear indication that the user has no right provider installed or no provider at alll. This will be probably 9X% of import errors here, so we can give the user a much more specific error:
- first of all print the exact exception message that might give the exact import error.
- We can add a generic advice: "If your executor is based on Celery - you need to install 3.3.0+ version of provider, if it is Kubernetes - yoy need to install 7.4.0+ of cncf.kubernetes. While we cannot be sure if this is the reason, this will be it in vast majority of cases and might significantly incrase the number of cases where the users will be able to install provider and solve the problem on their own rather than opening an issue to Airflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to at least print the mesage of the exception that caused it.
Yeah, that's fair, and I did try it with this configuration (as well as printing the full stack trace) but it really makes the output of the CLI commands ugly, and if the user doesn't even care about executor commands they're going to see the exception and a large warning every time which is a bad user experience as well.
But I can try add it back in again and see how it looks.
We can add a generic advice: "If your executor is based on Celery - you need to install 3.3.0+ version of provider, if it is Kubernetes - yoy need to install 7.4.0+ of cncf.kubernetes. While we cannot be sure if this is the reason, this will be it in vast majority of cases and might significantly incrase the number of cases where the users will be able to install provider and solve the problem on their own rather than opening an issue to Airflow.
It feels a bit wrong to me to add this much celery/kubernetes specific hardcoding into this generic executor code. I'll try update the general exception message I print here to mention the celery/kube versions, but I think we should remove those soon in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, actually, after reviewing this I'm realizing I actually already solved this by using log.exception
. That includes the exception message along with the log message. So I think we're all good on that front.
But I'll add a bit more info to the following message I display to have some info about celery/kubernetes
@potiuk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a screenshot of how it will look. Some things to note:
- I start by uninstalling a dependency of Celery
- Notice how if the executor is something like LocalExecutor, we get no warnings and everything works just fine 😃
- If we switch to CeleryExecutor, we get:
- A message saying we failed to load CLI commands from executor
- The traceback and Exception message
- And the helpful tip (now updated with more details from your suggestion @potiuk)
- The CLI command still works despite the issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. That loooks cool. Even if ti is a bit ugly, this is a very good message to act on by the user (and if the user will create an issue - which will inevitably happen - "my command failed, what do I do" - we can simply respond -" just follow what is written in the message" and close such issue. (And I am not complaining, this is a fact of life that no matter what you do, some users will not read those error messages - but at least they will cop&paste it so that you can read it and ask the user to read it)l
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also: note @o-nikolas how much the "specific" message is needed.
Put yourself in the shoes of a new user, who read somewhere that in order to use Celery executor, they need to set the env variable (but did not have celery provider installed and has no idea it needs to be added - because they have not read and digested the docs, they are just experimenting).
If you just print the exception and no extra message, the user would be at a total loss on what to do because the import error would not be telling anythong to the user who would not know that they have to installl celery provider. They would just see import error and no further explanation. With the extra message, we make it very clear what the problem very likely is and give the user clear instruction what to do to fix the problem.
Without it, we invite the uer to simply open an issue to Airflow and say "Airflow does not work - I have that exception, what do I do ?". With this extra message we - in vast majority of cases - prevent that question even being asked because we provide an answer to it before the user asks it.
tests/providers/celery/executors/test_celery_kubernetes_executor.py
Outdated
Show resolved
Hide resolved
Added my comments - looks good, but I think we should slightly improve diagnostics of problems and typing hints. |
BTW. I really like how it makes the core even more "lean" without provider's discovery. That's really good observation that we ONLY need chosen executor's command set and "other's" are not needed. BTW2. You will also have to handle conditional imports of the The easiest way to handle it @o-nikolas - is to try/except Import error the imports and check if airfow version is < 2.7.0 and raise |
😃
Okay, I'll try whip this up today. What message should we add in the |
Fantastic! |
What you proposed was good :) |
Small follow-up after apache#29055 - for better diagnostic of potential future problems, it would be good to re-raise the original import error, otherwise if the Import error results from some other issue than Airflow version, we will get quite a bit of head scratching trying to diagnose some of the resulting aftermath.
Small follow-up after #29055 - for better diagnostic of potential future problems, it would be good to re-raise the original import error, otherwise if the Import error results from some other issue than Airflow version, we will get quite a bit of head scratching trying to diagnose some of the resulting aftermath.
The apache#29055 moved relevant CLI command definition to executors but it automatically removed the command from generated documentation. This PR brings it back for both Celery and Cncf Kubernetes provider Closes: apache#27932
This PR addresses executor coupling 5a and 5b in Airflow CLI. See this issue (#27932) for more context.
Please read before reviewing:
Change summary:
Updates to the executor interface (i.e. add stub methods to the base executor and the two hybrid executors which don't inherit directly from base) to vend cli commands and make small changes in the cli_parser.py module to read in commands from this new method. Add tests for the interface.
Move executor related cli commands (for Celery and Kubernetes) to their respective executor modules utilizing the new interface from step 2. All original tests for these CLI commands still apply.
NOTE
After this PR, CLI commands will only be displayed for the currently configured executor. This solves some technical issues around parsing and import cycles, as well as finally puts to rest many strange behaviours people would have with using CLI commands for executors they do not have installed which often yielded broken behaviour due to missing dependencies.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.