
Executor Import/Load time optimization #30361

Merged

Conversation

@o-nikolas (Contributor) commented Mar 29, 2023

Overview

This PR aims to improve the time it takes to load/import the various executor modules we have in Airflow.

Motivation

The executors are imported in more places now that various compatibility checks are in core Airflow code (re: AIP-51). Also, decreasing import times is very important for the work of executors vending CLI commands (see #29055), since the CLI code in Airflow is particularly sensitive to slow imports (because all code is loaded fresh each time you run an individual Airflow CLI command).

The changes

This PR mostly moves expensive imports that are used only for type checking under the TYPE_CHECKING flag, so they are not executed at runtime. It also moves a select few other expensive imports closer to the code that uses them.
The most important changes are in the BaseExecutor module: all other executors load it, so improvements made there propagate outward.
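The TYPE_CHECKING pattern looks roughly like this (a minimal sketch; the module and function names below are illustrative stand-ins, not the exact imports moved in this PR):

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers (mypy, pyright); this branch
    # never runs at runtime, so the cost of the import is skipped entirely.
    from argparse import Namespace  # stand-in for an expensive module

def describe(args: Namespace) -> str:
    # With `from __future__ import annotations`, the annotation above is a
    # plain string at runtime, so Namespace need not be importable here.
    return ", ".join(sorted(vars(args)))
```

Static type checkers still see the real import and check the annotation, while a fresh interpreter importing this module pays nothing for it.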

Testing

I benchmarked these changes with a script that imports each executor module in a fresh Python runtime and times how long that takes (you can try this yourself quickly from a bash shell with something like time python -c 'from airflow.executors.local_executor import LocalExecutor'). The script repeats this in a loop for several samples (randomizing the order for fairness), both on main and on my development branch.
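The timing loop described above can be sketched as follows (a hypothetical reconstruction, not the actual benchmark script from this PR; stdlib modules stand in for the Airflow executor modules, and the sample count is illustrative):

```python
import random
import statistics
import subprocess
import sys
import time

# Modules to benchmark; each import runs in a brand-new interpreter so
# nothing is cached between samples. Stdlib stand-ins are used here in
# place of airflow.executors.* modules.
MODULES = ["json", "decimal", "email"]
SAMPLES = 5

def time_fresh_import(module: str) -> float:
    """Import `module` in a fresh Python process; return elapsed seconds."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start

results: dict[str, list[float]] = {m: [] for m in MODULES}
for _ in range(SAMPLES):
    order = MODULES[:]
    random.shuffle(order)  # randomize order across samples for fairness
    for module in order:
        results[module].append(time_fresh_import(module))

for module, times in results.items():
    print(f"{module}: mean {statistics.mean(times) * 1000:.1f} ms")
```

Note that each measurement includes interpreter startup overhead, which is the same scenario a CLI invocation pays, so it matches the motivation above.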

Results

[Screenshot, 2023-03-29: benchmark results comparing per-executor import times on main vs this branch]

Most executors saw roughly a 50% improvement in import time. Kubernetes, and to a lesser extent Celery, are still quite slow and will need further changes targeted specifically at those modules (in a future PR).
The combined executors (e.g. LocalKubernetesExecutor) saw smaller gains (~25%): each imports two executors, so each pays the cost twice and gets roughly half the benefit.



@boring-cyborg boring-cyborg bot added provider:cncf-kubernetes Kubernetes provider related issues area:Scheduler including HA (high availability) scheduler labels Mar 29, 2023
@o-nikolas o-nikolas requested review from potiuk and uranusjr March 29, 2023 19:43
@o-nikolas o-nikolas changed the title Type-related import optimization for Executors Executor Import/Load time optimization Mar 29, 2023
@o-nikolas (Contributor, Author) commented:

Fresh rebase from main and this is still green. Anyone have time to review/approve? (CC @uranusjr @potiuk @eladkal)

@uranusjr (Member) left a comment:


I still suspect some of the local imports are not necessary, but those are simple enough as mentioned and therefore introduce no downsides.
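The "local imports" under discussion are function-level imports that defer the cost of a dependency until the function is first called, rather than paying it when the executor module is imported. A minimal sketch of the pattern (the function name and the use of json as the "expensive" dependency are illustrative):

```python
def render_config(config: dict) -> str:
    # Deferred import: the (hypothetically expensive) module is imported
    # only when this function actually runs, not when the containing
    # module is imported. Repeated calls hit sys.modules, so the cost
    # is paid at most once per process.
    import json  # stand-in for an expensive dependency

    return json.dumps(config, indent=2)
```

The trade-off the review alludes to: a local import adds a small per-call lookup and some indirection, but when the dependency is expensive and rarely needed, it keeps cold-start import time down at essentially no cost.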

Move some expensive typing related imports to be under TYPE_CHECKING
@o-nikolas o-nikolas force-pushed the onikolas/executors_type_import_optimization branch from aeeae43 to 1eac90f Compare April 6, 2023 21:40
Labels
area:Scheduler including HA (high availability) scheduler provider:cncf-kubernetes Kubernetes provider related issues type:improvement Changelog: Improvements
4 participants