Speeding up regex match time for custom warnings #40513

Merged 2 commits on Jul 1, 2024

amoghrajesh
Contributor

Using the re module instead of re2 is much slower for regex matching on custom warning messages. I ran the tests on sample inputs of 10000, 20000, 40000, 80000, and 160000 characters. The time taken for each input size is listed below:

re
10000 - 0.771
20000 - 2.832
40000 - 11.319
80000 - 45.899
160000 - 181.397

re2
10000 - 0.374
20000 - 1.232
40000 - 4.882
80000 - 19.732
160000 - 79.046
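To make the comparison easier to read, the numbers above can be turned into ratios. The sketch below only does arithmetic on the reported timings (the unit is not stated in the measurements, so none is assumed). It suggests that both engines slow down by roughly 4x each time the input doubles, and that re2 is a little over 2x faster at every size.

```python
# Timings copied from the measurements above; the unit is not stated there.
sizes = [10000, 20000, 40000, 80000, 160000]
re_times = [0.771, 2.832, 11.319, 45.899, 181.397]
re2_times = [0.374, 1.232, 4.882, 19.732, 79.046]


def growth_factors(times):
    """Ratio of each timing to the previous one (input size doubles each step)."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


# How many times faster re2 is than re at each input size.
speedups = [slow / fast for slow, fast in zip(re_times, re2_times)]

for size, speedup in zip(sizes, speedups):
    print(f"{size:>7} chars: re2 is {speedup:.2f}x faster than re")

print("re  growth per doubling:", [round(g, 2) for g in growth_factors(re_times)])
print("re2 growth per doubling: ", [round(g, 2) for g in growth_factors(re2_times)])
```

A growth factor near 4 per doubling is what quadratic scaling looks like, so on this sample data re2 lowers the constant factor rather than changing the growth rate.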

[Plot: re vs re2, input size vs time taken]

This PR overrides the _escape parameter of rich.markup's escape function with an re2 equivalent.
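The override described above might look roughly like the following. This is a minimal sketch, not the exact code from the PR: the pattern is assumed to mirror the default one inside rich.markup.escape, the helper name escape_warning is hypothetical, and the import fallbacks exist only so the sketch runs where google-re2 or rich is not installed.

```python
import re

try:
    import re2  # provided by the google-re2 package
except ImportError:
    re2 = re  # fallback so the sketch still runs; no speedup in this case

try:
    from rich.markup import escape
except ImportError:
    escape = None  # fallback: rich is not installed

# Assumption: this mirrors the default pattern used by rich.markup.escape.
MARKUP_TAG_PATTERN = r"(\\*)(\[[a-z#/@][^[]*?])"

# Compile once and pass the bound .sub method, which is the kind of
# callable the _escape parameter takes.
re2_escape_sub = re2.compile(MARKUP_TAG_PATTERN).sub


def escape_warning(message: str) -> str:
    """Escape rich markup in a warning message (hypothetical helper)."""
    if escape is None:
        return message  # nothing to escape with when rich is unavailable
    return escape(message, _escape=re2_escape_sub)


print(escape_warning("DeprecationWarning: use [new_option] instead"))
```

Because only the substitution callable is swapped, the escaping logic inside rich stays untouched and the warning handler just calls escape with one extra keyword argument.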



@amoghrajesh
Contributor Author

Thanks for the reviews. Merging it.

@amoghrajesh amoghrajesh merged commit a37109c into apache:main Jul 1, 2024
52 checks passed
@utkarsharma2 utkarsharma2 added the type:improvement Changelog: Improvements label Jul 12, 2024
@utkarsharma2 utkarsharma2 added this to the Airflow 2.10.0 milestone Jul 12, 2024
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024