Glossary: Improve performance of parsing translations for adding tooltips #1395

Merged: ocean90 merged 4 commits into develop from debug-1383 on Apr 7, 2022

@akirk akirk commented Apr 6, 2022

Closes #1383.

This improves the implementation of #1359 for better performance. The key improvement is a shorter regex for splitting the singular into chunks of plain text and glossary items.

Before, for example for Galician, it created a 20k regex of the format:

(\bauthentication\b)|(\bauthentications\b)|(\bauthenticationes\b)|(\bauthenticationed\b)|(\bauthenticationing\b)|(\bminification\b)|(\bminifications\b)|(\bminificationes\b)|(\bminificationed\b)|(\bminificationing\b)|(\bnotification\b)|(\bnotifications\b)|...

The new code improves it to a 6k regex like this:

\b(authentication(?:s|es|ed|ing)?|authenticate(?:s|es|ed|ing)?|...

Before, the regex took about 0.3s per preg_split() call; the new one takes less than 0.1s.
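
To make the shorter format concrete, here is a minimal sketch of how such a regex could be assembled and used with preg_split(). It is not the code from this PR: the helper name `build_glossary_regex()`, the closing `\b`, the `i` flag, and the longest-first ordering are assumptions for illustration; only the suffix list and the overall shape come from the example above.

```php
<?php
// Hypothetical helper (not from the PR): one alternation of terms, each
// followed by a shared optional suffix group, wrapped in word boundaries.
function build_glossary_regex( array $terms, array $suffixes = array( 's', 'es', 'ed', 'ing' ) ) {
	// Assumption: try longer terms first so a shorter term cannot shadow a longer one.
	usort(
		$terms,
		function ( $a, $b ) {
			return strlen( $b ) - strlen( $a );
		}
	);

	$quoted_suffixes = array_map(
		function ( $suffix ) {
			return preg_quote( $suffix, '/' );
		},
		$suffixes
	);
	$suffix_group = '(?:' . implode( '|', $quoted_suffixes ) . ')?';

	$alternatives = array();
	foreach ( $terms as $term ) {
		$alternatives[] = preg_quote( $term, '/' ) . $suffix_group;
	}

	// A single capturing group lets preg_split() return the matched terms too.
	return '/\b(' . implode( '|', $alternatives ) . ')\b/i';
}

$regex  = build_glossary_regex( array( 'authentication', 'authenticate' ) );
$chunks = preg_split( $regex, 'Enable authentications for all users', -1, PREG_SPLIT_DELIM_CAPTURE );
// $chunks is array( 'Enable ', 'authentications', ' for all users' )
```

With PREG_SPLIT_DELIM_CAPTURE the result alternates between plain text and matched glossary words, which is what splitting the singular into chunks needs.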

I also refactored the regex generation to happen inside the function, where it is cached internally using the static keyword, together with the generation of the reverse lookup that identifies the assigned glossary term.
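
The caching idea can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: the function name `get_glossary_matcher()`, the `term => entry ID` input shape, and the lowercase lookup keys are all hypothetical.

```php
<?php
// Hypothetical function and data shape (not the actual GlotPress code).
function get_glossary_matcher( array $glossary_entries ) {
	// Static variables keep their values between calls, so the regex and the
	// reverse lookup are built only on the first call of a request.
	static $regex  = null;
	static $lookup = array();

	if ( null === $regex ) {
		$suffixes     = array( '', 's', 'es', 'ed', 'ing' );
		$alternatives = array();

		foreach ( $glossary_entries as $term => $entry_id ) {
			$alternatives[] = preg_quote( $term, '/' ) . '(?:s|es|ed|ing)?';

			// Reverse lookup: every form the regex can match points back to its entry.
			foreach ( $suffixes as $suffix ) {
				$lookup[ strtolower( $term . $suffix ) ][] = $entry_id;
			}
		}

		$regex = '/\b(' . implode( '|', $alternatives ) . ')\b/i';
	}

	return array( $regex, $lookup );
}

list( $regex, $lookup ) = get_glossary_matcher( array( 'comment' => 1, 'notification' => 2 ) );
preg_match( $regex, 'Two Comments pending', $m );
$entry_ids = $lookup[ strtolower( $m[1] ) ];
// $entry_ids is array( 1 )
```

One caveat of this simple form: the static cache ignores the arguments of later calls, so it only fits when the glossary stays the same for the whole request.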

@akirk akirk requested review from ocean90 and pedro-mendonca April 6, 2022 14:53
@akirk akirk force-pushed the debug-1383 branch 2 times, most recently from d2c2107 to 93eb98d Compare April 6, 2022 14:58

pedro-mendonca commented Apr 6, 2022

Here are my results for a full WP 5.9 dev project with 5148 strings and a glossary of 271 entries.

500 strings per page

I had a 28-second page load
This PR reduced it to 6 seconds

All the 5148 strings on a single page

I had a 337-second page load
This PR reduced it to 38 seconds

@ocean90 ocean90 added this to the 3.0 milestone Apr 6, 2022

ocean90 commented Apr 6, 2022

Thanks for working on this! Definitely performs better than what we currently have.

I only noticed one minor difference when I have an entry for "comments" and "comment":

[Screenshots: Before (2022-04-06, 20:31:35) and After (2022-04-06, 20:32:36)]

The list is now inverted: the most relevant entry is now the last item.
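
For illustration only, one way to keep the most relevant entry first would be to sort the candidate terms so that the one identical to the matched text leads. This is a sketch under assumptions: `sort_by_relevance()` is a hypothetical helper, and treating the exact match as "most relevant" is my reading of the screenshots, not something the PR defines.

```php
<?php
// Hypothetical helper: orders glossary terms for one matched word.
function sort_by_relevance( $matched_text, array $terms ) {
	$matched_text = strtolower( $matched_text );

	usort(
		$terms,
		function ( $a, $b ) use ( $matched_text ) {
			// Assumption: a term identical to the matched text is the most relevant.
			$a_rank = ( strtolower( $a ) === $matched_text ) ? 0 : 1;
			$b_rank = ( strtolower( $b ) === $matched_text ) ? 0 : 1;
			if ( $a_rank !== $b_rank ) {
				return $a_rank - $b_rank;
			}

			// Otherwise prefer the longer term.
			return strlen( $b ) - strlen( $a );
		}
	);

	return $terms;
}

$ordered = sort_by_relevance( 'comments', array( 'comment', 'comments' ) );
// $ordered is array( 'comments', 'comment' )
```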

Two review comments on gp-templates/helper-functions.php (outdated, resolved)
@ocean90 ocean90 merged commit 0093673 into develop Apr 7, 2022
@ocean90 ocean90 deleted the debug-1383 branch April 7, 2022 07:16
@pedro-mendonca pedro-mendonca removed their request for review April 7, 2022 07:53