Improve performance of default GP_Translation->for_translation() query #376
Comments
The main difference is LEFT JOIN vs INNER JOIN: https://github.com/GlotPress/GlotPress-WP/blob/1401774181b96d011b03329ec7e5c0e7f7edb1db/gp-includes/things/translation.php#L198.
I don't think we can use an inner join in those cases, as it will return different results if no translation row exists. For example, when you create a new translation set, no rows are added to the translations table, so the inner join would return 0 rows (there are no rows common to the originals and translations tables), whereas the left join would return all the rows from the originals table.
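The difference described above can be reproduced on a toy schema. The following is a minimal sketch using Python's built-in `sqlite3` module rather than MySQL; the table names `originals` and `translations` and the three sample rows are assumptions made up for illustration, not the real GlotPress schema.

```python
import sqlite3

# Hypothetical, simplified schema: a brand-new translation set has
# originals but no rows at all in the translations table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE originals (id INTEGER PRIMARY KEY, project_id INTEGER, status TEXT);
    CREATE TABLE translations (id INTEGER PRIMARY KEY, original_id INTEGER,
                               translation_set_id INTEGER, status TEXT);
    INSERT INTO originals (id, project_id, status) VALUES
        (1, 78, '+active'), (2, 78, '+active'), (3, 78, '+active');
""")

JOIN_SQL = """
    SELECT o.id, t.id
    FROM originals AS o
    {join} translations AS t
        ON o.id = t.original_id AND t.translation_set_id = 3898
    WHERE o.project_id = 78 AND o.status = '+active'
    ORDER BY o.id
"""

# LEFT JOIN keeps every original and fills the translation side with NULL.
left_rows = conn.execute(JOIN_SQL.format(join="LEFT JOIN")).fetchall()
# INNER JOIN keeps only originals that have a matching translation row.
inner_rows = conn.execute(JOIN_SQL.format(join="INNER JOIN")).fetchall()

print(left_rows)   # [(1, None), (2, None), (3, None)]
print(inner_rows)  # []
```

With an empty translations table, the left join still lists all three originals, while the inner join returns nothing, which is the behaviour change the comment warns about.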
I think a compound index on the translations table for |
Thanks for your comprehensive analysis, @akirk!
I took a look at when those were introduced:
Isn't that because all originals are mapped with a translation and then the rows without a translation (all fields are NULL) are selected?
Ah, you are right. My bad, I didn't consider that the row could come in as NULL from the join.
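To make the NULL point concrete, here is a small sketch of how an "untranslated" filter can rely on the left join: originals without a matching translation come back with NULL translation columns and can be selected with `IS NULL`. It uses Python's `sqlite3` module with a made-up two-table schema and sample data; the exact condition GlotPress uses is not shown in this thread, so `t.id IS NULL` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE originals (id INTEGER PRIMARY KEY, project_id INTEGER, status TEXT);
    CREATE TABLE translations (id INTEGER PRIMARY KEY, original_id INTEGER,
                               translation_set_id INTEGER, status TEXT);
    INSERT INTO originals (id, project_id, status) VALUES
        (1, 78, '+active'), (2, 78, '+active'), (3, 78, '+active');
    -- Only original 2 has a translation in set 3898.
    INSERT INTO translations (id, original_id, translation_set_id, status) VALUES
        (10, 2, 3898, 'current');
""")

UNTRANSLATED_SQL = """
    SELECT o.id
    FROM originals AS o
    {join} translations AS t
        ON o.id = t.original_id AND t.translation_set_id = 3898
    WHERE o.project_id = 78 AND o.status = '+active'
      AND t.id IS NULL  -- assumed filter: keep rows where the join found nothing
    ORDER BY o.id
"""

untranslated_left = [r[0] for r in conn.execute(UNTRANSLATED_SQL.format(join="LEFT JOIN"))]
untranslated_inner = [r[0] for r in conn.execute(UNTRANSLATED_SQL.format(join="INNER JOIN"))]

print(untranslated_left)   # [1, 3]
print(untranslated_inner)  # []
```

An inner join can never produce a row with a NULL `t.id`, so the same filter returns nothing there.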
Proposed indexes in SQL:
ALTER TABLE translate_translations ADD INDEX test1(translation_set_id, date_added);
ALTER TABLE translate_translations ADD INDEX test2(original_id, translation_set_id, status);
ALTER TABLE translate_originals ADD INDEX test3(project_id, status, priority, date_added);
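One way to check whether indexes like these get picked up is to ask the database for its query plan. The sketch below does that with Python's `sqlite3` module and `EXPLAIN QUERY PLAN`; SQLite's planner and index syntax differ from MySQL's, the table definitions are reduced to the columns named in the indexes (an assumption), and the tables are empty, so this only illustrates the workflow and says nothing about the timings on the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, reduced column lists; the real tables have more columns.
    CREATE TABLE translate_originals (
        id INTEGER PRIMARY KEY, project_id INTEGER, status TEXT,
        priority INTEGER, date_added TEXT);
    CREATE TABLE translate_translations (
        id INTEGER PRIMARY KEY, original_id INTEGER, translation_set_id INTEGER,
        status TEXT, date_added TEXT);
    -- SQLite spelling of the proposed ALTER TABLE ... ADD INDEX statements.
    CREATE INDEX test1 ON translate_translations (translation_set_id, date_added);
    CREATE INDEX test2 ON translate_translations (original_id, translation_set_id, status);
    CREATE INDEX test3 ON translate_originals (project_id, status, priority, date_added);
""")

QUERY = """
    SELECT t.*, o.*
    FROM translate_originals AS o
    LEFT JOIN translate_translations AS t
        ON o.id = t.original_id
       AND t.translation_set_id = 3898
       AND t.status IN ('current', 'waiting', 'fuzzy')
    WHERE o.project_id = 78 AND o.status = '+active'
    ORDER BY t.date_added DESC
    LIMIT 20 OFFSET 0
"""

# The human-readable plan step is the last column of each EXPLAIN QUERY PLAN row.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
for step in plan:
    print(step)

# Collect which of the proposed indexes the planner mentions.
used = {name for name in ("test1", "test2", "test3") if any(name in step for step in plan)}
print(sorted(used))
```

On MySQL the equivalent check is the `EXPLAIN SELECT ...` form used in the findings in this thread, where the `key` column shows the chosen index.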
Here are my findings, running the queries on a big test database on a quite fast machine (SSD, Core i7 3GHz, 16GB RAM):
Query 1
mysql> EXPLAIN SELECT SQL_CALC_FOUND_ROWS t.*, o.*, t.id as id, o.id as original_id, t.status as translation_status, o.status as original_status, t.date_added as translation_added, o.date_added as original_added
FROM translate_originals as o
LEFT JOIN translate_translations AS t ON o.id = t.original_id AND t.translation_set_id = 3898 AND t.status != "rejected" AND t.status != "old" AND (t.status = 'current' OR t.status = 'waiting' OR t.status = 'fuzzy')
WHERE o.project_id = 78 AND o.status = '+active'
ORDER BY t.date_added DESC LIMIT 20 OFFSET 0\G
Real query: 20 rows in set (0.05/0.05/0.05/0.06/0.04 sec)
Query 2
Real query: 20 rows in set (0.06/0.07/0.07/0.07/0.06 sec)
Query 3
Real query: 20 rows in set (0.04/0.04/0.04/0.03/0.04 sec)
There are only two statuses for originals: `+active` and `-original`. There is no need to perform a fuzzy search, which affects the performance of the `GP_Translation->for_translation()` query. See #376.
…r_translation()`. Props @akirk for the analysis, see #376 (comment). See #376.
The query for https://translate.wordpress.org/projects/wp/dev/admin/de/formal takes between 8 and 10 seconds:
The query for https://translate.wordpress.org/projects/wp/dev/admin/de/formal?filters%5Bstatus%5D=untranslated&sort%5Bby%5D=priority&sort%5Bhow%5D=desc takes between 8 and 15 seconds:
The query for https://translate.wordpress.org/projects/wp/dev/admin/de/formal?filters%5Btranslated%5D=yes&filters%5Bstatus%5D=current takes between 500ms and 2 seconds: