Improve subresources DB query performance by removing a subquery #3396
That's good, I've covered the edge cases of this in the Behat test suite. The queries you show give the same results, so no BC break will result from this change. You can fix the broken PHPUnit tests by updating the relevant queries.
Thanks for doing this! 🎉
Force-pushed from 1e56eaf to d6e3f49.
…HERE <identifier = ..." to improve performance
Force-pushed from d6e3f49 to df4ad61.
Fixed the PHPUnit tests, and rebased against latest. If the comment I added after @dkarlovi's feedback is enough, I'd say this is ready to be merged. Note: Behat tests are still failing, but because of either code coverage generation failing (?) or call to
Force-pushed from 5503a74 to fa7b217.
Yeah, those are unrelated. Please ignore those failures. 😄
Should we add some tests for this? (How?)
Or are the existing tests sufficient?
@teohhanhui Behat tests already cover the correctness of the query results (thanks to @alanpoulain). Is there a point in checking the generated DQL itself? I doubt it. This is a perf improvement at the query level.
Co-Authored-By: Alan Poulain <[email protected]>
Co-Authored-By: Teoh Han Hui <[email protected]>
Thanks @clemherreman! 🎉
Please have a look at #1542, as it breaks custom identifiers. Feel free to review.
Abstract
This PR tries to change the DQL queries generated to fetch subresources, in order to improve performance on large tables.
Before
The generated SQL to query a subresource uses a `WHERE ... IN ( <subquery> )`, in order to be recursive when subresources also have subresources (introduced by #1608).
The subquery, on large tables (4M+ rows at work), takes a significant amount of time (2-5s), even with the proper SQL indexes (at least on MySQL).
After
This PR tries to optimize for the common case, having only one level of subresources, by replacing the subquery with a direct `WHERE <identifier> = <identifier-alias>` only on the last recursion level. This means it shouldn't break subresources that have subresources, while still improving the query for the nominal cases.
This significantly reduces the SQL query time, going from 2.45s to 15ms on my e-commerce app.
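To make the two query shapes concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the SQL that Doctrine actually generates: the `dummy` and `related_dummy` tables, their columns, and the `fetch` helper are hypothetical names chosen for illustration. It only shows that, for a single level of subresources, filtering through a `WHERE ... IN ( <subquery> )` and filtering directly on the parent identifier return the same rows.

```python
import sqlite3

# Hypothetical schema: one parent table and one subresource table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dummy (id INTEGER PRIMARY KEY);
CREATE TABLE related_dummy (
    id INTEGER PRIMARY KEY,
    dummy_id INTEGER REFERENCES dummy(id),
    name TEXT
);
INSERT INTO dummy (id) VALUES (1), (2);
INSERT INTO related_dummy (id, dummy_id, name)
VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c');
""")

# "Before" shape: the subresource rows are selected through an IN subquery.
BEFORE = """
SELECT r.id, r.name
FROM related_dummy r
WHERE r.id IN (
    SELECT r2.id
    FROM dummy d
    JOIN related_dummy r2 ON r2.dummy_id = d.id
    WHERE d.id = :id
)
ORDER BY r.id
"""

# "After" shape: the parent identifier is compared directly, no subquery.
AFTER = """
SELECT r.id, r.name
FROM related_dummy r
JOIN dummy d ON r.dummy_id = d.id
WHERE d.id = :id
ORDER BY r.id
"""

def fetch(sql, parent_id):
    """Run one of the two queries for the given parent identifier."""
    return conn.execute(sql, {"id": parent_id}).fetchall()

# Both shapes should agree for every parent, including one with no rows.
for parent_id in (1, 2, 3):
    assert fetch(BEFORE, parent_id) == fetch(AFTER, parent_id)
```

The toy dataset says nothing about performance. The timings quoted above come from a MySQL table with several million rows, where the planner's handling of the subquery is what matters.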
BC & missing things.
I broke the unit tests, mainly the mocking, but some tests also fail because they compare the previously generated query with the new one (which makes sense). All the Behat tests are green locally though.
I feel out of my depth here, and if someone could help me write those, ensuring no BC break is introduced, it would be great.
On my side, I ran our rather large Behat test suite against my fork, and no tests complained about wrong results, so I am fairly confident I haven't introduced a BC break.
TODO