Optimise filtering ID columns #10860
Comments
I'm all for speeding up queries, but I wouldn't throw a hard exception if any of the IDs weren't numeric, just fallback to the old system.
Yeah that's a fair point, some website may have a higher up mandate using Have updated description
This is about
I don't know if that wider scope should be done or not - but if it is, then instead of relying on the name of the field we could (and probably should) check if the field type is
Merged, ACs met.
There's still an open PR attached to this thing.
Oops. Merged it
When peer reviewing #10855 looking at query speed it was discovered that DataList::byIDs($ids), which simply uses $this->filter('ID', $ids), generates this style of MySQL query with placeholders:
SELECT DISTINCT "Player"."ClassName", "Player"."LastEdited", "Player"."Created", "Player"."Name", "Player"."ID", CASE WHEN "Player"."ClassName" IS NOT NULL THEN "Player"."ClassName" ELSE 'Player' END AS "RecordClassName" FROM "Player" WHERE ("Player"."ID" IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ... )
Which for 10k records took [0.1772s, 0.1981s, 0.1684s] = avg 0.1812s
However simply updating the query to this
->where(sprintf('"Player"."ID" in (%s)', implode(', ', $ids)))
Which uses this style of query without placeholders
SELECT DISTINCT "Player"."ClassName", "Player"."LastEdited", "Player"."Created", "Player"."Name", "Player"."ID", CASE WHEN "Player"."ClassName" IS NOT NULL THEN "Player"."ClassName" ELSE 'Player' END AS "RecordClassName" FROM "Player" WHERE ("Player"."ID" in (1, 2, 3, 4, 5, 6, 7, 8, ... )
Resulted in far better performance - [0.0615s, 0.0602s, 0.0606s] = avg 0.0608s
So we should look to use this far more optimal non-placeholder version when only filtering by IDs due to the 3x performance gain that will impact many areas of the CMS.
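As a rough illustration of the two query styles compared above, here is a minimal sketch in plain PHP. The helper names idFilterWithPlaceholders() and idFilterInline() are hypothetical and are not part of Silverstripe; they only build the WHERE fragment and do not run anything against a database. The inline helper assumes the IDs are already known to be integers, with the int cast only as a safety net.

```php
<?php
// Hypothetical helpers for illustration only; not Silverstripe API.

/**
 * Placeholder style: one "?" per ID, with the values passed separately.
 * Returns [SQL fragment, parameters].
 */
function idFilterWithPlaceholders(string $table, array $ids): array
{
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $sql = sprintf('"%s"."ID" IN (%s)', $table, $placeholders);
    return [$sql, array_values($ids)];
}

/**
 * Inline style: IDs are written directly into the SQL.
 * Assumes integer IDs; the cast guards against anything else reaching the query.
 */
function idFilterInline(string $table, array $ids): string
{
    $ints = array_map('intval', $ids);
    return sprintf('"%s"."ID" IN (%s)', $table, implode(', ', $ints));
}

[$sql, $params] = idFilterWithPlaceholders('Player', [1, 2, 3]);
echo $sql, PHP_EOL;                                 // "Player"."ID" IN (?, ?, ?)
echo idFilterInline('Player', [1, 2, 3]), PHP_EOL;  // "Player"."ID" IN (1, 2, 3)
```

Neither helper handles an empty ID list, which would produce an invalid IN () clause; a real change would need to deal with that case separately.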
A few things to be careful of (some of these will turn into ACs)
Fall back to the old method, e.g. -- while we COULD do this for non-int IDs, I think it makes sense not to, because if you've chosen to use non-int IDs then you'll have a very strong security focus, and having queries with a bunch of ? in them is going to be more secure than queries that include the ID as plain text.
Note: prepared statements should be faster if you have them in an optimised setup and the same query is being run many times. However this won't always be the case, and it wasn't the case on my local setup. For Silverstripe, which is designed to be an easy to set up and accessible system, it seems we should probably assume "not an optimised setup". And even in optimised setups they can still be slower, e.g. here's an example of someone who knows what they're doing intentionally turning them off - https://orangematter.solarwinds.com/2014/11/19/analyzing-prepared-statement-performance/
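The "validate the IDs are integers, otherwise fall back to placeholders" behaviour discussed above could be sketched roughly as follows. This is an assumption about how such a check might look, not the merged implementation: isIntegerId() and buildIdFilter() are hypothetical names, and the function returns a [SQL fragment, parameters] pair rather than touching any Silverstripe query classes.

```php
<?php
// Hypothetical sketch; not the code that was merged for this issue.

/** True for ints and for digit-only strings such as "42". */
function isIntegerId($value): bool
{
    return is_int($value) || (is_string($value) && ctype_digit($value));
}

/**
 * Returns [SQL fragment, parameters].
 * All-integer IDs are inlined; anything else falls back to placeholders.
 */
function buildIdFilter(string $table, array $ids): array
{
    $column = sprintf('"%s"."ID"', $table);

    foreach ($ids as $id) {
        if (!isIntegerId($id)) {
            // Fallback: the old placeholder-based query, values kept as parameters.
            $placeholders = implode(', ', array_fill(0, count($ids), '?'));
            return [sprintf('%s IN (%s)', $column, $placeholders), array_values($ids)];
        }
    }

    // Every value is an int or a digit-only string, so it is safe to inline as text.
    $inlined = implode(', ', array_map('strval', $ids));
    return [sprintf('%s IN (%s)', $column, $inlined), []];
}

[$sql, $params] = buildIdFilter('Player', [1, '2', 3]);
echo $sql, PHP_EOL; // "Player"."ID" IN (1, 2, 3)

[$sql, $params] = buildIdFilter('Player', [1, 'abc']);
echo $sql, PHP_EOL; // "Player"."ID" IN (?, ?)
```

Falling back for the whole list as soon as one value fails the check keeps the non-integer case on exactly the same code path as before, so sites using non-int IDs see no change in behaviour.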
Acceptance
DBPrimaryKey or DBForeignKey. But we validate that the value is an integer.
Notes
PRs