Optimize space supporter filter for v2 calls #3077
The main difference between your query and the original one is the `from_self: false` parameter to the `union` method invocations. That's why the original query is nested (e.g. `SELECT * FROM ( SELECT * FROM ...`) and yours is not.
I don't think that `.all` vs. `.any?` makes a huge difference here, as the list returned by the old query consists of 0 to 7 type strings. That's why I would be in favor of keeping the original single query instead of having (up to) two queries.
Of course, adding `from_self: false` to the original query would still make sense.
So I think the difference is that with But the main thing is that while space supporter roles can apparently be quite common (almost 10% of all roles on this foundation), users who only have a space supporter role seem to be very rare. In fact, this foundation has just one space supporter role (out of 60k) that belongs to a user who has no other role (other than organization user, since any user with a space role has that too). That means:
@philippthun Your feedback made me ponder this a bit more deeply, and I think I've come up with a much better solution in 2db8b34, which I hope will turn your frown upside down. Although it still retains the separate query for space supporters 😉
My latest commit produces a SQL query that looks like this:
SELECT 1 AS "one"
FROM "users"
WHERE ( ( ( EXISTS (SELECT 1
FROM "spaces_developers"
WHERE ( "user_id" = 2023961 )) ) IS TRUE )
OR ( ( EXISTS (SELECT 1
FROM "organizations_managers"
WHERE ( "user_id" = 2023961 )) ) IS TRUE )
OR ( ( EXISTS (SELECT 1
FROM "spaces_managers"
WHERE ( "user_id" = 2023961 )) ) IS TRUE )
OR ( ( EXISTS (SELECT 1
FROM "spaces_auditors"
WHERE ( "user_id" = 2023961 )) ) IS TRUE )
OR ( ( EXISTS (SELECT 1
FROM "organizations_auditors"
WHERE ( "user_id" = 2023961 )) ) IS TRUE )
OR ( ( EXISTS (SELECT 1
FROM "organizations_billing_managers"
WHERE ( "user_id" = 2023961 )) ) IS TRUE ) )
LIMIT 1;
The cool thing about it is that, unlike with a
For each of the tables below, I retrieved a sample `user_id` for a user who had that role only (ignoring OrganizationUser) and then used
Here's for the old query:
And here's for the new one that I committed in 2db8b34:
I doubt I'm going to have time to do it, but it occurs to me that the same approach could also make quite a big difference in the
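To make the shape of the query quoted in this comment easier to see, here is a minimal plain-Ruby sketch that assembles the same kind of statement: one `EXISTS` subquery per role table, OR-ed together, with a `LIMIT 1`. It is not the Sequel method chain from the commit. The constant `ROLE_TABLES` and the helper `membership_check_sql` are hypothetical names introduced only for illustration, and the table names are copied from the SQL above.

```ruby
# Hypothetical illustration only: this builds a SQL string by hand and is
# NOT the Sequel code from the PR. Table names are taken from the query above.
ROLE_TABLES = %w[
  spaces_developers organizations_managers spaces_managers
  spaces_auditors organizations_auditors organizations_billing_managers
].freeze

# Returns a query that yields one row if the user appears in any role table.
def membership_check_sql(user_id, tables: ROLE_TABLES, from: 'users')
  id = Integer(user_id) # raises ArgumentError for non-numeric input
  exists = tables.map do |table|
    %{(EXISTS (SELECT 1 FROM "#{table}" WHERE ("user_id" = #{id})))}
  end
  %{SELECT 1 AS "one" FROM "#{from}" WHERE (#{exists.join(' OR ')}) LIMIT 1}
end

puts membership_check_sql(2023961)
```

Because each branch is an `EXISTS`, the database is free to stop at the first matching row in any one table instead of collecting every role the user holds.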
c9dfe5c to 44029b8
It looks like there is something that old MySQL servers don't like:
I hope you have the correct manual at hand ;-)
Whaaaat
Alright, this seems to be a bug in Sequel, which generates a query that is invalid for mysql 5.7. Though the reference manuals for mysql 5.7 and 8.0 don't hint at any such change in behaviour, the former will apparently explode if you use a
mysql> SELECT
-> 1
-> WHERE
-> (
-> EXISTS (
-> SELECT
-> 1
-> FROM
-> spaces_developers
-> WHERE
-> 'user_id' = 1
-> LIMIT
-> 1
-> )
-> );
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE
But with a useless
mysql> SELECT
-> 1 FROM users
-> WHERE
-> (
-> EXISTS (
-> SELECT
-> 1
-> FROM
-> spaces_developers
-> WHERE
-> 'user_id' = 1
-> LIMIT
-> 1
-> )
-> );
Empty set, 1 warning (0.00 sec)
@philippthun I've just opened jeremyevans/sequel#1977 to report this apparent bug. Perhaps we could revert to my original
Yes, let's do this.
The alternative Sequel method chain that this commit reverts produces a possibly-more-efficient `SELECT 1 WHERE` query, but this is invalid syntax for mysql 5.7. See cloudfoundry#1977 for further discussion.
UPDATE: I've completely changed this optimization since first opening the PR so this description is out of date (I'm leaving it as a record of how this has progressed) - see comments below for latest details
A short explanation of the proposed change:
Optimizes the checks that prevent users with only a space supporter role from accessing the v2 API, cutting around 7% off the overall query time when there are large tables.
This PR uses Sequel's `any?` method instead of the existing check, which retrieves `.all` of a given user's roles and then, in the CC's Ruby code, counts them to see whether the user has exactly 1 space supporter role and zero of all other role types.
The method will also now return false right away, without querying the space supporters table, if the user has any non-supporter role.
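The short-circuit behaviour described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: an in-memory hash stands in for the database tables, the method name `only_space_supporter?` is hypothetical, and `spaces_supporters` is an assumed table name. The real check issues SQL through Sequel.

```ruby
# Hypothetical stand-in: `roles` maps a table name to the user ids found in it.
# The real implementation queries the database via Sequel instead.
NON_SUPPORTER_TABLES = %i[
  spaces_developers spaces_managers spaces_auditors
  organizations_managers organizations_auditors organizations_billing_managers
].freeze

# True only when the user has a space supporter role and no other role.
def only_space_supporter?(roles, user_id)
  # Short-circuit: Enumerable#any? stops at the first table containing the user.
  has_other_role = NON_SUPPORTER_TABLES.any? do |table|
    roles.fetch(table, []).include?(user_id)
  end
  return false if has_other_role

  # Only reached when no other role was found (assumed table name).
  roles.fetch(:spaces_supporters, []).include?(user_id)
end

roles = { spaces_developers: [1], spaces_supporters: [1, 2] }
puts only_space_supporter?(roles, 1) # has a developer role too
puts only_space_supporter?(roles, 2) # supporter role only
```

The second lookup runs only for users with no other role, so in this sketch most calls finish after the first check.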
An explanation of the use cases your change solves
The existing check currently adds a pretty significant delay to every v2 call on large foundations, and this change slightly mitigates that.
I've run the old and the new queries in postgres with `EXPLAIN ANALYZE` 100 times each on a foundation with about 75k users and 646k roles. Averages from the results are presented in the table below (times in ms):
If the user lacks any non-supporter role, we then have to separately check the supporters table. That little query has a planning time of 0.186ms, and execution 1.366ms. So we're still a smidgen faster even for the minority of requests that also need to execute that.
So it's a pretty modest improvement, but given that this is middleware that gets executed for all v2 calls, I think cumulatively it should be a useful saving.
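For anyone who wants to repeat this kind of measurement, here is a rough Ruby sketch of how averages could be derived from repeated `EXPLAIN ANALYZE` runs. It is not the script used for the numbers above. It assumes each run's output is available as a string containing the `Planning Time` and `Execution Time` lines that postgres prints, and the helper names `parse_timings` and `average_timings` are hypothetical. The sample strings are made-up inputs, not measurements.

```ruby
# Hypothetical helpers; not the tooling used for the figures in this PR.

# Pulls planning and execution time (in ms) out of one EXPLAIN ANALYZE output.
def parse_timings(explain_output)
  planning = explain_output[/Planning time:\s*([\d.]+)\s*ms/i, 1]
  execution = explain_output[/Execution time:\s*([\d.]+)\s*ms/i, 1]
  raise ArgumentError, 'timing lines not found' unless planning && execution

  { planning: planning.to_f, execution: execution.to_f }
end

# Averages the timings across several runs of the same query.
def average_timings(outputs)
  raise ArgumentError, 'no outputs given' if outputs.empty?

  timings = outputs.map { |output| parse_timings(output) }
  count = timings.length.to_f
  {
    planning: timings.sum { |t| t[:planning] } / count,
    execution: timings.sum { |t| t[:execution] } / count
  }
end

# Made-up sample outputs, for illustration only.
runs = [
  "Planning Time: 0.200 ms\nExecution Time: 1.000 ms",
  "Planning Time: 0.400 ms\nExecution Time: 3.000 ms"
]
p average_timings(runs)
```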
Links to any other associated PRs
n/a
I have reviewed the contributing guide
I have viewed, signed, and submitted the Contributor License Agreement
I have made this pull request to the `main` branch
I have run all the unit tests using `bundle exec rake`
I have run CF Acceptance Tests