Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PostgreSQL consider rewrite subquery to nested query with order by - 1000x+ performance improve #77

Closed
wenerme opened this issue Feb 19, 2021 · 7 comments

Comments

@wenerme
Copy link
Contributor

wenerme commented Feb 19, 2021

original

SELECT (
           SELECT MAX(rkv.id) AS id
           FROM kine AS rkv),
       (
           SELECT MAX(crkv.prev_revision) AS prev_revision
           FROM kine AS crkv
           WHERE crkv.name = 'compact_rev_key'),
       kv.id AS theid,
       kv.name,
       kv.created,
       kv.deleted,
       kv.create_revision,
       kv.prev_revision,
       kv.lease,
       kv.value,
       kv.old_value
FROM kine AS kv
         JOIN (
    SELECT MAX(mkv.id) AS id
    FROM kine AS mkv
    WHERE mkv.name LIKE '/registry/events/%'
    GROUP BY mkv.name) maxkv
              ON maxkv.id = kv.id
WHERE (kv.deleted = 0 OR 'f')
ORDER BY kv.id ASC
LIMIT 2;

execute in 53s

by using nested - same result

select *
from (SELECT (
                 SELECT MAX(rkv.id) AS id
                 FROM kine AS rkv),
             (
                 SELECT MAX(crkv.prev_revision) AS prev_revision
                 FROM kine AS crkv
                 WHERE crkv.name = 'compact_rev_key'),
             kv.id AS theid,
             kv.name,
             kv.created,
             kv.deleted,
             kv.create_revision,
             kv.prev_revision,
             kv.lease,
             kv.value,
             kv.old_value
      FROM kine AS kv
               JOIN (
          SELECT MAX(mkv.id) AS id
          FROM kine AS mkv
          WHERE mkv.name LIKE '/registry/events/%'
          GROUP BY mkv.name) maxkv
                    ON maxkv.id = kv.id
      WHERE (kv.deleted = 0 OR 'f')) t
ORDER BY id ASC
LIMIT 2;

execute in 4ms

full analyse https://explain.depesz.com/s/HUEf

@wenerme
Copy link
Contributor Author

wenerme commented Feb 19, 2021

seems this query cause the problem

listSQL = fmt.Sprintf(`
SELECT (%s), (%s), %s
FROM kine AS kv
JOIN (
SELECT MAX(mkv.id) AS id
FROM kine AS mkv
WHERE
mkv.name LIKE ?
%%s
GROUP BY mkv.name) maxkv
ON maxkv.id = kv.id
WHERE
(kv.deleted = 0 OR ?)
ORDER BY kv.id ASC
`, revSQL, compactRevSQL, columns)

subquery first, then agg

wenerme added a commit to wenerme/kine that referenced this issue Feb 19, 2021
wenerme added a commit to wenerme/kine that referenced this issue Feb 19, 2021
@brandond
Copy link
Member

brandond commented Feb 19, 2021

What version of postgres did you test this on? I've noticed that newer versions perform much better; it's possible that the query optimizer in these versions handles the current query better.

Also, I see that your PR changed the generic query - have you tested it on sqlite and MySQL as well?

@wenerme
Copy link
Contributor Author

wenerme commented Feb 19, 2021

PostgreSQL 13.1 on x86_64-pc-linux-musl, compiled by gcc (Alpine 9.3.0) 9.3.0, 64-bit

I use aliyun rds pg before, found the performance problem, then migrate to self host pg 13.

@wenerme
Copy link
Contributor Author

wenerme commented Feb 19, 2021

@brandond not tested on sqlite and MySQL, but I expected to see same performance if the engine already handle the optimization on this situation. If the subquery not materialized, then the performance should better than before.I think this syntax is generic enough, should not cause any problem.

@wenerme
Copy link
Contributor Author

wenerme commented Feb 19, 2021

This is another cluster using aliyun rds pg, a lot slow query

image

@brandond
Copy link
Member

I like the PR but I'm going to wait on adding some automated tests against a couple different DB engines and versions before I merge it.

wenerme added a commit to wenerme/kine that referenced this issue Mar 30, 2021
wenerme added a commit to wenerme/kine that referenced this issue Mar 30, 2021
wenerme added a commit to wenerme/kine that referenced this issue Mar 30, 2021
wenerme added a commit to wenerme/kine that referenced this issue Mar 31, 2021
@jgreat
Copy link

jgreat commented Feb 2, 2022

I'm seeing k3s server throwing a lot of "TRACE" messages for long transaction commit times when using Postgres as a backend. Could this issue be related?

Any update on PR #78 to fix it?

brandond pushed a commit to wenerme/kine that referenced this issue Feb 3, 2022
wenerme added a commit to wenerme/kine that referenced this issue Feb 3, 2022
Address minor style nits

Signed-off-by: Brad Davidson <[email protected]>
Signed-off-by: 陈杨文 <[email protected]>
waynelwz pushed a commit to waynelwz/kine that referenced this issue Mar 8, 2022
Address minor style nits

Signed-off-by: Brad Davidson <[email protected]>
Signed-off-by: 陈杨文 <[email protected]>
Signed-off-by: waynelwz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants