Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: regression on query duration against system tables #88891

Closed
ZhouXing19 opened this issue Sep 28, 2022 · 4 comments · Fixed by #104547
Closed

*: regression on query duration against system tables #88891

ZhouXing19 opened this issue Sep 28, 2022 · 4 comments · Fixed by #104547
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@ZhouXing19
Copy link
Collaborator

ZhouXing19 commented Sep 28, 2022

Related: ticket(internal link) and #72407

To Reproduce

SELECT
	-- This field corresponds to the DBTableMetadata Haskell type
	jsonb_build_object(
		'oid', "table".oid::INT8,
		'columns', COALESCE(columns.info, '[]')
	)::JSONB AS info
FROM
	pg_catalog.pg_class AS "table"
	JOIN pg_catalog.pg_namespace AS schema ON schema.oid = "table".relnamespace
	-- description
	LEFT JOIN pg_catalog.pg_description AS description ON
			description.classoid = 'pg_catalog.pg_class'::REGCLASS
			AND description.objoid = "table".oid
			AND description.objsubid = 0
	-- columns
	LEFT JOIN LATERAL (
			SELECT
				jsonb_agg(
					jsonb_build_object(
						'description', pg_catalog.col_description("table".oid, "column".attnum)
					)
				)
					AS info
			FROM
				pg_catalog.pg_attribute AS "column"
				LEFT JOIN pg_catalog.pg_type AS type ON type.oid = "column".atttypid
				LEFT JOIN pg_catalog.pg_type AS base_type ON
						type.typtype = 'd' AND base_type.oid = type.typbasetype
			WHERE
				"column".attrelid = "table".oid
				-- columns where attnum <= 0 are special, system-defined columns
				AND "column".attnum > 0
				-- dropped columns still exist in the system catalog as "zombie"columns, so ignore those
				AND NOT "column".attisdropped
		)
			AS columns ON true
WHERE
	"table".relkind IN ('r')

This query finishes instantly on Postgres14 but takes a long time (~8mins) in single-node CRDB.

Also, I ran this query against on v22.2.0-beta-1, v22.1.7 and v21.1.21 single-node CRDB, and found a regression in the query duration.

  • v21.1.21: 874.061s with trace, 506.836s w/o trace
  • v22.1.7: 1893.165s with trace, 1159.774s w/o trace
  • v22.2.0-beta.1: 1995.032s with trace, 1220.553s w/o trace

stmt bundles are:

Expected behavior
Regression should not happen, and the query should not take that long.

Jira issue: CRDB-20025

@ZhouXing19 ZhouXing19 added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels Sep 28, 2022
@DrewKimball
Copy link
Collaborator

It looks like the 22.1 apply-join produces 2642 rows, while the 21.1 apply-join produces 1521. Apart from that, I think we may be performing more logging for apply-joins in 22.1 than 21.1. @yuzefovich any thoughts? The plans look about the same.

@yuzefovich
Copy link
Member

Hm, my guess is that with every release we just have more things in different pg internal tables (i.e. better PG compatibility), so just more rows satisfy the predicates, and thus the apply join has to run more times. Number of input rows to the apply join:

  • 21.1: 163
  • 22.1: 280
  • 22.2: 283.

It seems likely that (assuming there no correctness bugs) this increase in latency is expected because of this.

@ajwerner
Copy link
Contributor

ajwerner commented Oct 6, 2022

Coming back to this originally query, it seems that the apply join is doing something not quite right. I added some logging inside of col_description and it was getting called way way too many times. I added a constraint to the last join, replacing true with columns.oid = "table".oid and it seems to cut the runtime to like 2s from many minutes. This seems closer to correct, though the semantics of the query still seem a bit off.

SELECT jsonb_build_object(
        'oid',
        "table".oid::INT8,
        'columns',
        COALESCE(columns.info, '[]')
       )::JSONB AS info
  FROM pg_catalog.pg_class AS "table"
  JOIN pg_catalog.pg_namespace AS schema ON schema.oid = "table".relnamespace
       LEFT JOIN pg_catalog.pg_description AS description ON description.classoid
                                                                     = 'pg_catalog.pg_class'::REGCLASS
                                                                 AND description.objoid
                                                                     = "table".oid
                                                                 AND description.objsubid = 0
       LEFT JOIN LATERAL (
            SELECT "table".oid AS oid,
                   jsonb_agg(
                    jsonb_build_object(
                        'description',
                        pg_catalog.col_description("table".oid, "column".attnum)
                    )
                   ) AS info
              FROM pg_catalog.pg_attribute AS "column"
                   LEFT JOIN pg_catalog.pg_type AS type ON type.oid = "column".atttypid
                   LEFT JOIN pg_catalog.pg_type AS base_type ON type.typtype = 'd'
                                                                    AND base_type.oid
                                                                        = type.typbasetype
             WHERE "column".attrelid = "table".oid
               AND "column".attnum > 0
               AND NOT "column".attisdropped
        ) AS columns ON columns.oid = "table".oid
 WHERE "table".relkind IN ('r',);

@rafiss
Copy link
Collaborator

rafiss commented Jun 7, 2023

This is now taking ~2 seconds on an empty single node cluster. I will add a micro-benchmark for this, and close this.

@craig craig bot closed this as completed in cecaf53 Jun 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants