Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pg_catalog.col_description is very slow #87947

Closed
Tracked by #8788
danieljharvey opened this issue Sep 14, 2022 · 8 comments · Fixed by #95090
Closed
Tracked by #8788

pg_catalog.col_description is very slow #87947

danieljharvey opened this issue Sep 14, 2022 · 8 comments · Fixed by #95090
Assignees
Labels
A-tools-hasura C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-blathers-triaged blathers was able to find an owner

Comments

@danieljharvey
Copy link

danieljharvey commented Sep 14, 2022

Describe the problem

Hi! 👋

The pg_catalog.col_description function runs very slowly, making any queries that use it take ~20 seconds per invocation.

Reproduction

The following SQL query is used to fetch column descriptions for all tables:

WITH
  "tabletable" as ( SELECT "table".oid,
           "table".relkind,
           "table".relname AS "table_name",
           "schema".nspname AS "table_schema"
      FROM pg_catalog.pg_class "table"
      JOIN pg_catalog.pg_namespace "schema"
          ON schema.oid = "table".relnamespace
      WHERE "table".relkind IN ('r', 't', 'v', 'm', 'f', 'p')
        AND "schema".nspname NOT LIKE 'pg_%'
        AND "schema".nspname NOT IN ('information_schema', 'hdb_catalog', 'hdb_lib', '_timescaledb_internal', 'crdb_internal')
  )
SELECT
  "table".table_schema,
  "table".table_name,
  coalesce(columns.description, '[]') as columns
FROM "tabletable" "table"

LEFT JOIN LATERAL
  ( SELECT
      pg_catalog.col_description("table".oid, "column".attnum) as description
    FROM pg_catalog.pg_attribute "column"
    WHERE "column".attrelid = "table".oid
  ) columns ON true;

Run on a fresh start of the DB with no tables, it takes 25ms to run.

If I add a single table...

CREATE TABLE test (id INT);

then running the above query takes 21.732s.

If I comment out the pg_catalog.col_description line, so the end looks like this:

LEFT JOIN LATERAL
  ( SELECT
      '' as description
      -- pg_catalog.col_description("table".oid, "column".attnum) as description
    FROM pg_catalog.pg_attribute "column"
    WHERE "column".attrelid = "table".oid
  ) columns ON true;

Performance is fine again, and this takes 95ms to run.

Expected behavior

Would expect this function to perform much more efficiently.

Additional context

This function is needed for Hasura GraphQL Engine. We are currently working around this by not fetching column descriptions.

Epic CRDB-23454
Jira issue: CRDB-19609

@danieljharvey danieljharvey added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 14, 2022
@blathers-crl
Copy link

blathers-crl bot commented Sep 14, 2022

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

  • @cockroachdb/sql-experience (found keywords: pg_,Hasura)

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Sep 14, 2022
@rafiss
Copy link
Collaborator

rafiss commented Sep 16, 2022

Thanks for the report! The performance of this should improve in v22.2 - which version were you testing with when you saw this?

@danieljharvey
Copy link
Author

Hey, great news! My tests were with 22.1.

@rafiss rafiss added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Oct 3, 2022
@DrewKimball
Copy link
Collaborator

In the meantime on 22.1, if a rewrite is ok, you could try replacing "table".oid in pg_catalog.col_description with "column".attrelid. It's valid in this case because of the WHERE "column".attrelid = "table".oid filter

@rafiss
Copy link
Collaborator

rafiss commented Oct 6, 2022

Thanks @DrewKimball ! I tested with that suggestion and my test case (just running a single node locally, with one table in the cluster) dropped from 5.9 seconds to 31ms.

@rafiss
Copy link
Collaborator

rafiss commented Oct 27, 2022

@DrewKimball should we rename this issue to describe the problem with the join/filter? Or is there already another issue for that?

@DrewKimball
Copy link
Collaborator

I've got a separate issue for decorrelating cases like this one: #88885. I'm not sure if pg_catalog.col_description being slow is still a concern though.

@rafiss
Copy link
Collaborator

rafiss commented Oct 27, 2022

Ok great - we can keep this one about col_description the same then. It is still kind of slow, but it has been improved from the previous implementation and probably is "good enough" for what we're able to do right now. The higher priority would be #88885

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tools-hasura C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants