Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/opt: incorrect query results with virtual table lookup semi join #91012

Closed
rafiss opened this issue Oct 31, 2022 · 4 comments · Fixed by #92713
Closed

sql/opt: incorrect query results with virtual table lookup semi join #91012

rafiss opened this issue Oct 31, 2022 · 4 comments · Fixed by #92713
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-vtables Virtual tables - pg_catalog, information_schema etc branch-master Failures and bugs on the master branch. branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-sql-queries SQL Queries Team

Comments

@rafiss
Copy link
Collaborator

rafiss commented Oct 31, 2022

Describe the problem

There's a problem that seems to happen when a virtual table lookup semi join is planned. See below.

                              info
-----------------------------------------------------------------
  distribution: local
  vectorized: true

  • group (scalar)
  │
  └── • virtual table lookup join
      │ table: pg_class@pg_class_oid_idx
      │ equality: (attrelid) = (oid)
      │ pred: relname = 'b'
      │
      └── • virtual table lookup join (semi)
          │ table: pg_type@pg_type_oid_idx
          │ equality: (atttypid) = (oid)
          │ pred: typname = 'int8'
          │
          └── • filter
              │ filter: (NOT attnotnull) AND (attname = 'a_id')
              │
              └── • virtual table
                    table: pg_attribute@primary

To Reproduce

Start a fresh cluster, then:

-- setup
CREATE TABLE b (id INT, a_id INT);

root@localhost:26257/defaultdb> 
SELECT
  count(*)
FROM
  pg_class AS t INNER JOIN pg_attribute AS a ON t.oid = a.attrelid
WHERE
  a.attnotnull = 'f'
  AND a.attname = 'a_id'
  AND t.relname = 'b'
  AND a.atttypid
    IN (
        SELECT
          oid
        FROM
          pg_type
        WHERE
          typname = ANY (ARRAY['int8'])
      );
  count
---------
      0
(1 row)

If you force a hash join, then a count of 1 is returned.

root@localhost:26257/defaultdb> 
SELECT
  count(*)
FROM
  pg_class AS t
  INNER HASH JOIN pg_attribute AS a ON t.oid = a.attrelid
WHERE
  a.attnotnull = 'f'
  AND a.attname = 'a_id'
  AND t.relname = 'b'
  AND a.atttypid
    IN (
        SELECT
          oid
        FROM
          pg_type
        WHERE
          typname = ANY (ARRAY['int8'])
      );
  count
---------
      1
(1 row)

Expected behavior
The query should return a count of 1.

Environment:
Reproduced on v22.2.0-beta.2 and v22.1.8.

Jira issue: CRDB-21073
Epic: CRDB-23454

@rafiss rafiss added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Oct 31, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Oct 31, 2022
@rafiss rafiss added the A-sql-vtables Virtual tables - pg_catalog, information_schema etc label Nov 1, 2022
@rafiss
Copy link
Collaborator Author

rafiss commented Nov 1, 2022

I tried to step through a bit with the debugger. I found that the inner join on 'b' is probably not getting planned correctly. In (*joinPredicate).eval we can see the following onCond.

image

But when you step in and look at how the condition is evaluated, it appears that the left-hand side is wrong. It's trying to compare 'b' to 'int8'.

image

@rafiss rafiss added branch-master Failures and bugs on the master branch. branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 A-sql-pgcompat Semantic compatibility with PostgreSQL labels Nov 1, 2022
@rafiss rafiss added the S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. label Nov 6, 2022
@knz
Copy link
Contributor

knz commented Nov 28, 2022

Is this related to #88096?

@mgartner
Copy link
Collaborator

Let's aim to fix this by ~Feb 2023 to unblock some CLI improvements for 23.1.

@rafiss
Copy link
Collaborator Author

rafiss commented Nov 29, 2022

This is blocking this PR: #90649. The PR requires an index on pg_namespace(oid), but adding that index triggers this bug and causes this test to fail:

# ActiveRecord 4.2.x query for checking if an index exists
# Relies on OID IN tuple support
query I
SELECT count(*)
FROM pg_class t
INNER JOIN pg_index d ON t.oid = d.indrelid
INNER JOIN pg_class i ON d.indexrelid = i.oid
WHERE i.relkind = 'i'
AND i.relname = 'b_idx'
AND t.relname = 'b'
AND i.relnamespace IN (SELECT oid FROM pg_namespace WHERE nspname = ANY (current_schemas(false)))
----
1

@yuzefovich yuzefovich self-assigned this Nov 30, 2022
@craig craig bot closed this as completed in e7b15eb Dec 2, 2022
craig bot pushed a commit that referenced this issue Feb 24, 2023
88061: clisqlshell: new infrastructure for describe commands r=rafiss,ZhouXing19 a=knz

Fixes #95320.
Epic: CRDB-23454

The SQL shell (`cockroach sql`, `demo`) now
supports the client-side commands `\l`, `\dn`, `\d`, `\di`, `\dm`,
`\ds`, `\dt`, `\dv`, `\dC`, `\dT`, `\dd`, `\dg`, `\du`, `\df` and `\dd` in a
way similar to `psql`, including the modifier flags `S` and `+`, for
convenience for users migrating from PostgreSQL.

A notable difference is that when a pattern argument is specified, it
should use the SQL "LIKE" syntax (with `%` representing the wildcard
character) instead of PostgreSQL's glob-like syntax (with `*`
representing wildcards).

Issues discovered:

- [x] join bug:  #88096
- [x] semi-join exec error #91012
- [x] `pg_table_is_visible` should return true when given a valid index OID and the index is valid.  #88097
- [x] missing pkey column in pg_index:  #88106
- [x] missing stored columns in pg_index: #88107 
- [x] pg_statistic_ext has problems #88108
- [x] missing view def on materialized views  #88109
- [x] missing schema comments: #88098
- [x] missing pronamespace for functions #94952
- [x] broken pg_function_is_visible for UDFs #94953
- [x] generated columns #92545
- [x] indnullsnotdistinct #92583
- [x] missing prokind #95288
- [x] missing function comments in obj_description #95292
- [x] planning regression #95633

96397: builtins: mark some pg_.* builtins as strict r=DrewKimball a=mgartner

Builtins defined using the UDF `Body` field will be wrapped in a `CASE`
expression if they are strict, i.e., `CalledOnNullInput=false`. When the
builtin is inlined, the `CASE` expression prevents decorrelation,
leaving a slow apply-join in the query plan. This caused a significant
regression of some ORM introspection queries.

Some of these builtins have filters that cause the SQL body to return no rows
if any of the arguments is NULL. In this case, the builtin will have the same
behavior whether or not it is defined as being strict. We can safely optimize
these builtins by setting `CalledOnNullInput=true`.

The following conditions are sufficient to prove that `CalledOnNullInput` can
be set for a builtin function with a SQL body:

  1. The WHERE clause of the SQL query *null-rejects* every argument of the
     builtin. Operators like `=` and `<` *null-reject* their operands because
     they filter rows for which an operand is NULL.

  2. The arguments are not used elsewhere in the query. This is not strictly
     necessary, but simplifies the proof because it ensures NULL arguments will
     not cause the builtin to error.

Examples of SQL statements that would allow `CalledOnNullInput` to be set:
```
SELECT * FROM tab WHERE $1=1 AND $2='two';

SELECT * FROM tab WHERE $1 > 0;
```

Fixes #96218
Fixes #95569

Epic: None

Release note: None


97585: cli: don't scope TLS client certs to a specific tenant by default r=stevendanna a=knz

Epic: CRDB-23559
Fixes: #97584

This commit changes the default for `--tenant-scope` from "only the system tenant" to "cert valid for all tenants".

Note that the scoping is generally useful for security, and it is used in CockroachCloud. However, CockroachCloud does not use our CLI code to generate certs and sets its cert tenant scopes on its own.

Given that our CLI code is provided for convenience and developer productivity, and we don't expect certs generated here to be used in multi-tenant deployments where tenants are adversarial to each other, defaulting to certs that are valid on every tenant is a good choice.

Release note: None


Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-vtables Virtual tables - pg_catalog, information_schema etc branch-master Failures and bugs on the master branch. branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants