Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: add rule to replace outer cols with equivalent non-outer cols #88885

Closed
DrewKimball opened this issue Sep 28, 2022 · 0 comments · Fixed by #92378
Closed

opt: add rule to replace outer cols with equivalent non-outer cols #88885

DrewKimball opened this issue Sep 28, 2022 · 0 comments · Fixed by #92378
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team

Comments

@DrewKimball
Copy link
Collaborator

DrewKimball commented Sep 28, 2022

There have been a few customer issues with long-running correlated queries that could have been decorrelated with the following decorrelation rule: for correlated Select and Join operators that have a filter that sets an outer column equal to a non-outer column, replace all references to the outer column with the equivalent non-outer column. This rule should be valid even for left joins, since any right row for which the replacement does not hold will not become part of the output.

Jira issue: CRDB-20024

@DrewKimball DrewKimball added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Sep 28, 2022
@DrewKimball DrewKimball self-assigned this Sep 28, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Sep 28, 2022
craig bot pushed a commit that referenced this issue Oct 27, 2022
90780: rttanalysis: add benchmark for hasura's introspection query r=rafiss a=rafiss

informs #88885

This query exposes an issue with how we construct filters for certain types of joins. It can be used to verify future enhancements.

Release note: None

90788: kvserver: improve help text for `admission.io.overload` r=irfansharif a=irfansharif

Since this is text that's available in any running binary (`SHOW ALL CLUSTER SETTINGS`), we shouldn't expect readers to have the source code checked out, which makes our referencing of specific file names less than useful. See #87656 (review).

Release note: None
Release justification: No behavioural change, only changing user-visible help text.

90792: ui: address an old TODO r=yuzefovich a=yuzefovich

This commit addresses an old TODO to simplify the check for when we warn about full scans.

Epic: None

Release note: None

90802: logictest: check for nil SystemConfig before purging zone config cache r=yuzefovich a=msirek

Fixes #88398

This fixes a panic which can occur duing cluster startup in a logic test or during retry of a test case in a logic test when an attempt is made to purge the zone config cache and no SystemConfig is available.

Release note: None

90804: release: S3 redirects should use abs path r=celiala a=rail

Previously, S3 "latest" redirects used relative path, while the AWS API expects the redirects to start with "/", "http://", or "https://".

This patch adds a slash to the key name for all redirect calls. The GCS implementation already strips the leading slash and should handle it properly.

Release note: None
Epic: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Mark Sirek <[email protected]>
Co-authored-by: Rail Aliiev <[email protected]>
craig bot pushed a commit that referenced this issue Nov 4, 2022
89584: cloud: add read and write metrics r=adityamaru a=adityamaru

This change adds cloud.read_bytes and cloud.write_bytes metrics.
These are updated on all read/write operations to external storage endpoints.

These metrics while shared by all external storage interactions
will provide an immediate signal into whether we're seeing an
abnormal rate of reads and writes errors during support investigations.

Fixes: #89242

Release note: None

91205: descidgen: use system.descriptor_id_seq for all tenants r=postamar a=postamar

Previously, the system tenant used a legacy non-SQL key in the meta range to generate descriptor IDs, while non-system tenants used the system.descriptor_id_seq sequence.

This commit introduces a cluster upgrade step which creates this sequence in the the system tenant's system database and migrates the legacy counter value to it.

While this migration is underway, the descriptor ID generator is not available. This upgrade consists of a relatively brief transaction and descriptor ID generation is a relatively infrequent event, as a result the disruption is deemed tolerable.

Fixes #80294.

Release note (sql change): upgrading a cluster to a major release containing this patch will endow the system tenant's system database with a descriptor_id_seq sequence which will henceforth be used to generate new descriptor IDs, as is already the case for non-system tenants.

91225: rttanalysis: skip slow test case r=ajwerner a=rafiss

see also #88885

Release note: None

91231: storage: remove legacy SST iterator r=nicktrav a=jbowens

The legacy sstable iterator is no longer in use. Removing the legacy sstable iterator now also avoids updating it to accommodate the new Pebble internal iterator interface with lazy value fetching.

Epic: None
Release note: None

91245: sql: include all schemas into schema.sql in the bundle r=yuzefovich a=yuzefovich

This commit makes it so that all schemas are added into the `schema.sql` file of the statement bundle using `SHOW CREATE ALL SCHEMAS;` statement (the only exception is the public schema that always exists in CRDB clusters). This will make it more likely that `debug statement-bundle recreate` succeeds on the bundles.

Fixes: #91099.

Epic: None

Release note: None

91255: ui: fixes on sql activity filter r=maryliag a=maryliag

This commit fixes the label of Transaction
filter and add the proper color for the filter
styles (previously was being loaded as a different color on the Sessions page because it was being inherit from another component)

Fixes #91206

Transactions Before
<img width="457" alt="Screen Shot 2022-11-03 at 7 07 13 PM" src="https://user-images.githubusercontent.com/1017486/199851829-b747102a-ac70-4720-97b5-7911b742345c.png">


Transactions After
<img width="477" alt="Screen Shot 2022-11-03 at 6 55 14 PM" src="https://user-images.githubusercontent.com/1017486/199851704-71641bb6-8247-4d92-8f2d-3efd27846594.png">


Sessions Before
<img width="317" alt="Screen Shot 2022-11-03 at 6 36 51 PM" src="https://user-images.githubusercontent.com/1017486/199851596-313646b9-0188-48aa-9392-7dd7f3eded99.png">


Sessions After
<img width="309" alt="Screen Shot 2022-11-03 at 6 55 06 PM" src="https://user-images.githubusercontent.com/1017486/199851671-02d4db08-4f2b-4ac3-95ff-e1d98e2d6ede.png">


Before/After of Statement page continues the same
<img width="604" alt="Screen Shot 2022-11-03 at 6 55 28 PM" src="https://user-images.githubusercontent.com/1017486/199851734-06a8a131-3528-4d17-a856-9d39d67e7017.png">


Release note (ui change): Fix Transaction filter label on SQL Activity page.

91260: sql: add crdb_internal.descriptor_with_post_deserialization_changes r=postamar a=postamar

This builtin applies post-deserialization changes to descriptors. This is useful during descriptor surgery as well as, potentially, during cluster upgrade steps.

Fixes #90269.

Release note: None

91289: tracing: use timeutil for Now() and Since() r=yuzefovich a=yuzefovich

Previously, for performance reasons (as well as a few others) we used
`time.Now()` and `time.Since` in the tracing library. But the work in
f782e45 alleviated those concerns, so
we can now switch to using `timeutil` library in two places.
```
Tracer_StartSpanCtx/opts=none-24                                    508ns ± 0%     512ns ± 0%  +0.80%  (p=0.000 n=10+10)
Tracer_StartSpanCtx/opts=real-24                                    510ns ± 1%     515ns ± 1%  +0.96%  (p=0.000 n=10+10)
Tracer_StartSpanCtx/opts=real,logtag-24                             539ns ± 1%     544ns ± 1%  +0.90%  (p=0.000 n=10+10)
Tracer_StartSpanCtx/opts=real,autoparent-24                         480ns ± 0%     486ns ± 1%  +1.26%  (p=0.000 n=10+10)
Tracer_StartSpanCtx/opts=real,manualparent-24                       523ns ± 1%     527ns ± 1%  +0.91%  (p=0.004 n=10+10)
Tracer_StartSpanCtx/opts=real,autoparent,withEventListener-24       755ns ± 0%     765ns ± 0%  +1.40%  (p=0.000 n=10+10)
Tracer_StartSpanCtx/opts=real,manualparent,withEventListener-24     547ns ± 1%     552ns ± 1%  +0.89%  (p=0.000 n=10+10)
Span_GetRecording/root-only-24                                     7.36ns ± 0%    6.75ns ± 0%  -8.26%  (p=0.000 n=10+10)
Span_GetRecording/child-only-24                                    7.36ns ± 0%    6.75ns ± 0%  -8.30%  (p=0.000 n=10+10)
Span_GetRecording/root-child-24                                    7.36ns ± 0%    6.74ns ± 0%  -8.32%  (p=0.000 n=10+10)
RecordingWithStructuredEvent/with-event-listener-24                3.47µs ± 1%    3.47µs ± 1%    ~     (p=0.541 n=10+10)
RecordingWithStructuredEvent/without-event-listener-24             3.31µs ± 0%    3.32µs ± 1%    ~     (p=0.271 n=10+10)
SpanCreation/detached-child=false-24                               1.07µs ± 1%    1.07µs ± 1%    ~     (p=0.955 n=10+10)
SpanCreation/detached-child=true-24                                1.84µs ± 1%    1.82µs ± 1%  -1.30%  (p=0.000 n=10+10)
```
And no changes in allocations.

Epic: None

Release note: None

91292: sql,delegate: cleanup from PR to improve performance of vtables with many databases r=ajwerner a=ajwerner

This is fallout from #91054 (review). See individual commits.

Epic: None

Release note: None

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Marius Posta <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 23, 2022
This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`,
`TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules
match joins and selects that have a correlated input and an equality
filter between an outer and a non-outer column from the input. Upon
matching, the rules traverse the input as far as it would be valid to
push the equality filter through normal push-down rules, and replace
any encountered references to the equivalent outer column. This approach
avoids interactions with rules like `TryDecorrelateSelect`, which attempt
to pull filters *up* the operator tree when correlation is present.

Fixes cockroachdb#88885

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Nov 30, 2022
This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`,
`TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules
match joins and selects that have a correlated input and an equality
filter between an outer and a non-outer column from the input. Upon
matching, the rules traverse the input as far as it would be valid to
push the equality filter through normal push-down rules, and replace
any encountered references to the equivalent outer column. This approach
avoids interactions with rules like `TryDecorrelateSelect`, which attempt
to pull filters *up* the operator tree when correlation is present.

Fixes cockroachdb#88885

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Dec 1, 2022
This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`,
`TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules
match joins and selects that have a correlated input and an equality
filter between an outer and a non-outer column from the input. Upon
matching, the rules traverse the input as far as it would be valid to
push the equality filter through normal push-down rules, and replace
any encountered references to the equivalent outer column. This approach
avoids interactions with rules like `TryDecorrelateSelect`, which attempt
to pull filters *up* the operator tree when correlation is present.

Fixes cockroachdb#88885

Release note: None
ajwerner added a commit to ajwerner/cockroach that referenced this issue Dec 2, 2022
This test ends up not seeing all the round trips because the trace is way too
large. Given that, we get flakes sometimes. See the referenced issue (cockroachdb#88885).

Fixes cockroachdb#92770

Release note: None
craig bot pushed a commit that referenced this issue Dec 2, 2022
88219: {bazci,dev}: send build events to beaver hub r=rickystewart a=healthy-pod

This code change lets `bazci` and `dev` send build events to beaver hub so
we can start collecting data about builds in CI and locally.

Release note: None
Epic: [CRDB-8350](https://cockroachlabs.atlassian.net/browse/CRDB-8350)

92263: sql,cli: add redacted sql stmts to debug zip r=xinhaoz a=xinhaoz

## Commit 1

This commit adds the builtin, `crdb_internal.anonymize_sql_constants`
which takes in a sql string and returns it with  constants redacted.
This will be used to redact columns that are sql stmts in the
redacted debug zip.

Release note: None

## Commit 2
Closes #88823

This commit adds the following fields to the redacted
debug zip:

crdb_internal.create_statements:
- create_statement
- - create_nofks
- alter_statements (each elem is redacted)

crdb_internal.create_function_statements:
- create_statement

crdb_internal.{node,cluster}_distsql_flows:
- stmt

crdb_internal.{cluster,node}_sessions:
- last_active
- active_queries

crdb_internal.{cluster,node}_queries:
- query

Release note (cli change):
The following fields have been redacted and added to
the redacted debug zip:
crdb_internal.create_statements:
- create_statement
- create_nofks
- alter_statements (each elem is redacted)

crdb_internal.create_function_statements:
- create_statement
- 
crdb_internal.{node,cluster}_distsql_flows:
- stmt

crdb_internal.{cluster,node}_sessions:
- last_active
- active_queries

crdb_internal.{cluster,node}_queries:
- query



-----------
Running ycsb, tpcc and movr default workloads for 15 minutes and requesting the debug zip on a fresh node in master vs with new changes:
master
<img width="927" alt="Pasted Graphic" src="https://user-images.githubusercontent.com/20136951/203359099-6a035d1c-00fe-4bbb-92c2-d1c2b2a3b706.png">

branch
<img width="978" alt="Pasted Graphic 2" src="https://user-images.githubusercontent.com/20136951/203359158-a8b02ccb-2d82-40b4-a33d-9faf6ddaab70.png">


92627: leaktest, stop: don't recover from panics r=andreimatei a=andreimatei

See individual commits.

Epic: None

92793: roachtest: update activerecord blocklist and ignore list r=ZhouXing19,rafiss a=andyyang890

This patch updates the blocklist and ignore list for the
activerecord roachtest. This patch also eliminates the
per-version lists since each release branch uses their
own list now.

Fixes #84955

Release note: None

92856: log: fix flakiness in TestStatusLogRedaction r=dhartunian a=abarganier

TestStatusLogRedaction was continuously failing when run with `--race`.

The test searches for a specific log line in the results from two separate RPCs - `LogFile()` and `Log()`. The test consistently failed when checking the results from the `Log()` RPC because the endpoint has a default max number of log entries set to 1000. Frequently, the log entry that we're searching for has a line index in the range of [1500, 2000], so the test was failing to find the log entry since it was being filtered out due to the default max number of entries.

This patch simply sets the max to a high enough value where this should no longer happen. With multiple runs I never saw the total # of log lines exceed 2,500, so we set the max to 5,000 to be safe.

Release note: None

Fixes: #92789

92929: rttanalysis: skip test which doesn't work due to tracing limitations r=ajwerner a=ajwerner

This test ends up not seeing all the round trips because the trace is way too large. Given that, we get flakes sometimes. See the referenced issue (#88885).

Fixes #92770

Release note: None

92934: clisqlshell: fix a panic on tab key r=rail a=knz

Fixes  #92935.

This patch ensures that there's no visible panic if the user presses "tab" and there's no completion available.

NB: no unit tests here - the test coverage will be delivered by #87606.

92942: kvserver: remove the assertion on not exceeding MaxSpanRequestKeys r=yuzefovich a=yuzefovich

This commit fixes the bug in a recently merged commit where we allowed to exceed `MaxSpanRequestKeys` in some cases - we forgot to update the remaining limit to be "exhausted". Furthermore, this commit removes the assertion altogether.

Epic: None

Release note: None

92953: ui: fix link for index from insights r=maryliag a=maryliag

Previously, the link to index details on the
drop index insights was using the URL format used by DB Console only. This commit updates to use the correct format when loading from CC Console.

Fix #92944

Release note (bug fix): Fix link to index details on drop index insights on CC Console.

Co-authored-by: healthy-pod <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Andy Yang <[email protected]>
Co-authored-by: Alex Barganier <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: maryliag <[email protected]>
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Dec 2, 2022
This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`,
`TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules
match joins and selects that have a correlated input and an equality
filter between an outer and a non-outer column from the input. Upon
matching, the rules traverse the input as far as it would be valid to
push the equality filter through normal push-down rules, and replace
any encountered references to the equivalent outer column. This approach
avoids interactions with rules like `TryDecorrelateSelect`, which attempt
to pull filters *up* the operator tree when correlation is present.

Fixes cockroachdb#88885

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Jan 25, 2023
This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`,
`TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules
match joins and selects that have a correlated input and an equality
filter between an outer and a non-outer column from the input. Upon
matching, the rules traverse the input as far as it would be valid to
push the equality filter through normal push-down rules, and replace
any encountered references to the equivalent outer column. This approach
avoids interactions with rules like `TryDecorrelateSelect`, which attempt
to pull filters *up* the operator tree when correlation is present.

Fixes cockroachdb#88885

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Jan 26, 2023
This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`,
`TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules
match joins and selects that have a correlated input and an equality
filter between an outer and a non-outer column from the input. Upon
matching, the rules traverse the input as far as it would be valid to
push the equality filter through normal push-down rules, and replace
any encountered references to the equivalent outer column. This approach
avoids interactions with rules like `TryDecorrelateSelect`, which attempt
to pull filters *up* the operator tree when correlation is present.

Fixes cockroachdb#88885

Release note: None
craig bot pushed a commit that referenced this issue Jan 27, 2023
92378: opt: add rule to replace outer cols with equivalent non-outer cols r=DrewKimball a=DrewKimball

This commit adds three decorrelation rules, `TryRemapJoinOuterColsRight`, `TryRemapJoinOuterColsLeft`, and `TryRemapSelectOuterCols`. These rules match joins and selects that have a correlated input and an equality filter between an outer and a non-outer column from the input. Upon matching, the rules traverse the input as far as it would be valid to push the equality filter through normal push-down rules, and replace any encountered references to the equivalent outer column. This approach avoids interactions with rules like `TryDecorrelateSelect`, which attempt to pull filters *up* the operator tree when correlation is present.

Fixes #88885

Release note: None

95620: sql: wait for one version lease if no job created for a descriptor r=chengxiong-ruan a=chengxiong-ruan

Previously we only wait for one version in jobs for descriptors performed with schema changes. The problem is that we don't always create jobs for all descriptors modified. We need to wait for one version for these descriptors too. This pr adds logic to wait if there is no jobs created for a descriptor has new version.

Fixes: #90600

Release note: None

95800: instancestorage: read instances in a transaction r=ajwerner a=healthy-pod

This code change adds `(instancestorage.Reader).GetAllInstancesUsingTxn` to read all instances using the given `*kv.Txn`.

Release note: None
Epic: CRDB-18735

96004: sql/schemachanger: limit which op/dep rule helpers are shared r=fqazi a=fqazi

Previously, the rules helpers were generally shared between releases inside the declarative schema changer. This could be problematic, since some helpers may get modified as new elements get added between releases. To address this, this patch moves most helpers back into individual rules, leaving a small subset as shared.

Additionally, this patch version gates TypeFilter, so that only elements for a given release are visible. When comparing 22.2 versus the compatibility release the majority of differences are now linked with sorting changes and some constraint related adjustments (element renames and minor rules adjustment for ops).

Epic: none
Release note: None

96021: storage: Make logging event listener async for DiskSlow r=sumeerbhola a=itsbilal

The pebble logger could block if we're experiencing a slow / stalling disk. If the call to the pebble logger is synchronous from the EventListener passed into Pebble, it could end up slowing down Pebble's internal disk health checks as those rely on EventListener methods being quick to run.

This change updates the logging event listener to asynchronously call the logger on a DiskSlow event.

Related to #94373.

Epic: none

Release note: None.

Co-authored-by: Drew Kimball <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: healthy-pod <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
@craig craig bot closed this as completed in eb935d9 Jan 27, 2023
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant