-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt/props: use EquivGroups in FuncDepSet for tracking equivalences #137571
Conversation
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Continuation of #117272. First commit is #137558. Here's an updated slow-queries benchmark against master:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r2, 3 of 3 files at r3, 4 of 4 files at r4, 60 of 60 files at r5, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/opt/props/equiv_set.go
line 37 at r4 (raw file):
if buildutil.CrdbTestBuild { defer eq.verify() }
nit: Methods that don't mutate the EquivGroup probably don't need to verify the set, do they? Or is this to catch cases where the user incorrectly mutates a ColSet returned from e.g. Group
without copying it? In that case, it might be worth a comment mentioning that—Maybe a single comment on verify
instead of comments throughout? Up to you.
pkg/sql/opt/props/equiv_set.go
line 62 at r4 (raw file):
// GroupForCol returns the group of columns equivalent to the given column. It // returns the empty set if no such group exists. The returned should not be
nit: "The returned should.." => "The returned ColSet should..."
pkg/sql/opt/props/equiv_set.go
line 103 at r4 (raw file):
} if eq.groups[i].Contains(right) { return eq.groups[i].Contains(left)
nit: This can be return false
since we already checked if the group contains left
.
pkg/sql/opt/props/equiv_set.go
line 251 at r4 (raw file):
// * In the example, the (1-3) group is split into (1) and (2,3). Since the (1) // group only has a single column, it is discarded as a trivial equivalence. // * The (4-8) group is split into (4,8) and (5,6). Since both subsets have at
nit: (4,8)
=> (4,7,8)
pkg/sql/opt/props/equiv_set.go
line 286 at r4 (raw file):
// AppendFromDisjoint unions the equiv groups from the given EquivGroups with // this one, assuming the groups are disjoint. func (eq *EquivGroups) AppendFromDisjoint(other *EquivGroups) {
nit: make the comment louder about the fact that the groups must be disjoint.
pkg/sql/opt/props/func_dep.go
line 648 at r5 (raw file):
for i, ok := cols.Next(0); ok; i, ok = cols.Next(i + 1) { // First check if the column is present in any "to" set of a dependency. // If not, then it is not redundant and must remain in the set. This is
nit: Update this comment to mention the equiv groups
pkg/sql/opt/props/func_dep.go
line 1690 at r5 (raw file):
needComma = true from := opt.MakeColSet(col) fmt.Fprintf(b, "%s==%s", from, group.Difference(from))
This verbose formatting of equiv sets is now vestigial. I think it's good to not create extra test output churn in this commit, but maybe we should consider a new format for equivalencies in the near future, like:
fd: (1)-->(2,3), ==(4,5), ==(9-11)
pkg/sql/opt/props/func_dep.go
line 1821 at r5 (raw file):
// Non-constant FDs are weaker than equivalence constraints. if f.equiv.AreAllColsEquiv(to.Union(from)) {
Something to consider if we see a lot of allocations here: it may be helpful to have a 2-set version of AreAllColsEquiv
to avoid Union
.
pkg/sql/opt/props/func_dep.go
line 1889 at r5 (raw file):
break } }
nit: In a subsequent PR/commit, consider returning the added equiv colset from AddNoCopy
so we don't have to search for it here.
pkg/sql/opt/props/equiv_set_test.go
line 73 at r3 (raw file):
} var equivSet EquivGroups
nit: rename equivSet
in the renaming commit.
pkg/sql/opt/props/equiv_set.go
line 187 at r5 (raw file):
} for i := range fdset.equiv.groups { // No copy is necessary because the equiv groups are immutable.
Is it really "necessary", or just "safe"?
This commit renames `EquivSet` to `EquivGroups` to better reflect the fact that it tracks _groups_ of equivalent groups, rather than a single set of equivalent columns. Epic: None Release note: None
This commit removes the inline buffer from `EquivGroups`, as well as the `NewEquivGroups()` that was previously used to set up the buffer. This simplifies usage, and doesn't appear to affect performance when the set is used in `JoinOrderBuilder` or in `FuncDepSet`. Informs cockroachdb#83963 Release note: None
This commit adds several new methods along with unit tests to `EquivGroups` to prepare its use in tracking equivalencies in `FuncDepSet`. Informs cockroachdb#83963 Release note: None
695c02a
to
b784dfe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/opt/props/equiv_set.go
line 37 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: Methods that don't mutate the EquivGroup probably don't need to verify the set, do they? Or is this to catch cases where the user incorrectly mutates a ColSet returned from e.g.
Group
without copying it? In that case, it might be worth a comment mentioning that—Maybe a single comment onverify
instead of comments throughout? Up to you.
Yes, that's exactly why. Done.
pkg/sql/opt/props/equiv_set.go
line 62 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: "The returned should.." => "The returned ColSet should..."
Done.
pkg/sql/opt/props/equiv_set.go
line 103 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: This can be
return false
since we already checked if the group containsleft
.
Oh, good point. I ended up restructuring it to check containsLeft || containsRight
, and then just return containsLeft && containsRight
.
pkg/sql/opt/props/equiv_set.go
line 251 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit:
(4,8)
=>(4,7,8)
Done. Good catch
pkg/sql/opt/props/equiv_set.go
line 286 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: make the comment louder about the fact that the groups must be disjoint.
Done.
pkg/sql/opt/props/equiv_set.go
line 187 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Is it really "necessary", or just "safe"?
Confusing phrasing. I just meant we don't have to copy in order to be safe. I changed the wording a bit.
pkg/sql/opt/props/func_dep.go
line 648 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: Update this comment to mention the equiv groups
Done.
pkg/sql/opt/props/func_dep.go
line 1690 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This verbose formatting of equiv sets is now vestigial. I think it's good to not create extra test output churn in this commit, but maybe we should consider a new format for equivalencies in the near future, like:
fd: (1)-->(2,3), ==(4,5), ==(9-11)
Yeah, I like that. Or maybe like this:
fd: (1)-->(2,3) equiv: (4,5), (9-11)
pkg/sql/opt/props/func_dep.go
line 1821 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Something to consider if we see a lot of allocations here: it may be helpful to have a 2-set version of
AreAllColsEquiv
to avoidUnion
.
Good idea. I wonder if there are any other places where that could help. Do you think it's worth leaving a TODO?
pkg/sql/opt/props/func_dep.go
line 1889 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: In a subsequent PR/commit, consider returning the added equiv colset from
AddNoCopy
so we don't have to search for it here.
Good idea. I added it to the commit that introduces the new methods and tests.
pkg/sql/opt/props/equiv_set_test.go
line 73 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: rename
equivSet
in the renaming commit.
Done.
b784dfe
to
3aff919
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r7, 3 of 4 files at r8, 63 of 63 files at r9, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/opt/props/func_dep.go
line 1821 at r5 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Good idea. I wonder if there are any other places where that could help. Do you think it's worth leaving a TODO?
Probably not necessary.
Previously, `FuncDepSet` was fairly wasteful in how it tracked sets of equivalent columns: for each column in the equiv group, an FD was maintained from that column to all other columns in the group. This meant that there were `2n` `ColSets` for each equiv group (where `n` is the number of columns in the group). This patch modifies `FuncDepSet` and its internals to use `props.EquivGroups` instead, which keeps a single `ColSet` for each equiv group. This significantly cuts down on allocations for queries with many columns and equalities, both because less `ColSets` spill to heap, and because less FDs are added to the `deps` slice. Fixes cockroachdb#83963 Release note: None
3aff919
to
15b7d7e
Compare
TFTR! bors r+ |
props: don't leave unmerged equiv groups in EquivSet
This commit fixes a bug in
EquivSet.tryMergeGroups
, which could causean invariant (that equiv groups are non-intersecting) to be violated.
This would potentially cause re-ordered joins to have redundant equality
filters, and prevent join filter push-down in rare cases. This commit
also adds verification to the set for test builds, and a unit test for
the
Add
method.Fixes #137381
Release note: None
props: rename EquivSet to EquivGroups
This commit renames
EquivSet
toEquivGroups
to better reflect thefact that it tracks groups of equivalent groups, rather than a single
set of equivalent columns.
Epic: None
Release note: None
props: remove initial buffer from EquivGroups
This commit removes the inline buffer from
EquivGroups
, as well as theNewEquivGroups()
that was previously used to set up the buffer. Thissimplifies usage, and doesn't appear to affect performance when the
set is used in
JoinOrderBuilder
or inFuncDepSet
.Informs #83963
Release note: None
props: add methods to EquivGroups for use in FuncDepSet
This commit adds several new methods along with unit tests to
EquivGroups
to prepare its use in tracking equivalencies in
FuncDepSet
.Informs #83963
Release note: None
opt/props: use EquivGroups in FuncDepSet for tracking equivalences
Previously,
FuncDepSet
was fairly wasteful in how it tracked sets ofequivalent columns: for each column in the equiv group, an FD was maintained
from that column to all other columns in the group. This meant that there
were
2n
ColSets
for each equiv group (wheren
is the number of columnsin the group).
This patch modifies
FuncDepSet
and its internals to useprops.EquivGroups
instead, which keeps a single
ColSet
for each equiv group. This significantlycuts down on allocations for queries with many columns and equalities, both
because less
ColSets
spill to heap, and because less FDs are added tothe
deps
slice.Fixes #83963
Release note: None