gen4: Add hash join primitive and planning #9140

Merged on Nov 22, 2021 (40 commits)
Changes from 29 commits
Commits
f8c7a3d first stab at hash join implementation (systay, Nov 4, 2021)
3dd0a40 addition of the HashJoin logical plan (frouioui, Nov 4, 2021)
aad4f3a add predicates to the join engine description (GuptaManan100, Nov 9, 2021)
b8619e0 bug fix in hash join (GuptaManan100, Nov 9, 2021)
8825dc4 added test for hash join (GuptaManan100, Nov 9, 2021)
719507b added stream execute function for hash join primitive (GuptaManan100, Nov 9, 2021)
531a285 Merge branch main into hash-join (systay, Nov 9, 2021)
308abc1 update test assertions (systay, Nov 9, 2021)
887167f change type of hashcodes, and start testing (systay, Nov 13, 2021)
2cf173c added toType to hashCode and randomized testing (systay, Nov 15, 2021)
c1ab897 feat: added comparison collation and type to the hash join primitive (frouioui, Nov 15, 2021)
10c00c6 feat: updated plan_tests with authoritative type on user/user_extra col (frouioui, Nov 15, 2021)
c24fd0a Merge remote-tracking branch 'upstream/main' into hash-join (frouioui, Nov 15, 2021)
9c51420 test: update plan test output after merge (frouioui, Nov 15, 2021)
e5418aa feat: check if text can be hashed using collation (frouioui, Nov 15, 2021)
da09524 feat: support for string values in NullsafeHashcode (frouioui, Nov 15, 2021)
7316a9f feat: improve NullsafeCompare with the new coercion and cast functions (frouioui, Nov 15, 2021)
288abdc refactor: renamed variables and added comments (GuptaManan100, Nov 16, 2021)
1e38710 feat: support collations in distinct primitive comparisons (GuptaManan100, Nov 16, 2021)
d69ab6f test: added test for collation support in distinct primitive (GuptaManan100, Nov 16, 2021)
329c80a feat: use collation info when doing DISTINCT operations (systay, Nov 16, 2021)
fa592f3 refactor: clean up code and added comments (systay, Nov 16, 2021)
c6b3dac refactor: clean up code and add comments (systay, Nov 16, 2021)
7104c38 doc: added TODOs for future improvements (systay, Nov 16, 2021)
5d8f46b test: turn off test assertion (systay, Nov 16, 2021)
e6c47f1 test: make test find the offset by name (systay, Nov 16, 2021)
90fb862 Merge remote-tracking branch main into hash-join (systay, Nov 16, 2021)
b96ead0 feat: disallow aggregation on top of hash joins (systay, Nov 16, 2021)
24e2ae0 refactor: extract method and add comments to explain the constants (systay, Nov 16, 2021)
6b40696 feat: addition of HashJoinDirective to parse ALLOW_HASH_JOIN directives (frouioui, Nov 17, 2021)
e81cf5d feat: use the ALLOW_HASH_JOIN hint in the planner to plan hash joins (frouioui, Nov 17, 2021)
3718f8a test: added a new test for the HashJoin engine primitive (frouioui, Nov 17, 2021)
63181f3 Merge branch main into hash-join (systay, Nov 17, 2021)
319f937 evalengine: use ParseFloatPrefix for parsing floats (vmg, Nov 17, 2021)
08b39ab evalengine: do not allocate when parsing (vmg, Nov 17, 2021)
45ea9b2 refactor: clean up code after code review (systay, Nov 17, 2021)
c64fb98 test: add more queries instead of changing queries to use hash joins (systay, Nov 17, 2021)
cba1975 test: use the hash join hint in the end to end test (systay, Nov 22, 2021)
51b97da feat: turn off the cost check before using hash joins. we'll just rel… (systay, Nov 22, 2021)
ccee93d refactor: move collation check inside canHashJoin function (systay, Nov 22, 2021)
6 changes: 3 additions & 3 deletions go/mysql/collations/8bit.go
@@ -82,7 +82,7 @@ func (c *Collation_8bit_bin) WeightString(dst, src []byte, numCodepoints int) []
 	return weightStringPadingSimple(' ', dst, numCodepoints-copyCodepoints, padToMax)
 }

-func (c *Collation_8bit_bin) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_8bit_bin) Hash(src []byte, numCodepoints int) HashCode {
 	hash := 0x8b8b0000 | uintptr(c.id)
 	if numCodepoints == 0 {
 		return memhash(src, hash)
@@ -164,7 +164,7 @@ func (c *Collation_8bit_simple_ci) WeightString(dst, src []byte, numCodepoints i
 	return weightStringPadingSimple(' ', dst, numCodepoints-copyCodepoints, padToMax)
 }

-func (c *Collation_8bit_simple_ci) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_8bit_simple_ci) Hash(src []byte, numCodepoints int) HashCode {
 	sortOrder := c.sort

 	var tocopy = len(src)
@@ -251,7 +251,7 @@ func (c *Collation_binary) WeightString(dst, src []byte, numCodepoints int) []by
 	return dst
 }

-func (c *Collation_binary) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_binary) Hash(src []byte, numCodepoints int) HashCode {
 	if numCodepoints > 0 {
 		src = src[:numCodepoints]
 	}
4 changes: 3 additions & 1 deletion go/mysql/collations/collation.go
@@ -119,7 +119,7 @@ type Collation interface {
 	// the hash will interpret the source string as if it were stored in a `CHAR(n)` column. If the value of
 	// numCodepoints is 0, this is equivalent to setting `numCodepoints = RuneCount(src)`.
 	// For collations with NO PAD, the numCodepoint argument is ignored.
-	Hash(src []byte, numCodepoints int) uintptr
+	Hash(src []byte, numCodepoints int) HashCode

 	// Charset returns the Charset with which this collation is encoded
 	Charset() charset.Charset
@@ -128,6 +128,8 @@ type Collation interface {
 	IsBinary() bool
 }

+type HashCode = uintptr
+
 const PadToMax = math.MaxInt32

 func minInt(i1, i2 int) int {
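
The `Hash` doc comment and the `HashCode` alias in the hunks above can be illustrated with a small sketch. This is not the Vitess implementation: `hashPadSpace` is a hypothetical helper, FNV-1a from the standard library stands in for the internal `memhash`, and truncating to `numCodepoints` is an assumption based on the `CHAR(n)` wording. It only shows that padding to a fixed codepoint count makes trailing spaces irrelevant, and that an alias declared with `=` is interchangeable with `uintptr`.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"unicode/utf8"
)

// HashCode mirrors the alias from the diff. Because it is declared with `=`,
// it is the same type as uintptr, not a new defined type.
type HashCode = uintptr

// hashPadSpace is a hypothetical stand-in for a PAD SPACE collation's Hash.
// It hashes src as if it were stored in a CHAR(numCodepoints) column:
// longer input is cut off (an assumption), shorter input is space-padded.
// numCodepoints == 0 means "use the rune count of src".
func hashPadSpace(src []byte, numCodepoints int) HashCode {
	if numCodepoints == 0 {
		numCodepoints = utf8.RuneCount(src)
	}
	h := fnv.New64a() // swapped in for Vitess's memhash, for illustration only
	written := 0
	for len(src) > 0 && written < numCodepoints {
		_, size := utf8.DecodeRune(src)
		h.Write(src[:size])
		src = src[size:]
		written++
	}
	for ; written < numCodepoints; written++ {
		h.Write([]byte{' '})
	}
	return HashCode(h.Sum64())
}

func main() {
	// Same hash: "ab" padded to 4 codepoints is byte-identical to "ab  ".
	fmt.Println(hashPadSpace([]byte("ab"), 4) == hashPadSpace([]byte("ab  "), 4))

	// No conversion is needed between HashCode and uintptr.
	var raw uintptr = hashPadSpace([]byte("ab"), 0)
	fmt.Println(raw != 0 || raw == 0)
}
```

Since the alias keeps the two types identical, existing callers that stored the result in a `uintptr` would keep compiling after the signature change.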
2 changes: 1 addition & 1 deletion go/mysql/collations/multibyte.go
@@ -147,7 +147,7 @@ func (c *Collation_multibyte) WeightString(dst, src []byte, numCodepoints int) [
 	return dst
 }

-func (c *Collation_multibyte) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_multibyte) Hash(src []byte, numCodepoints int) HashCode {
 	cs := c.charset
 	sortOrder := c.sort

2 changes: 1 addition & 1 deletion go/mysql/collations/remote/collation.go
@@ -169,7 +169,7 @@ func (c *Collation) WeightString(dst, src []byte, numCodepoints int) []byte {
 	return dst
 }

-func (c *Collation) Hash(_ []byte, _ int) uintptr {
+func (c *Collation) Hash(_ []byte, _ int) collations.HashCode {
 	panic("unsupported: Hash for remote collations")
 }

6 changes: 3 additions & 3 deletions go/mysql/collations/uca.go
@@ -161,7 +161,7 @@ performPadding:
 	return dst
 }

-func (c *Collation_utf8mb4_uca_0900) Hash(src []byte, _ int) uintptr {
+func (c *Collation_utf8mb4_uca_0900) Hash(src []byte, _ int) HashCode {
 	var hash = uintptr(c.id)

 	it := c.uca.Iterator(src)
@@ -234,7 +234,7 @@ func (c *Collation_utf8mb4_0900_bin) WeightString(dst, src []byte, numCodepoints
 	return dst
 }

-func (c *Collation_utf8mb4_0900_bin) Hash(src []byte, _ int) uintptr {
+func (c *Collation_utf8mb4_0900_bin) Hash(src []byte, _ int) HashCode {
 	return memhash(src, 0xb900b900)
 }

@@ -340,7 +340,7 @@ func (c *Collation_uca_legacy) WeightString(dst, src []byte, numCodepoints int)
 	return dst
 }

-func (c *Collation_uca_legacy) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_uca_legacy) Hash(src []byte, numCodepoints int) HashCode {
 	it := c.uca.Iterator(src)
 	defer it.Done()

4 changes: 2 additions & 2 deletions go/mysql/collations/unicode.go
@@ -124,7 +124,7 @@ func (c *Collation_unicode_general_ci) WeightString(dst, src []byte, numCodepoin
 	return dst
 }

-func (c *Collation_unicode_general_ci) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_unicode_general_ci) Hash(src []byte, numCodepoints int) HashCode {
 	unicaseInfo := c.unicase
 	cs := c.charset

@@ -278,7 +278,7 @@ func (c *Collation_unicode_bin) weightStringUnicode(dst, src []byte, numCodepoin
 	return dst
 }

-func (c *Collation_unicode_bin) Hash(src []byte, numCodepoints int) uintptr {
+func (c *Collation_unicode_bin) Hash(src []byte, numCodepoints int) HashCode {
 	if c.charset.SupportsSupplementaryChars() {
 		return c.hashUnicode(src, numCodepoints)
 	}
5 changes: 5 additions & 0 deletions go/sqltypes/type.go
@@ -91,6 +91,11 @@ func IsDate(t querypb.Type) bool {
 	return t == Datetime || t == Date || t == Timestamp || t == Time
 }

+// IsNull returns true if the type is NULL type
+func IsNull(t querypb.Type) bool {
+	return t == Null
+}
+
 // Vitess data types. These are idiomatically
 // named synonyms for the querypb.Type values.
 // Although these constants are interchangeable,
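
The hunk above does not show where `IsNull` is called, so the following is only a guess at one plausible use: skipping NULL keys while building the in-memory table for an equi-join, since `NULL = NULL` is not true in SQL. Everything except the body of `IsNull` is hypothetical: `Type` is a local stand-in for `querypb.Type`, and `key` and `buildTable` are invented for the illustration. Keys are grouped by their string value only, which a real implementation would not do across different types.

```go
package main

import "fmt"

// Type is a hypothetical stand-in for querypb.Type; the values are illustrative.
type Type int32

const (
	Null Type = iota
	Int64
	VarChar
)

// IsNull mirrors the helper added in the diff, on the stand-in type.
func IsNull(t Type) bool {
	return t == Null
}

// key is a hypothetical typed join key.
type key struct {
	typ Type
	val string
}

// buildTable maps each non-NULL key value to the indexes of the rows
// that carry it. Rows with a NULL key are left out because they can
// never satisfy an equality join condition.
func buildTable(keys []key) map[string][]int {
	table := make(map[string][]int)
	for i, k := range keys {
		if IsNull(k.typ) {
			continue
		}
		table[k.val] = append(table[k.val], i)
	}
	return table
}

func main() {
	keys := []key{{Int64, "1"}, {Null, ""}, {Int64, "1"}, {VarChar, "x"}}
	table := buildTable(keys)
	fmt.Println(len(table), table["1"]) // 2 [0 2]
}
```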
10 changes: 5 additions & 5 deletions go/test/endtoend/vtgate/gen4/column_name_test.go
@@ -20,6 +20,8 @@ import (
 	"context"
 	"testing"

+	"vitess.io/vitess/go/test/endtoend/vtgate/utils"
+
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"

@@ -35,12 +37,10 @@ func TestColumnNames(t *testing.T) {
 	require.NoError(t, err)
 	defer conn.Close()

-	_, err = exec(t, conn, "create table uks.t2(id bigint,phone bigint,msg varchar(100),primary key(id)) Engine=InnoDB")
-	require.NoError(t, err)
-	defer exec(t, conn, "drop table uks.t2")
+	utils.Exec(t, conn, "create table uks.t2(id bigint,phone bigint,msg varchar(100),primary key(id)) Engine=InnoDB")
+	defer utils.Exec(t, conn, "drop table uks.t2")

-	qr, err := exec(t, conn, "SELECT t1.id as t1id, t2.id as t2id, t2.phone as t2phn FROM ks.t1 cross join uks.t2 where t1.id = t2.id ORDER BY t2.phone")
-	require.NoError(t, err)
+	qr := utils.Exec(t, conn, "SELECT t1.id as t1id, t2.id as t2id, t2.phone as t2phn FROM ks.t1 cross join uks.t2 where t1.id = t2.id ORDER BY t2.phone")

 	assert.Equal(t, 3, len(qr.Fields))
 	assert.Equal(t, "t1id", qr.Fields[0].Name)
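
The test change above swaps an `exec` call followed by `require.NoError` for a single `utils.Exec` call that returns only the result. The body of `utils.Exec` is not part of this diff, so the sketch below only illustrates the general pattern of a helper that does the error check itself. All names in it (`tb`, `conn`, `mustExec`, the fakes) are hypothetical, and the row count stands in for the real query result.

```go
package main

import (
	"errors"
	"fmt"
)

// tb is the subset of *testing.T the helper needs; *testing.T satisfies it.
type tb interface {
	Helper()
	Fatalf(format string, args ...interface{})
}

// conn is a hypothetical stand-in for the connection used in the test.
type conn interface {
	Query(sql string) (rows int, err error)
}

// mustExec runs the query and fails the test on error, so call sites
// no longer need their own error check.
func mustExec(t tb, c conn, sql string) int {
	t.Helper() // report failures at the caller's line
	rows, err := c.Query(sql)
	if err != nil {
		t.Fatalf("query %q failed: %v", sql, err)
	}
	return rows
}

// fakeT records failures instead of stopping the test, unlike the real
// testing.T.Fatalf, so the sketch can run outside `go test`.
type fakeT struct{ failures []string }

func (f *fakeT) Helper() {}
func (f *fakeT) Fatalf(format string, args ...interface{}) {
	f.failures = append(f.failures, fmt.Sprintf(format, args...))
}

// fakeConn returns three rows, or the configured error.
type fakeConn struct{ err error }

func (c fakeConn) Query(string) (int, error) {
	if c.err != nil {
		return 0, c.err
	}
	return 3, nil
}

var errBoom = errors.New("boom")

func main() {
	t := &fakeT{}
	fmt.Println(mustExec(t, fakeConn{}, "select 1"), len(t.failures))
	mustExec(t, fakeConn{err: errBoom}, "select 1")
	fmt.Println(len(t.failures))
}
```

One consequence of this pattern shows up in the diff itself: `defer utils.Exec(t, conn, "drop table uks.t2")` now reports a failing cleanup, whereas the old `defer exec(...)` discarded the returned error.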