Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SELECT DISTINCT queries support #264

Merged
merged 10 commits into from
Oct 31, 2020

Conversation

tdakkota
Copy link
Contributor

Adds new DISTINCT keyword
Uses hash set to make projection result unique.

Copy link
Collaborator

@asdine asdine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

sql/planner/projection.go Outdated Show resolved Hide resolved
sql/query/select_test.go Outdated Show resolved Hide resolved
@codecov-io
Copy link

codecov-io commented Oct 23, 2020

Codecov Report

Merging #264 into master will increase coverage by 0.20%.
The diff coverage is 43.58%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #264      +/-   ##
==========================================
+ Coverage   61.32%   61.53%   +0.20%     
==========================================
  Files          68       71       +3     
  Lines        6345     6574     +229     
==========================================
+ Hits         3891     4045     +154     
- Misses       1938     1996      +58     
- Partials      516      533      +17     
Impacted Files Coverage Δ
sql/planner/hash_set.go 0.00% <0.00%> (ø)
sql/planner/tree.go 37.57% <ø> (ø)
sql/scanner/token.go 22.22% <ø> (ø)
sql/parser/select.go 62.59% <44.44%> (-1.34%) ⬇️
sql/planner/distinct.go 61.53% <61.53%> (ø)
sql/planner/optimizer.go 82.24% <70.96%> (-1.92%) ⬇️
sql/query/alter.go 33.33% <0.00%> (-26.67%) ⬇️
database/config.go 67.85% <0.00%> (-0.74%) ⬇️
sql/parser/expr.go 77.12% <0.00%> (-0.69%) ⬇️
database/table.go 56.70% <0.00%> (-0.55%) ⬇️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f5afd65...fc697f0. Read the comment docs.

Comment on lines 27 to 44
func (s documentHashSet) generateKey(d document.Document) (uint64, error) {
defer s.hash.Reset()

err := d.Iterate(func(field string, value document.Value) error {
var buf []byte
buf, err := key.AppendValue(buf, value)
if err != nil {
return err
}

_, err = s.hash.Write(buf)
if err != nil {
return err
}

buf = buf[0:]
return nil
})
if err != nil {
return 0, err
}

return s.hash.Sum64(), nil
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function relies on Iterate function implementation, so if it does not guarantee order of field iteration (for example map-based) then hash will be incorrect.
For now, Genji does not have such implementation, but it can be added later.
I think document.Value/document.Document should have some Hash function to use it in hash table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not rely on the implicit behavior of Document implementations as the Document interface doesn't guarantee the order of iteration.
Instead, we must use the document.Fields() function, which gets the document keys and sorts them, and use GetByField for each key.

Copy link
Contributor Author

@tdakkota tdakkota Oct 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pk()/literal tests were failing, when I used documentMask.GetByField, it tries to get a value from masked document, but literals and functions values are not stored in document.
So, I added IterateInOrder function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears to be a bug, they should be stored in DocumentMask:

genji> create table foo;
genji> insert into foo(a) values (1), (2), (3);
genji> select pk() from foo;
{
  "pk()": 1
}
{
  "pk()": 2
}
{
  "pk()": 3
}

Copy link
Contributor

@tie tie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have some comments after skimming through the changes.

sql/planner/hash_set.go Outdated Show resolved Hide resolved
sql/planner/optimizer.go Outdated Show resolved Hide resolved
@tie
Copy link
Contributor

tie commented Oct 24, 2020

@tdakkota thanks! Can you please mark the discussions as resolved? It looks like only PR author can do that in GitHub…

sql/planner/distinct.go Outdated Show resolved Hide resolved
sql/planner/optimizer.go Show resolved Hide resolved
sql/planner/tree.go Outdated Show resolved Hide resolved
sql/planner/hash_set.go Outdated Show resolved Hide resolved
document/document.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@asdine asdine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tdakkota !

@asdine asdine merged commit 3f06b4a into chaisql:master Oct 31, 2020
@asdine asdine added this to the v0.9.0 milestone Oct 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants