Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect results returned by min, max and avg #1000

Merged

Conversation

Yury-Fridlyand
Copy link
Collaborator

@Yury-Fridlyand Yury-Fridlyand commented Nov 1, 2022

Fix incorrect results returned by min, max and avg when processing null/missing values only.

Signed-off-by: Yury-Fridlyand [email protected]

See team review & discussion in Bit-Quill#145

Description

Before

opensearchsql> select min(int0) from calcs where int0 IS NULL;
fetched rows / total rows = 1/1
+-------------+
| min(int0)   |
|-------------|
| 2147483647  |
+-------------+

After

opensearchsql> select min(int0) from calcs where int0 IS NULL;
fetched rows / total rows = 1/1
+-------------+
| min(int0)   |
|-------------|
| null        |
+-------------+

Fixed:

  • min
  • max
  • avg

Not fixed:

  • sum
opensearchsql> select sum(int0) from calcs where int0 IS NULL;
fetched rows / total rows = 1/1
+-------------+
| sum(int0)   |
|-------------|
| 0           |
+-------------+
mysql> select sum(int0) from calcs where int0 IS NULL;

+-----------+
| sum(int0) |
+-----------+
|      NULL |
+-----------+
1 row in set (0.00 sec)

This requires much more changes.

Not affected

  • var_samp
  • var_pop
  • stddev_samp
  • stddev_pop

Issues Resolved

Fixes #817

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

… on null/missing values.

Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Fix incorrect results returned by `min`, `max` and `avg`
@Yury-Fridlyand Yury-Fridlyand requested a review from a team as a code owner November 1, 2022 00:51
@codecov-commenter
Copy link

codecov-commenter commented Nov 1, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.60%. Comparing base (bfad32c) to head (7d60f22).
Report is 518 commits behind head on 2.x.

Additional details and impacted files
@@             Coverage Diff              @@
##                2.x    #1000      +/-   ##
============================================
- Coverage     97.95%   97.60%   -0.35%     
- Complexity     3137     3186      +49     
============================================
  Files           303      308       +5     
  Lines          7783     7983     +200     
  Branches        501      520      +19     
============================================
+ Hits           7624     7792     +168     
- Misses          158      190      +32     
  Partials          1        1              
Flag Coverage Δ
sql-engine 97.60% <100.00%> (-0.35%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

MaxKsyunz
MaxKsyunz previously approved these changes Nov 1, 2022
penghuo
penghuo previously approved these changes Nov 1, 2022
Copy link
Collaborator

@penghuo penghuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@dai-chen
Copy link
Collaborator

dai-chen commented Nov 2, 2022

The CodeQL is failing and seems caused by some regular expression in GrokUtils. I assume it's introduced by other PR earlier? https://github.com/opensearch-project/sql/pull/1000/checks?check_run_id=9215922956

@Yury-Fridlyand
Copy link
Collaborator Author

The CodeQL is failing and seems caused by some regular expression in GrokUtils. I assume it's introduced by other PR earlier? https://github.com/opensearch-project/sql/pull/1000/checks?check_run_id=9215922956

Yes, see #889, introduced in #813.

@dai-chen dai-chen added the bug Something isn't working label Nov 2, 2022
@dai-chen
Copy link
Collaborator

dai-chen commented Nov 2, 2022

The CodeQL is failing and seems caused by some regular expression in GrokUtils. I assume it's introduced by other PR earlier? https://github.com/opensearch-project/sql/pull/1000/checks?check_run_id=9215922956

Yes, see #889, introduced in #813.

I see. Not sure why it doesn't fail on all other PRs.

Btw, to follow the same procedure, should we choose 2.x branch as target and label with backport 2.4

@Yury-Fridlyand Yury-Fridlyand changed the base branch from 2.4 to 2.x November 2, 2022 18:54
@Yury-Fridlyand Yury-Fridlyand dismissed stale reviews from penghuo and MaxKsyunz November 2, 2022 18:54

The base branch was changed.

@Yury-Fridlyand
Copy link
Collaborator Author

Done. Please, re-review.

@dai-chen
Copy link
Collaborator

dai-chen commented Nov 2, 2022

Done. Please, re-review.

Looks good. Thanks!

@dai-chen dai-chen merged commit 4282450 into opensearch-project:2.x Nov 3, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 3, 2022
* Fix incorrect results returned by `min`, `max` and `avg` aggregations on null/missing values.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix indentation.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Activate and fix integration tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add more tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

Signed-off-by: Yury-Fridlyand <[email protected]>
(cherry picked from commit 4282450)
dai-chen pushed a commit that referenced this pull request Nov 3, 2022
* Fix incorrect results returned by `min`, `max` and `avg` aggregations on null/missing values.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix indentation.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Activate and fix integration tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add more tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

Signed-off-by: Yury-Fridlyand <[email protected]>
(cherry picked from commit 4282450)

Co-authored-by: Yury-Fridlyand <[email protected]>
@Yury-Fridlyand Yury-Fridlyand deleted the integ-fix-aggreg-on-nulls branch November 16, 2022 00:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.4 bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Aggregation functions return invalid result when processing NULLs
5 participants