-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
physicalplan: add support for multi-stage execution of corr, covar_samp, sqrdiff, and regr_count aggregate functions. #76754
Conversation
Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. I have added a few people who may be able to assist in reviewing:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/sem/builtins/aggregate_builtins.go, line 3698 at r1 (raw file):
dd := res.(*tree.DDecimal) _, err = tree.DecimalCtx.Round(&dd.Decimal, &a.sqrDiff)
We need to round here to decimal context to get rid of differences caused by reordering (discussed previously).
pkg/sql/sem/builtins/aggregate_builtins.go, line 4046 at r1 (raw file):
return tree.DNull, nil } sqrDiff, err := a.agg.intermediateResult()
variance and var_pop use sqrdiff as intermediate calculations, so no rounding here.
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
query FFFFF SELECT covar_pop(y, x), covar_pop(int_y, int_x), covar_pop(y, int_x), covar_pop(int_y, x), round(covar_pop(dy, dx), 7)
I split the covar_pop and covar_samp because logic test failed with fakedist-* configs. To reproduce the select should look like:
select covar_pop(y, x), covar_samp(y, x)
from statistics_agg_test
i.e. only decimal calculations were affected. I tried to combine different functions like regr_sxx and regr_sxy and tests also failed.
Should the tests run for each function separately?
Thank you for updating your pull request. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I think you're almost ready to close that meta issue :)
Reviewed 7 of 9 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mneverov and @yuzefovich)
-- commits, line 5 at r2:
nit: I think these are the last remaining aggregates, so you could use Fixes
here to automatically close the issue when the PR merges.
pkg/sql/sem/builtins/aggregate_builtins.go, line 3698 at r1 (raw file):
Previously, mneverov (Max Neverov) wrote…
We need to round here to decimal context to get rid of differences caused by reordering (discussed previously).
nit: consider adding this note as a comment in the code.
pkg/sql/sem/builtins/aggregate_builtins.go, line 256 at r2 (raw file):
)), // The input signature is: SQDIFF, SUM, COUNT
nit: s/SQDIFF/SQRDIFF/
.
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
Previously, mneverov (Max Neverov) wrote…
I split the covar_pop and covar_samp because logic test failed with fakedist-* configs. To reproduce the select should look like:
select covar_pop(y, x), covar_samp(y, x) from statistics_agg_test
i.e. only decimal calculations were affected. I tried to combine different functions like regr_sxx and regr_sxy and tests also failed.
Should the tests run for each function separately?
Huh, this is surprising to me. Do you have an idea for the reasons of such behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Huh, this is surprising to me. Do you have an idea for the reasons of such behavior?
I did some experiments with Reset
function as I thought it was the cause, but results were the same. I also tried different combinations of the aggregate functions. I don't have a theory except that there is a bug in the aggregate functions implementation. I'll try to hunt it down.
Thank you for updating your pull request. Before a member of our team reviews your PR, I have some potential action items for you:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
Previously, mneverov (Max Neverov) wrote…
I did some experiments with
Reset
function as I thought it was the cause, but results were the same. I also tried different combinations of the aggregate functions. I don't have a theory except that there is a bug in the aggregate functions implementation. I'll try to hunt it down.
I found that regressionAccumulatorDecimalBase sxx
, syy
, and sxy
fields get corrupted. To reproduce:
make testlogic FILES=aggregate SUBTESTS='corrupt_combine'
combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine AFTER n1=3 sx1=7E+1 sxx1=316.66666666666666667 sy1=6 syy1=2 sxy1=25
combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine BEFORE n1=2 sx1=35 sxx1=316.66666666666666667 sy1=3 syy1=0.2 sxy1=2.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine AFTER n1=3 sx1=7E+1 sxx1=520.83333333333333333 sy1=6 syy1=1.7 sxy1=2E+1
combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
[20:13:15] -- FAIL
logic.go:2498:
expected:
8.33333333333333 12.5
but found (query options: "") :
6.66666666666667 10
Since combine function is executed for each covar_pop
, covar_samp
their arguments should not affect each other execution, i.e. for both executions it should be:
combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
, but in the second case sxx=316.6
i.e. the result of the previous execution, and syy=0.2, sxy=2.5
i.e. results of the prev. execution divided by 10.
I couldn't find the reason for that yet. There is no math. operations in aggregate_builtins
that suit this pattern as well as far as I can see no reference leakage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mneverov and @yuzefovich)
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
Previously, mneverov (Max Neverov) wrote…
I found that regressionAccumulatorDecimalBase
sxx
,syy
, andsxy
fields get corrupted. To reproduce:make testlogic FILES=aggregate SUBTESTS='corrupt_combine' combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5 combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0 combine AFTER n1=3 sx1=7E+1 sxx1=316.66666666666666667 sy1=6 syy1=2 sxy1=25 combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0 combine BEFORE n1=2 sx1=35 sxx1=316.66666666666666667 sy1=3 syy1=0.2 sxy1=2.5 combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0 combine AFTER n1=3 sx1=7E+1 sxx1=520.83333333333333333 sy1=6 syy1=1.7 sxy1=2E+1 combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0 [20:13:15] -- FAIL logic.go:2498: expected: 8.33333333333333 12.5 but found (query options: "") : 6.66666666666667 10Since combine function is executed for each
covar_pop
,covar_samp
their arguments should not affect each other execution, i.e. for both executions it should be:combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5 combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
, but in the second case
sxx=316.6
i.e. the result of the previous execution, andsyy=0.2, sxy=2.5
i.e. results of the prev. execution divided by 10.
I couldn't find the reason for that yet. There is no math. operations inaggregate_builtins
that suit this pattern as well as far as I can see no reference leakage
I pushed the commit that seems to fix things. TBH not 100% why it works - seems like we're mutating the decimals we don't want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I pushed the commit that seems to fix things. TBH not 100% why it works - seems like we're mutating the decimals we don't want to.
Thanks for your help! and sorry i introduced the bug 😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on all of these aggregate functions!
Reviewed 2 of 2 files at r6, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mneverov)
pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):
Previously, mneverov (Max Neverov) wrote…
Thanks for your help! and sorry i introduced the bug 😬
Np!
bors r=yuzefovich |
There is a merge conflict with #76963, I'll wait for that PR to merge, rebase this one and will bors it then. |
sqrdiff, and regr_count aggregate functions. Fixes cockroachdb#58347. Release note (performance improvement): corr, covar_samp, sqrdiff, and regr_count aggregate functions are now evaluated more efficiently in a distributed setting
bors r=yuzefovich |
Build succeeded: |
Fixes: #58347.
Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting