Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexecwindow: implement first_value, last_value, and nth_value #67764

Merged
merged 4 commits into from
Jul 20, 2021

Conversation

DrewKimball
Copy link
Collaborator

@DrewKimball DrewKimball commented Jul 19, 2021

logictest: add testfile to ensure that window functions are vectorized

This commit adds a logic testfile that executes the window functions
currently supported by the vectorized engine with
SET vectorize = experimental_always. This tests that supported
window functions can actually be executed in the vectorized engine.

Release note: None

colexecwindow: add first_value, last_value, and nth_value

This patch adds vectorized implementations of the first_value,
last_value, and nth_value window functions. These functions
return an expression evaluated at the first, last, and nth positions
respectively in the window frame for the current row.

Release note (sql change): first_value, last_value, and nth_value
window functions can now be executed in the vectorized engine. This
allows for faster execution time, and also removes the need for
conversions to and from row format.

colexecwindow: test window framer with different memory limits

Previously, the TestWindowFramer used math.MaxInt64 as the memory
limit for the SpillingBuffer used to buffer all tuples in a partition.
This patch modifies the test to choose a random memory limit and fixes
a bug that could cause windowFramer operators to skip a peer group
when seeking to frame start and end indexes in RANGE or GROUPS mode.

Release note: None

colexec: add range check for binary operations with timestamps

This commit adds a range check after additions or subtractions that
produce a timestamp result. This ensures that the vectorized engine
produces the same error as the row engine when the result of the
operation is out of range.

Release note: None

Fixes #37035

@DrewKimball DrewKimball requested review from yuzefovich and a team July 19, 2021 18:16
@DrewKimball DrewKimball requested a review from a team as a code owner July 19, 2021 18:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! This one is now quite easy given all the previous hard work. I only have a few minor comments.

One kind of test that I realized we're missing is making sure that all currently supported by the vectorized engine window functions are, in fact, being executed via the vectorized. I think originally, when I added a couple, I introduced some simple queries with row_number, rank, dense_rank, percent_rank, and cume_dist at the bottom of window logic test file, but it would probably be worth moving those into a separate file vectorize_window or something like that. There, we would use SET vectorize=experimental_always to make sure that the vectorized is used. And then extend the file with the recent increased support.

Reviewed 6 of 6 files at r1, 3 of 3 files at r2, 15 of 15 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/colexec/colexecwindow/first_last_nth_value_tmpl.go, line 155 at r3 (raw file):

			if partitionCol[i] {
				// Don't break for the start of the current partition.
				if !isPartitionStart || i != startIdx {

nit: I wonder whether it would be cleaner to pull this conditional outside of the loop and break here unconditionally. I'm thinking of something like:

i := startIdx
if isPartitionStart {
  i++
}
for ; i < n; i++ {
  if partitionCol[i] {
    break
  }
}

pkg/sql/colexec/execgen/cmd/execgen/first_last_nth_value_gen.go, line 37 at r3 (raw file):

}

func genFirstValueOp(inputFileContents string, wr io.Writer) error {

nit: these three generator functions are basically the same, and we could define a generator constructor in init() below that would be taking a couple of arguments and returning the correct generator. Take a look at projection_ops_gen.go for an example.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 808 at r1 (raw file):

// timestampRangeCheck should be added at the end of operations that modify and
// return timestamps in order to ensure that the vectorized engine returns the
// same errors as the row engine.

nit: mention t_res local variable.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 810 at r1 (raw file):

// same errors as the row engine.
const timestampRangeCheck = `	
if t_res.After(tree.MaxSupportedTime) || t_res.Before(tree.MinSupportedTime) {

Looks like the row engine also rounds to tree.Microsecond precision first, I think we should do too.


pkg/sql/colexec/execgen/cmd/execgen/range_offset_handler_gen.go, line 306 at r1 (raw file):

}

const dateToTimeCastStr = `

nit: add a quick comment about which "arguments" should this format string be used with.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, also, I'd be curious to see the absolute numbers of the microbenchmarks.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a commit for the vectorized_window testfile.

Here are numbers for first_value, the other two are about the same:

cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
BenchmarkWindowFunctions/FIRST_VALUE/rows=4096/partition=true/order=true-12         	    2079	    534864 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=4096/partition=true/order=false-12        	    2210	    537169 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=4096/partition=false/order=true-12        	    2211	    536847 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=4096/partition=false/order=false-12       	    2221	    534657 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=32768/partition=true/order=true-12        	     280	   4225680 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=32768/partition=true/order=false-12       	     284	   4223189 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=32768/partition=false/order=true-12       	     285	   4222705 ns/op
BenchmarkWindowFunctions/FIRST_VALUE/rows=32768/partition=false/order=false-12      	     283	   4244977 ns/op

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/colexec/colexecwindow/first_last_nth_value_tmpl.go, line 155 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I wonder whether it would be cleaner to pull this conditional outside of the loop and break here unconditionally. I'm thinking of something like:

i := startIdx
if isPartitionStart {
  i++
}
for ; i < n; i++ {
  if partitionCol[i] {
    break
  }
}

Done. I've also pulled out the common logic between lead/lag and first_value/last_value/nth_value.


pkg/sql/colexec/execgen/cmd/execgen/first_last_nth_value_gen.go, line 37 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: these three generator functions are basically the same, and we could define a generator constructor in init() below that would be taking a couple of arguments and returning the correct generator. Take a look at projection_ops_gen.go for an example.

Done.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 808 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: mention t_res local variable.

Done.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 810 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Looks like the row engine also rounds to tree.Microsecond precision first, I think we should do too.

Done.


pkg/sql/colexec/execgen/cmd/execgen/range_offset_handler_gen.go, line 306 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: add a quick comment about which "arguments" should this format string be used with.

Done.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 22 of 22 files at r4, 3 of 3 files at r5, 19 of 19 files at r6, 2 of 2 files at r7.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

@DrewKimball
Copy link
Collaborator Author

TFTR!

This commit adds a range check after additions or subtractions that
produce a timestamp result. This ensures that the vectorized engine
produces the same error as the row engine when the result of the
operation is out of range.

Release note: None
Previously, the `TestWindowFramer` used `math.MaxInt64` as the memory
limit for the `SpillingBuffer` used to buffer all tuples in a partition.
This patch modifies the test to choose a random memory limit and fixes
a bug that could cause `windowFramer` operators to skip a peer group
when seeking to frame start and end indexes in `RANGE` or `GROUPS` mode.

Release note: None
This patch adds vectorized implementations of the `first_value`,
`last_value`, and `nth_value` window functions. These functions
return an expression evaluated at the first, last, and nth positions
respectively in the window frame for the current row.

Fixes cockroachdb#37035

Release note (sql change): first_value, last_value, and nth_value
window functions can now be executed in the vectorized engine. This
allows for faster execution time, and also removes the need for
conversions to and from row format.
This commit adds a logic testfile that executes the window functions
currently supported by the vectorized engine with
`SET vectorize = experimental_always`. This tests that supported
window functions can actually be executed in the vectorized engine.

Release note: None
@DrewKimball
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 20, 2021

Build succeeded:

@craig craig bot merged commit e45a521 into cockroachdb:master Jul 20, 2021
@DrewKimball DrewKimball deleted the pure_window_funcs branch July 20, 2021 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

exec: support all window functions in vectorized engine
3 participants