-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-17479: [Go] Add ArraySpan and utilities #13929
Conversation
@zeroshade Keep in mind that ArraySpan in C++ is merely a performance optimization over ArrayData. That may not make sense or be important for Go. |
@pitrou yup I'm aware. I haven't done the testing yet to prove it (I'll do so when I have more of the compute stuff fleshed out) but in theory it should provide similar optimization in Go by reducing the number of pointers and heap objects due to the statically sized arrays. In addition the semantics are different. ArrayData has an atomic ref count which is managed with a retain and release function. When constructed it automatically calls retain on any buffers and children it was constructed with, etc. ArraySpan doesn't attempt to make any handling of that ref count and isn't considered to own the buffers, and the ArraySpan objects can typically be passed around and copied when needed, adjusting the buffer count for parent buffers only when converted back to an array data at the end. Thus saving all of the retain and release management during kernel execution. |
Just wondering - would ArraySpan be an idiomatic name in Go? I think the C++ name derives from |
@lidavidm that's fair point, maybe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
On second thought, naming it ArraySlice
might be confusing if it doesn't really have the same semantics as a Golang slice, so keeping ArraySpan
is reasonable
Offset int64 | ||
Buffers [3]BufferSpan | ||
|
||
Scratch [2]uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(IIRC, the scratch space is for dealing with offsets right? Might be good to document what this field is for since it's not immediately obvious)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, it's used for offsets and the union type code
Benchmark runs are scheduled for baseline = 8c17925 and contender = 258173d. 258173d is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Relating to the building of the functionality for Compute in Go with Arrow, this is the implementation of ArraySpan / ExecValue / ExecResult etc. It was able to be separated out from the function interface definitions, so I was able to make this PR while apache#13924 is still being reviewed Authored-by: Matt Topol <[email protected]> Signed-off-by: Matt Topol <[email protected]>
Relating to the building of the functionality for Compute in Go with Arrow, this is the implementation of ArraySpan / ExecValue / ExecResult etc.
It was able to be separated out from the function interface definitions, so I was able to make this PR while #13924 is still being reviewed