ARROW-17479: [Go] Add ArraySpan and utilities #13929

Merged
merged 4 commits into apache:master from arrow-17479-arrayspan
Aug 22, 2022

Conversation

zeroshade
Member

As part of building out Compute functionality for Arrow in Go, this PR implements ArraySpan, ExecValue, ExecResult, and related utilities.

This work could be separated from the function interface definitions, so I opened this PR while #13924 is still under review.

@pitrou
Member

pitrou commented Aug 20, 2022

@zeroshade Keep in mind that ArraySpan in C++ is merely a performance optimization over ArrayData. That may not make sense or be important for Go.

@pitrou
Member

pitrou commented Aug 20, 2022

(also, I don't think you need to tag @amol- or @kszucs as reviewers here)

@zeroshade
Member Author

@pitrou yup, I'm aware. I haven't done the testing to prove it yet (I'll do so once more of the compute code is fleshed out), but in theory it should provide a similar optimization in Go: the statically sized arrays reduce the number of pointers and heap objects. In addition, the semantics are different.

ArrayData has an atomic ref count that is managed with retain and release functions. When constructed, it automatically calls retain on any buffers and children it was constructed with. ArraySpan does not manage that ref count and is not considered to own its buffers, so ArraySpan objects can typically be passed around and copied as needed. The ref counts of the parent buffers are adjusted only when the span is converted back to an ArrayData at the end.

This avoids all of the retain and release management during kernel execution.
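To make the ownership difference concrete, here is a minimal Go sketch. Every type and function in it (Buffer, ArrayData, BufferSpan, ArraySpan, SpanOf, ToArrayData) is a simplified stand-in written for illustration, not the definitions from this PR. The Owner field in particular is an assumption about how a span could remember which buffer to retain later.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Buffer is a hypothetical ref-counted buffer (a stand-in, not the real type).
type Buffer struct {
	refs int64
	data []byte
}

func NewBuffer(data []byte) *Buffer { return &Buffer{refs: 1, data: data} }
func (b *Buffer) Retain()           { atomic.AddInt64(&b.refs, 1) }
func (b *Buffer) Release()          { atomic.AddInt64(&b.refs, -1) }
func (b *Buffer) Refs() int64       { return atomic.LoadInt64(&b.refs) }

// ArrayData owns its buffers: it retains them on construction and
// releases them when its own count drops to zero.
type ArrayData struct {
	refs    int64
	length  int
	buffers []*Buffer
}

func NewArrayData(length int, buffers []*Buffer) *ArrayData {
	for _, b := range buffers {
		if b != nil {
			b.Retain()
		}
	}
	return &ArrayData{refs: 1, length: length, buffers: buffers}
}

func (d *ArrayData) Release() {
	if atomic.AddInt64(&d.refs, -1) == 0 {
		for _, b := range d.buffers {
			if b != nil {
				b.Release()
			}
		}
	}
}

// BufferSpan is a non-owning view. Owner (an assumption of this sketch)
// only records which Buffer to retain if the span becomes ArrayData again.
type BufferSpan struct {
	Buf   []byte
	Owner *Buffer
}

// ArraySpan keeps its buffers in a fixed-size array, so copying it is a
// plain value copy that touches no ref counts.
type ArraySpan struct {
	Len     int
	Buffers [3]BufferSpan
}

// SpanOf views d without retaining anything.
func SpanOf(d *ArrayData) ArraySpan {
	s := ArraySpan{Len: d.length}
	for i, b := range d.buffers {
		if i < len(s.Buffers) && b != nil {
			s.Buffers[i] = BufferSpan{Buf: b.data, Owner: b}
		}
	}
	return s
}

// ToArrayData is the only place in this sketch where a span causes
// ref counts to change: NewArrayData retains each non-nil owner.
func (s ArraySpan) ToArrayData() *ArrayData {
	bufs := make([]*Buffer, len(s.Buffers))
	for i, bs := range s.Buffers {
		bufs[i] = bs.Owner
	}
	return NewArrayData(s.Len, bufs)
}

func main() {
	buf := NewBuffer([]byte{1, 2, 3, 4})
	data := NewArrayData(4, []*Buffer{nil, buf})
	span := SpanOf(data)
	passedAround := span    // struct copy, no retain
	fmt.Println(buf.Refs()) // 2: the creator plus data
	out := passedAround.ToArrayData()
	fmt.Println(buf.Refs()) // 3: out retained it
	out.Release()
	data.Release()
	fmt.Println(buf.Refs()) // 1: only the creator's reference remains
}
```

In this sketch a kernel could copy and pass the span any number of times between SpanOf and ToArrayData without a single atomic operation, which is the saving described above.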

@lidavidm
Member

Just wondering - would ArraySpan be an idiomatic name in Go? I think the C++ name derives from std::span

@zeroshade
Member Author

@lidavidm that's a fair point, maybe ArrayView? Or ArraySlice?

Member

@lidavidm lidavidm left a comment

LGTM.

On second thought, naming it ArraySlice might be confusing if it doesn't really have the same semantics as a Golang slice, so keeping ArraySpan is reasonable

Offset int64
Buffers [3]BufferSpan

Scratch [2]uint64
Member

(IIRC, the scratch space is for dealing with offsets, right? It might be good to document what this field is for, since it's not immediately obvious.)

Member Author

yea, it's used for offsets and the union type code
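As a rough illustration of how inline scratch space can hold offsets, the sketch below wraps a single string value in a span without allocating an offsets buffer. The struct is reduced to the fields quoted above, BufferSpan is cut down to a bare byte slice, and FillFromString and ValueAt are hypothetical helpers, not functions from this PR. It uses unsafe.Slice, so it assumes Go 1.17 or newer.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// BufferSpan is reduced to a bare non-owning byte view for this sketch.
type BufferSpan struct {
	Buf []byte
}

// ArraySpan mirrors only the fields quoted in the review above.
type ArraySpan struct {
	Len     int64
	Offset  int64
	Buffers [3]BufferSpan

	// Scratch is 16 bytes of inline storage. Here it holds the two
	// 32-bit offsets needed to describe a single string value.
	Scratch [2]uint64
}

// FillFromString (hypothetical) makes s a length-1 string span over val.
// The offsets buffer points into s.Scratch, so s must not be copied
// afterwards: a copy's Buffers[1] would still refer to the original struct.
func FillFromString(s *ArraySpan, val []byte) {
	scratch := unsafe.Slice((*byte)(unsafe.Pointer(&s.Scratch[0])), 16)
	binary.LittleEndian.PutUint32(scratch[0:4], 0)                // start offset
	binary.LittleEndian.PutUint32(scratch[4:8], uint32(len(val))) // end offset
	s.Len, s.Offset = 1, 0
	s.Buffers[1] = BufferSpan{Buf: scratch[:8]} // offsets live in Scratch
	s.Buffers[2] = BufferSpan{Buf: val}         // value bytes are not copied
}

// ValueAt (hypothetical) reads element i through the offsets buffer.
func ValueAt(s *ArraySpan, i int) []byte {
	offs := s.Buffers[1].Buf
	start := binary.LittleEndian.Uint32(offs[4*i:])
	end := binary.LittleEndian.Uint32(offs[4*(i+1):])
	return s.Buffers[2].Buf[start:end]
}

func main() {
	var span ArraySpan
	FillFromString(&span, []byte("hello"))
	fmt.Println(string(ValueAt(&span, 0)))
}
```

The sketch writes the offsets in little-endian order only to keep the read and write sides consistent; how the real code lays out the scratch bytes is not shown in this conversation.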

@zeroshade zeroshade merged commit 258173d into apache:master Aug 22, 2022
@zeroshade zeroshade deleted the arrow-17479-arrayspan branch August 22, 2022 15:01
@ursabot

ursabot commented Aug 22, 2022

Benchmark runs are scheduled for baseline = 8c17925 and contender = 258173d. 258173d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.48% ⬆️0.24%] test-mac-arm
[Failed ⬇️0.82% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.53% ⬆️0.04%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 258173da ec2-t3-xlarge-us-east-2
[Failed] 258173da test-mac-arm
[Failed] 258173da ursa-i9-9960x
[Finished] 258173da ursa-thinkcentre-m75q
[Finished] 8c17925a ec2-t3-xlarge-us-east-2
[Failed] 8c17925a test-mac-arm
[Failed] 8c17925a ursa-i9-9960x
[Finished] 8c17925a ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

zagto pushed a commit to zagto/arrow that referenced this pull request Oct 7, 2022
Authored-by: Matt Topol <[email protected]>
Signed-off-by: Matt Topol <[email protected]>