-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-17733: [C++] Take index_width into account when filling nulls in index buffer #14129
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, and thank you for fixing this!
auto dict_type = dictionary(uint16(), utf8()); | ||
auto dict_one = DictArrayFromJSON(dict_type, "[]", "[]"); | ||
auto dict_two = DictArrayFromJSON(dict_type, "[0, 1, null]", "[\"A0\", \"A1\"]"); | ||
ASSERT_OK_AND_ASSIGN(auto concat_actual, Concatenate({dict_one, dict_two})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it might be worth concatenating with a run of a few nulls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good suggestion, makes sense. Done in commit: 204981d
@@ -539,4 +539,14 @@ TEST_F(ConcatenateTest, OffsetOverflow) { | |||
ASSERT_RAISES(Invalid, Concatenate({fake_long, fake_long}).status()); | |||
} | |||
|
|||
TEST_F(ConcatenateTest, DictionaryConcatenateWithEmptyUint16) { | |||
auto dict_type = dictionary(uint16(), utf8()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment here mentioning the JIRA number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, done now (commit: 8d02222)
Alright, the test failures look to be unrelated to what's going on here/are tracked elsewhere, so this should be good. Thanks! |
Benchmark runs are scheduled for baseline = 6926672 and contender = 68e0fa7. 68e0fa7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
['Python', 'R'] benchmarks have high level of regressions. |
…n index buffer (apache#14129) Take into account index_width when offsetting by position into out_data. Otherwise we offset position bytes into the array, but we want to offset position places into the array. Authored-by: Rasmus Johansen <[email protected]> Signed-off-by: David Li <[email protected]>
…n index buffer (apache#14129) Take into account index_width when offsetting by position into out_data. Otherwise we offset position bytes into the array, but we want to offset position places into the array. Authored-by: Rasmus Johansen <[email protected]> Signed-off-by: David Li <[email protected]>
Take into account index_width when offsetting by position into out_data. Otherwise we offset position bytes into the array, but we want to offset position places into the array.