Add column_device_view to orc writer #7676
Add column_device_view members
…rc-column-device-view
Codecov Report
@@             Coverage Diff              @@
##           branch-0.19    #7676   +/-   ##
===============================================
+ Coverage        81.86%   82.53%   +0.66%
===============================================
  Files              101      101
  Lines            16884    17453    +569
===============================================
+ Hits             13822    14404    +582
+ Misses            3062     3049     -13
…rc-column-device-view
Big fan of the changes. Just needs more span and to add missing docs. And the span use can be pushed to another PR IMO.
Have you run the benchmarks?
gpuInitDictionaryIndices(DictionaryChunk *chunks,
                         const table_device_view view,
                         uint32_t *dict_data,
                         uint32_t *dict_index,
                         size_t row_index_stride,
                         size_type *str_col_ids,
                         uint32_t num_columns)
These should (almost) all be spans/2dspans, but we might want to leave this for another PR, given the urgency of this PR.
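To illustrate the suggestion above, here is a minimal host-side sketch of the difference between a pointer-plus-count parameter pair (like `str_col_ids` and `num_columns`) and a single span parameter. The `simple_span` type and both helper functions are hypothetical and written only for this example; they are not the span types or kernels used in the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical minimal span: a non-owning pointer plus a length.
// Illustration only; not the span type used by the ORC writer.
template <typename T>
struct simple_span {
  T* data_;
  std::size_t size_;
  simple_span(T* data, std::size_t size) : data_(data), size_(size) {}
  T& operator[](std::size_t i) const { return data_[i]; }
  std::size_t size() const { return size_; }
};

// Pointer-plus-count style: the caller must keep the pointer and the
// count consistent, and nothing in the signature ties them together.
inline std::int64_t sum_ids_raw(const std::int32_t* str_col_ids, std::uint32_t num_columns)
{
  std::int64_t total = 0;
  for (std::uint32_t i = 0; i < num_columns; ++i) { total += str_col_ids[i]; }
  return total;
}

// Span style: the length travels with the pointer in one argument.
inline std::int64_t sum_ids_span(simple_span<const std::int32_t> str_col_ids)
{
  std::int64_t total = 0;
  for (std::size_t i = 0; i < str_col_ids.size(); ++i) { total += str_col_ids[i]; }
  return total;
}
```

Both functions compute the same result; the span version simply removes one parameter and the chance of passing a mismatched length.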
Performance impact is captured here.
I have removed the link to that issue for now, but it talks only about removing nvstrdesc_s in the writers, which has been done in this PR.
@devavret to clarify the issue. My interpretation is that all nvstrdesc_s uses should be replaced.
That is true. It just might need one more PR. So it could be an "addresses" here.
rerun tests
@gpucibot merge
In PR #7676, the length of the current string was always set to 0 when incrementing the dictionary character count of a StripeDictionary while building stripe dictionaries. This led to corrupted strings when dictionary encoding was used, as noted in issue #7741. This has been fixed in this PR.

Fixes #7741

Authors:
- Kumar Aatish (@kaatish)

Approvers:
- Vukasin Milovanovic (@vuule)
- Nghia Truong (@ttnghia)

URL: #7744
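The class of bug described in the follow-up fix can be sketched on the host as follows. This is an assumption-laden illustration, not the actual kernel code: `dict_stats`, `count_chars_buggy`, and `count_chars_fixed` are hypothetical names, and the real implementation runs on the GPU over device column views.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bookkeeping for one dictionary: entry count and total characters.
struct dict_stats {
  std::size_t num_strings = 0;
  std::size_t char_count  = 0;
};

// Buggy variant: the per-string length is never read from the entry,
// so the accumulated character count stays at 0.
inline dict_stats count_chars_buggy(const std::vector<std::string>& entries)
{
  dict_stats stats;
  for (const auto& entry : entries) {
    (void)entry;          // the entry is ignored, which is the bug
    std::size_t len = 0;  // length is always 0
    stats.char_count += len;
    ++stats.num_strings;
  }
  return stats;
}

// Fixed variant: the length is taken from the current string.
inline dict_stats count_chars_fixed(const std::vector<std::string>& entries)
{
  dict_stats stats;
  for (const auto& entry : entries) {
    stats.char_count += entry.size();
    ++stats.num_strings;
  }
  return stats;
}
```

With a character count of 0, any buffer sized or offset from that count cannot hold the dictionary's string data, which is consistent with the corrupted strings reported in the issue.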
This PR adds column_device_view members to EncChunk, DictionaryChunk and StripeDictionary structures which are used in the ORC writer. The idea is to replace members in these structures which replicate the same information. Usage of nvstrdesc_s has also been eliminated in the ORC writer.
Fixes #7347, Addresses #5682, Addresses #7334
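As a rough sketch of the idea in the description, the following shows a chunk structure that duplicates column information next to one that refers to a column view instead. All types and field names here are hypothetical and host-only; the actual `EncChunk`, `DictionaryChunk`, and `StripeDictionary` layouts are not reproduced.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a device column view: it owns nothing and
// only describes where a column's data and validity live.
struct column_view_sketch {
  const int* data;
  const bool* valid;
  std::size_t size;
};

// "Before" (hypothetical): the chunk carries its own copies of pointers
// that the column already provides.
struct chunk_before {
  const int* column_data;
  const bool* valid_map;
  std::size_t start_row;
  std::size_t num_rows;
};

// "After" (hypothetical): the chunk keeps a reference to the column view
// and only the information that is specific to the chunk.
struct chunk_after {
  const column_view_sketch* column;
  std::size_t start_row;
  std::size_t num_rows;
};

// Count the valid rows covered by a chunk, reading validity through the view.
inline std::size_t count_valid(const chunk_after& chunk)
{
  std::size_t valid_rows = 0;
  for (std::size_t i = 0; i < chunk.num_rows; ++i) {
    std::size_t row = chunk.start_row + i;
    if (row < chunk.column->size && chunk.column->valid[row]) { ++valid_rows; }
  }
  return valid_rows;
}
```

The design point is that column-level facts live in one place, so a chunk cannot drift out of sync with the column it describes.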