[FEA] Support ORC read/write with Fixed-point data #7126

Closed
2 tasks done
razajafri opened this issue Jan 12, 2021 · 5 comments
Labels
cuIO (cuIO issue), feature request (New feature or request), libcudf (Affects libcudf (C++/CUDA) code.), Spark (Functionality that helps Spark RAPIDS)

Comments

@razajafri
Contributor

razajafri commented Jan 12, 2021

Is your feature request related to a problem? Please describe.
Need Decimal support for ORC read/write

Additional context
Issue #6474 seems to have been closed prematurely, because there is still no Decimal type support for ORC.

  • Reader support
  • Writer support
@razajafri added the feature request, Needs Triage, and Spark labels Jan 12, 2021
@kkraus14 added the cuIO and libcudf labels and removed the Needs Triage label Jan 14, 2021
@github-actions

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

@github-actions github-actions bot added the stale label Feb 16, 2021
@sameerz
Contributor

sameerz commented Feb 18, 2021

This is still needed; just got a customer request for this.

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@sameerz
Contributor

sameerz commented Mar 22, 2021

Still needed.

rapids-bot bot pushed a commit that referenced this issue Apr 20, 2021
Added support for Decimal/fixed-point columns in the ORC reader, along with test cases. All decimal columns are read as Decimal64 columns; if the precision is greater than 18, the read fails loudly. This PR also removes a couple of options that are no longer useful after the addition of Decimal support.

#7126

Authors:
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

Approvers:
  - Devavret Makkar (https://github.com/devavret)
  - Vukasin Milovanovic (https://github.com/vuule)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #7970
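
The precision limit of 18 in the reader commit above follows from the range of a signed 64-bit integer: every unscaled value with up to 18 decimal digits fits in an int64, but not every 19-digit value does. Below is a minimal Python sketch of that check. The names `fits_decimal64` and `check_orc_decimal_column` are hypothetical and chosen for illustration; they are not part of the cuDF API, and the actual reader's error handling may differ.

```python
INT64_MAX = 2**63 - 1


def fits_decimal64(precision: int) -> bool:
    """Return True if every unscaled value with `precision` digits fits in int64."""
    largest_unscaled = 10**precision - 1  # e.g. precision 3 -> 999
    return largest_unscaled <= INT64_MAX


def check_orc_decimal_column(name: str, precision: int) -> None:
    """Hypothetical helper: fail loudly rather than silently truncate values."""
    if not fits_decimal64(precision):
        raise ValueError(
            f"column {name!r}: precision {precision} does not fit in a 64-bit decimal"
        )
```

Calling `check_orc_decimal_column("price", 18)` returns quietly, while a precision of 19 raises, which mirrors the "loud" failure described in the commit message.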
rapids-bot bot pushed a commit that referenced this issue May 18, 2021
Closes #8159, #7126

The current implementation uses an array to hold the exact size of each encoded element before the encode step. This simplifies the encoding (each element is encoded independently) and allows streams to be allocated at their exact size instead of the worst-case size. The process differs from that of other types because decimal data streams do not use RLE encoding.

Will add benchmarks once the data generator can produce decimal data.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - Devavret Makkar (https://github.com/devavret)
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

URL: #8198
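
The size-then-encode approach described in the writer commit above can be sketched in Python as a two-pass routine. This is only an illustration under stated assumptions, not the libcudf implementation: it assumes the unscaled decimal values are written as zigzag base-128 varints, it runs sequentially where the GPU writer would process elements in parallel, and all function names are hypothetical.

```python
from itertools import accumulate


def zigzag(value: int) -> int:
    """Map a signed 64-bit integer to an unsigned one so small magnitudes stay small."""
    return (value << 1) ^ (value >> 63)


def varint_size(unsigned: int) -> int:
    """Number of bytes the base-128 varint encoding of `unsigned` occupies."""
    size = 1
    while unsigned >= 0x80:
        unsigned >>= 7
        size += 1
    return size


def encode_varint_into(buf: bytearray, offset: int, unsigned: int) -> None:
    """Write one varint at `offset`; touches only this element's own bytes."""
    while unsigned >= 0x80:
        buf[offset] = (unsigned & 0x7F) | 0x80  # low 7 bits plus continuation flag
        unsigned >>= 7
        offset += 1
    buf[offset] = unsigned


def encode_decimal_stream(unscaled_values):
    # Pass 1: record the exact encoded size of every element.
    sizes = [varint_size(zigzag(v)) for v in unscaled_values]
    # A prefix sum over the sizes gives each element's start offset;
    # the final entry is the exact total stream size.
    offsets = [0] + list(accumulate(sizes))
    buf = bytearray(offsets[-1])  # exact allocation, not worst case
    # Pass 2: each element is encoded independently at its own offset.
    for value, offset in zip(unscaled_values, offsets):
        encode_varint_into(buf, offset, zigzag(value))
    return bytes(buf), sizes
```

Because every element knows its offset before any byte is written, the second pass has no dependency between elements, which is what makes the scheme a natural fit for one-thread-per-element encoding.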
@vuule
Contributor

vuule commented May 19, 2021

Closed with #8198

5 participants