[FEA] Support an incoming schema for ORC writing #8443

Closed
vuule opened this issue Jun 4, 2021 · 1 comment
Assignees: vuule
Labels: cuIO (cuIO issue), feature request (New feature or request), Python (Affects Python cuDF API)

Comments

@vuule
Contributor

vuule commented Jun 4, 2021

Based on question #6816

Possible implementation: reuse the nested metadata type in the ORC writer.

Depends on #7640, #7830

vuule added the feature request, Python, and cuIO labels on Jun 4, 2021
beckernick added this to the IO Data Type Expansion milestone on Jul 16, 2021
vuule self-assigned this on Aug 26, 2021
rapids-bot pushed a commit that referenced this issue on Sep 22, 2021:
Fixes #7830, #8443

Features:
- Use the new table metadata type that matches the table hierarchy, `table_input_metadata`.
- Support struct columns in the writer.
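
To illustrate what "metadata that matches the table hierarchy" means, here is a minimal sketch in plain Python. It does not use the cuDF API; `ColumnMeta` and `leaf_paths` are hypothetical names invented for this example. The only point is that a struct column's metadata node carries one child node per field, so names can be assigned at every nesting level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnMeta:
    """Hypothetical metadata node: one per column, with one child per nested field."""
    name: str
    children: List["ColumnMeta"] = field(default_factory=list)


def leaf_paths(meta: ColumnMeta, prefix: str = "") -> List[str]:
    """Walk the hierarchy depth-first and return the dotted path of each leaf column."""
    path = f"{prefix}.{meta.name}" if prefix else meta.name
    if not meta.children:
        return [path]
    paths: List[str] = []
    for child in meta.children:
        paths.extend(leaf_paths(child, path))
    return paths


# A table with a flat column and a struct column that nests another struct.
table_meta = [
    ColumnMeta("id"),
    ColumnMeta("address", [
        ColumnMeta("street"),
        ColumnMeta("geo", [ColumnMeta("lat"), ColumnMeta("lon")]),
    ]),
]

paths = [p for col in table_meta for p in leaf_paths(col)]
print(paths)
```

A flat list of column names cannot express the `address.geo.lat` level, which is why a nested metadata type is needed once struct columns are written.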

Changes:
- Null masks are now encoded in aligned rowgroups, so that every rowgroup except the last one in each stripe encodes a number of rows divisible by 8. This avoids invalid bits in the encoded null mask. The change also affects list columns. The issue is equivalent to #6763 (boolean columns only).
- Added pushdown masks that are used to determine which child elements should not be encoded, including null mask bits.
- Use pushdown masks for rowgroup alignment, null mask encoding and value encoding.
- Separated null mask encoding from value encoding; the null mask encoding can later be moved to a separate kernel call.
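
The interaction between pushdown masks and rowgroup alignment can be sketched in plain Python. This is an illustration under stated assumptions, not the writer's implementation: the function names are hypothetical, the mask is a simple list of booleans, and boundaries are aligned by extending a rowgroup into the next one, which may differ from what the actual kernels do.

```python
from typing import List, Tuple


def align_rowgroups(encode_mask: List[bool], rowgroup_size: int) -> List[Tuple[int, int]]:
    """Return [start, end) row ranges such that every rowgroup except the last
    encodes a multiple of 8 elements. Assumption: a rowgroup is aligned by
    taking rows from the start of the following rowgroup."""
    bounds = []
    n = len(encode_mask)
    start = 0
    while start < n:
        end = min(start + rowgroup_size, n)
        encoded = sum(encode_mask[start:end])
        # Extend the rowgroup until its encoded element count fills whole bytes.
        while end < n and encoded % 8 != 0:
            encoded += encode_mask[end]
            end += 1
        bounds.append((start, end))
        start = end
    return bounds


def encoded_null_bits(child_valid: List[bool], pushdown: List[bool]) -> List[bool]:
    """Keep the child's validity bits only where the pushdown mask allows encoding."""
    return [valid for valid, keep in zip(child_valid, pushdown) if keep]


# Parent struct column: 20 rows, rows 3 and 7 are null.
parent_valid = [i not in (3, 7) for i in range(20)]

# The parent encodes a null mask bit for every row.
parent_bounds = align_rowgroups([True] * 20, rowgroup_size=10)

# The child skips elements under null parents, so its pushdown mask is the parent's validity.
child_bounds = align_rowgroups(parent_valid, rowgroup_size=10)

child_valid = [i != 5 for i in range(20)]
child_bits = encoded_null_bits(child_valid, parent_valid)

print(parent_bounds, child_bounds, len(child_bits))
```

In this example the parent's first rowgroup grows from 10 to 16 rows to reach a byte boundary, while the child's first rowgroup already encodes exactly 8 elements and stays at 10 rows. That is why the alignment has to be computed from the pushdown mask rather than from the nominal row count.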

Breaking because the table metadata type has changed.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Jason Lowe (https://github.com/jlowe)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Robert (Bobby) Evans (https://github.com/revans2)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Devavret Makkar (https://github.com/devavret)
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

URL: #9025
@vuule
Contributor Author

vuule commented Sep 23, 2021

Should have been closed by #9025

@vuule vuule closed this as completed Sep 23, 2021