[FEA] Cover struct columns in fuzz testing #7618

Closed
2 tasks
vuule opened this issue Mar 16, 2021 · 6 comments · Fixed by #9180
Labels: cuIO, feature request, tests

Comments

@vuule
Contributor

vuule commented Mar 16, 2021

The cuIO fuzz tests currently cannot generate input with struct columns. cudf now supports this type of data for Parquet IO, so coverage should be expanded to include columns of struct data type.

Ideally, the generated columns should cover the following:

  • Variable levels of nesting;
  • Combinations of nested structs, lists, and non-nested data types.
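
To make the two goals above concrete, here is a minimal sketch of a random schema generator that mixes structs, lists, and leaf types up to a bounded nesting depth. It is pure Python and only illustrative: `LEAF_TYPES`, `random_schema`, and `depth` are hypothetical names, and the dict-based schema format is an assumption, not the format used by the actual fuzz-test data generator.

```python
import random

# Hypothetical leaf type names; the real data generator may use different ones.
LEAF_TYPES = ["int64", "float64", "bool", "string"]


def random_schema(rng, max_depth, max_children=3):
    """Return a random type descriptor with at most `max_depth` nested levels.

    Leaves are plain strings, lists are {"list": child}, and structs are
    {"struct": {field_name: child, ...}}.
    """
    if max_depth == 0:
        return rng.choice(LEAF_TYPES)
    kind = rng.choice(["leaf", "list", "struct"])
    if kind == "leaf":
        return rng.choice(LEAF_TYPES)
    if kind == "list":
        return {"list": random_schema(rng, max_depth - 1, max_children)}
    # Structs always get at least one field here; empty structs are a
    # separate corner case.
    num_fields = rng.randint(1, max_children)
    return {
        "struct": {
            f"f{i}": random_schema(rng, max_depth - 1, max_children)
            for i in range(num_fields)
        }
    }


def depth(schema):
    """Count the nested (list or struct) levels in a schema."""
    if isinstance(schema, str):
        return 0
    if "list" in schema:
        return 1 + depth(schema["list"])
    return 1 + max(depth(child) for child in schema["struct"].values())


# A seeded generator keeps any failure reproducible.
rng = random.Random(0)
schemas = [random_schema(rng, max_depth=4) for _ in range(100)]
```

Seeding the generator matters for fuzzing: a schema that triggers a failure can be regenerated from the seed alone.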
@vuule added the feature request and Needs Triage labels on Mar 16, 2021
@vuule removed the Needs Triage label on Mar 16, 2021
@vuule added the tests label on Mar 16, 2021
@vuule
Contributor Author

vuule commented Mar 16, 2021

CC @devavret and @nvdbaranec to add any corner cases that need to be covered.

@devavret
Contributor

  1. One of the children is strings
  2. Columns have an offset (sliced column)
  3. Many permutations/combinations of nullability across levels (every level nullable, no level nullable, etc.)
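
The corner cases above could be enumerated roughly as follows. This is a sketch in plain Python, not the fuzz-test implementation: `nullability_combinations` and `apply_offset` are hypothetical helpers, and a struct column is modeled here as a list of dicts purely for illustration.

```python
import itertools


def nullability_combinations(num_levels):
    """Return every nullable/non-nullable assignment for `num_levels` levels.

    Entry i of each tuple says whether nesting level i may contain nulls.
    """
    return list(itertools.product([False, True], repeat=num_levels))


def apply_offset(rows, offset, length):
    """Model a sliced column: keep `length` rows starting at `offset`."""
    return rows[offset:offset + length]


# A struct column with a string child, modeled as a list of dicts.
# None at the top level is a null struct; None inside is a null child value.
rows = [
    {"name": "a", "id": 1},
    None,
    {"name": None, "id": 3},
    {"name": "", "id": 4},
]

combos = nullability_combinations(3)
sliced = apply_offset(rows, offset=1, length=2)
```

For three nesting levels this gives eight combinations, from "no level nullable" to "every level nullable". The number doubles with each extra level, so a fuzzer would likely sample from the set at greater depths instead of exhausting it.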

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@vuule
Contributor Author

vuule commented Nov 4, 2021

What is missing before we can close this one?

@nvdbaranec
Contributor

I would add: Many permutations of empty things. Empty strings, empty lists, empty structs. These tend to expose slightly different problems than just null values (particularly with strings, where various child columns can be nullptr, etc).
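
As a rough illustration of the distinction between empty and null, the sketch below lists a few nested empty values next to a check that separates them from nulls. The names `nested_empty_cases` and `is_empty_not_null` are hypothetical, and the values are plain Python stand-ins for column data, not cudf objects.

```python
def nested_empty_cases():
    """Return sample values that are empty at some level but never null."""
    return [
        "",             # empty string
        [],             # empty list
        {},             # empty struct (no fields)
        [[]],           # list holding one empty list
        [""],           # list holding one empty string
        [{}],           # list holding one empty struct
        {"s": ""},      # struct with an empty string child
        {"l": []},      # struct with an empty list child
        {"inner": {}},  # struct with an empty struct child
    ]


def is_empty_not_null(value):
    """True for an empty string, list, or struct; False for None (null)."""
    return value is not None and len(value) == 0


cases = nested_empty_cases()
top_level_empty = [c for c in cases if is_empty_not_null(c)]
```

Only the first three cases are empty at the top level. The rest are non-empty containers whose children are empty, which is the kind of input that exercises child-column handling.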

@galipremsagar
Contributor

galipremsagar commented Nov 8, 2021

#9180 has already started paying dividends for ORC struct support fuzz-testing: #9395, #9179

I would like to verify the nested empty type cases that @nvdbaranec pointed out, and there is some cleanup to be done, hence moving this to 22.02.

rapids-bot bot pushed a commit that referenced this issue Jan 20, 2022
Resolves: #7618 

This PR adds struct dtype support in data-generator for fuzz-testing.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

URL: #9180

4 participants