CSV and JSON parsing should check for nulls. #4654

Open
Tracked by #2063
revans2 opened this issue Jan 28, 2022 · 0 comments
Labels
task Work required that improves the product but is not user facing

Comments

Collaborator

revans2 commented Jan 28, 2022

Describe the task
When reviewing some code, I noticed that the CSV parser throws an exception if the schema marks a column as non-nullable but the data in that column contains nulls. That worried me, because we do not check for this in our code. However, when I tried to reproduce the issue, Spark marked all of the columns as nullable, regardless of what I requested and of the CSV parsing code. I took a quick look at the Spark code to find where this switch happens, and I could not find it. We should make sure we understand what is happening and that we have all of the cases covered. This is not critical at all, just something I noticed that made me a bit concerned.
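To make the concern concrete, here is a minimal sketch of the kind of check in question, written in plain Python with only the standard library. It is not the plugin's code or Spark's: the `SCHEMA` layout, the `find_null_violations` helper, and the choice to treat an empty field as null are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical schema: (column name, nullable flag). Not the plugin's real types.
SCHEMA = [("id", False), ("name", True)]


def find_null_violations(text, schema, null_value=""):
    """Return (row_number, column_name) pairs where a non-nullable column is null."""
    violations = []
    reader = csv.reader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        for index, (column, nullable) in enumerate(schema):
            # Assumption: a missing trailing field counts as null, like an empty one.
            value = row[index] if index < len(row) else null_value
            if value == null_value and not nullable:
                violations.append((row_number, column))
    return violations


# The second row has an empty "id", which the schema marks as non-nullable.
data = "1,alice\n,bob\n3,\n"
violations = find_null_violations(data, SCHEMA)
```

A real parser could raise on the first violation instead of collecting them. Collecting them here just makes it easier to see which cases a test would need to cover.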

revans2 added the "? - Needs Triage" (Need team to review and classify) and "task" (Work required that improves the product but is not user facing) labels Jan 28, 2022
sameerz removed the "? - Needs Triage" (Need team to review and classify) label Feb 1, 2022
@revans2 revans2 mentioned this issue Oct 27, 2022
38 tasks