-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check for case where CSV Row has too many columns #594
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docstrings!
def plugin_result(self, row): | ||
if None in row: | ||
raise DataGenError( | ||
f"Your CSV row has more columns than the CSV header: {row[None]}, {self.datasource}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be helpful to indicate which row had the problem
@*davisagli* : The row[None] stuff outputs the "extra values" so you can
search for them.
…On Fri, Jan 28, 2022 at 7:03 AM David Glick ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In snowfakery/standard_plugins/datasets.py
<#594 (comment)>
:
>
def close(self):
self.results = None
self.file.close()
+ def plugin_result(self, row):
+ if None in row:
+ raise DataGenError(
+ f"Your CSV row has more columns than the CSV header: {row[None]}, {self.datasource}"
Would be helpful to indicate which row had the problem
—
Reply to this email directly, view it on GitHub
<#594 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAFMS7CZVTV4BFZJJU5S2TUYKV3HANCNFSM5M7DNULA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
@prescod Yeah I understand what it does, and it doesn't sound as helpful as a line number if the extra values happen to also appear in legitimate column positions. |
I want to add as few instructions to the golden path as possible. It's a
pretty rare situation. I was already reluctant to add
additional instructions to the golden path to deal with it. You would have
really hated my first solution which was zero overhead but gave even less
context. I think this is a good middle ground between making every user pay
for the error handling and giving enough information for the error case to
be discoverable. If you load your CSV into Sheets or Excel, the extra row
should be visually obvious.
…On Fri, Jan 28, 2022 at 11:35 AM David Glick ***@***.***> wrote:
@prescod <https://github.com/prescod> Yeah I understand what it does, and
it doesn't sound as helpful as a line number if the extra values happen to
also appear in legitimate column positions.
—
Reply to this email directly, view it on GitHub
<#594 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAFMS2KZ5AXPSFKYAU2KB3UYLVYBANCNFSM5M7DNULA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
It's hard to imagine that incrementing a counter would affect performance noticeably, but the concern with performance here seems fair. |
Better error checking for the case where a CSV Row has too many columns.