Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added improvements in to_datetime Error reporting message #47597

Closed
wants to merge 5 commits into from

Conversation

dannyi96
Copy link
Contributor

@dannyi96 dannyi96 commented Jul 5, 2022

  • This PR aims to close 3 related issues -
    The exceptions from to_datetime(errors='raise') could include information about the exception #16757 ,
    BUG: Confusing error message when to_datetime fails for times #46509 ,
    BUG: to_datetime ParserError on incorrect element #47495

  • Changes description

  • Examples of changed error messages

    Code Old Error ( on 1.4.3 ) New Error
    pd.to_datetime(pd.Series(['2016-01-01', 'unparseable']), errors='raise') dateutil.parser._parser.ParserError: Unknown string format: unparseable ValueError: Unable to parse string "unparseable" at position 1
    pd.to_datetime(pd.Series(['2016-01-01', 'unparseable']), format='%Y-%m-%d', errors='raise') ValueError: time data unparseable doesn't match format specified ValueError: time data "unparseable" at position 1 doesn't match format specified
    pd.to_datetime(pd.Series(['2016-01-01', '12']), errors='raise') ValueError: Given date string not likely a datetime. ValueError: Unable to parse string "12" at position 1
    pd.to_datetime('11:0') pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 11:00:00 pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Cannot convert "11:0" at position 0 to datetime
    pd.to_datetime(["today", "yesterday"]) dateutil.parser._parser.ParserError: Unknown string format: today ValueError: Unable to parse string "yesterday" at position 1

To Do

  • add testcases covering above errors once given the initial go ahead. ( am bit new here, was thinking of sending the changes in chunks, instead of going more ahead in wrong direction 🙂 )

@datapythonista datapythonista added Bug Error Reporting Incorrect or improved errors from pandas labels Jul 5, 2022
@datapythonista
Copy link
Member

Thanks for the PR @dannyi96. Can you review the tests please? We usually test for error messages, and those tests are probably failing. Also, can you add a note to the release notes about this? Thank you!

@mroeschke
Copy link
Member

For Given date string not likely a datetime., I think it would be better to change the message where it was raised instead of raising a new exceptions with a new message.

@dannyi96
Copy link
Contributor Author

dannyi96 commented Jul 9, 2022

Thanks for the PR @dannyi96. Can you review the tests please? We usually test for error messages, and those tests are probably failing. Also, can you add a note to the release notes about this? Thank you!
thanks @datapythonista, sure will add this change

@dannyi96
Copy link
Contributor Author

dannyi96 commented Jul 9, 2022

For Given date string not likely a datetime., I think it would be better to change the message where it was raised instead of raising a new exceptions with a new message.

sure thing @mroeschke , one quick question here -
the method which generates this exception doesn't have the context of the index/position, so should we consider

  1. just leaving it to be a basic exception message change to f'Given date string {date_string} not likely a datetime.' and propagate this exception
  2. Pass the index information to this method (as a new parameter ) so we can generate clearer message f'Given date string {date_string} at position {index} not likely a datetime.' and propagate this exception
  3. (current approach) regenerate new exception post above thrown exception to add in the index related info.

@mroeschke
Copy link
Member

For Given date string not likely a datetime., I think it would be better to change the message where it was raised instead of raising a new exceptions with a new message.

sure thing @mroeschke , one quick question here - the method which generates this exception doesn't have the context of the index/position, so should we consider

1. just leaving it to be a basic exception message change to `f'Given date string {date_string} not likely a datetime.'` and propagate this exception

2. Pass the index information to this method (as a new parameter ) so we can generate clearer message `f'Given date string {date_string} at position {index} not likely a datetime.'` and propagate this exception

3. (current approach) regenerate new exception post above thrown exception to add in the index related info.

I think keeping it simple with option 1 is good for now

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you are trying to tackle 3 issues in this PR, but I would recommend splitting this PR into 3, addressing one issue per PR. It will help the review scope greatly.

@dannyi96
Copy link
Contributor Author

sure thing @mroeschke. that would indeed be better.
Will work on splitting the PRs.

@mroeschke
Copy link
Member

sure thing @mroeschke. that would indeed be better. Will work on splitting the PRs.

Thanks for your effort here so far @dannyi96. I'm supportive of the changes here generally. We just need to be a little mindful of exceptions in this part of the code base since there's a lot of "fall back" behavior triggered by exceptions.

@dannyi96
Copy link
Contributor Author

Thanks for your effort here so far @dannyi96. I'm supportive of the changes here generally. We just need to be a little mindful of exceptions in this part of the code base since there's a lot of "fall back" behavior triggered by exceptions.

thanks @mroeschke for the guidance/support 🙂
yes, looks like we need to be bit more careful in this portion of code.

@dannyi96
Copy link
Contributor Author

dannyi96 commented Aug 7, 2022

am closing this as the three issues are being addressed in 3 separate PRs -
#16757 - #47995,
#46509 - #47849,
#47495 - #47860

@dannyi96 dannyi96 closed this Aug 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Error Reporting Incorrect or improved errors from pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants