Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve date/time hint handling for Pandas based on better testing #148

Merged
merged 8 commits into from
Dec 10, 2020

Conversation

vinceatbluelabs
Copy link
Contributor

Based on the new testing approach in #144, this PR adds component tests for date/time types while loading/exporting CSV files in Pandas.

Along the way some places for improvement were found!

Based on the new testing approach in #144, this PR adds component
tests for date/time types while loading/exporting CSV files in Pandas.

Along the way some places for improvement were found!
'YYYY-MM-DD': '%Y-%m-%d',
'MM/DD/YY': '%m/%d/%Y',
'DD/MM/YY': '%d/%m/%Y',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were wrong! %Y is a four digit year, %y is a two digit year.

@@ -39,16 +38,19 @@
# account for MM/DD time.
#
# https://github.com/bluelabsio/records-mover/issues/75
#
python_date_format_from_hints: Dict[Union[HintDateFormat, Literal['DD/MM/YY']], str] = {
python_date_format_from_hints: Dict[HintDateFormat, str] = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is now a str alias, no reason to add more things to it.

else:
cant_handle_hint(fail_if_cant_handle_hint, 'dateformat', hints)
quiet_remove(unhandled_hints, 'dateformat')

# pandas can't seem to export a date and time together :(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't right; it certainly can. What it can't do is export a time without a date, or a date without a time. That's what we have prep_for_csv.py for.

}

python_time_format_from_hints: Dict[HintTimeOnlyFormat, str] = {
'HH24:MI:SS': '%H:%M:%S',
'HH12:MI AM': '%I:%M:%S %p',
'HH:MI:SS': '%H:%M:%S',
'HH12:MI AM': '%I:%M %p',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were adding seconds here against what the user was telling us. We have a similar bug with datetime/datetimetz types - see #143

Once we have more solid tests here we can go back and generalize to_csv_options() and fix those along the way - to fix it with the current design would be basically remove (mostly-working) functionality.

@vinceatbluelabs vinceatbluelabs marked this pull request as ready for review December 9, 2020 22:03
Copy link
Contributor

@cwegrzyn cwegrzyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

records_mover/records/pandas/to_csv_options.py Outdated Show resolved Hide resolved
@vinceatbluelabs vinceatbluelabs merged commit e82ce75 into master Dec 10, 2020
@ryantimjohn ryantimjohn deleted the loosen_date_requirements_4 branch December 17, 2022 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants