ENH: Allow tz-aware origin parameter in pandas.to_datetime #37482

Closed
venaturum opened this issue Oct 29, 2020 · 2 comments
Labels
Enhancement Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@venaturum
Contributor

Is your feature request related to a problem?

I wish I could use pandas to map back and forth between tz-aware timestamps and floats.

This has been raised before in #16842. There it was argued that the existing behaviour did not need altering, but I'd like to plead my case below.

Describe the solution you'd like

Currently, if the origin parameter in pd.to_datetime is tz-aware, an exception is raised:
ValueError: origin offset ..... must be tz-naive

I believe that if a tz-aware origin is used, the result of pd.to_datetime should be a tz-aware timestamp.

API breaking implications

I believe an implementation is possible that does not break the API: the timezone is extracted from the origin parameter (it may be None) and used to localize the other timestamps in the code where necessary.
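A minimal sketch of that idea, written as a user-level wrapper rather than a change to pandas itself. The name `to_datetime_tz_origin` is hypothetical and not part of pandas. It assumes the offset should be counted in elapsed (absolute) time: the origin's timezone is extracted, the arithmetic is done against a tz-naive UTC origin, and the result is converted back to the origin's timezone.

```python
import pandas as pd


def to_datetime_tz_origin(values, unit, origin):
    """Hypothetical helper (not a pandas API): to_datetime with an
    origin that may be tz-aware or tz-naive."""
    origin = pd.Timestamp(origin)
    tz = origin.tz  # None for a tz-naive origin

    if tz is None:
        # Existing behaviour is unchanged for tz-naive origins.
        return pd.to_datetime(values, unit=unit, origin=origin)

    # Express the origin as a tz-naive UTC wall time, which to_datetime accepts.
    naive_utc_origin = origin.tz_convert("UTC").tz_localize(None)
    result = pd.to_datetime(values, unit=unit, origin=naive_utc_origin)

    # The result is a UTC wall time; label it as UTC, then convert back.
    return result.tz_localize("UTC").tz_convert(tz)


origin = pd.Timestamp("2020-10-04", tz="Australia/Sydney")
two_hours_later = to_datetime_tz_origin(2, unit="h", origin=origin)
print(two_hours_later)
```

Because the arithmetic happens in UTC, the daylight-saving transition never has to be localized directly, so no nonexistent local time is produced.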

Describe alternatives you've considered

I have considered writing my own to_datetime that accepts timezone-aware origins, or using tz_convert as an intermediate step in the mapping. However, I would still argue that the default behaviour of to_datetime is not ideal.

Additional context

Consider the following code. For context, in the Australia/Sydney timezone clocks were wound forward an hour at 2am on the 4th of October 2020.

import pandas as pd
import pytz

tz = pytz.timezone('Australia/Sydney')
origin = pd.Timestamp('2020-10-04', tz=tz)
test_date_1 = pd.Timestamp('2020-10-04 1:00', tz=tz)
test_date_2 = pd.Timestamp('2020-10-04 3:00', tz=tz)

print(f'There is {(test_date_1 - origin).total_seconds()/3600} hours from origin to {test_date_1}')
print(f'There is {(test_date_2 - origin).total_seconds()/3600} hours from origin to {test_date_2}')

The code will correctly calculate the time delta, with an origin set to the start of the day:
There is 1.0 hours from origin to 2020-10-04 01:00:00+10:00
There is 2.0 hours from origin to 2020-10-04 03:00:00+11:00

How can I map the number 2 back to test_date_2?

This code

pd.to_datetime(2, unit='h', origin=origin)

produces
ValueError: origin offset 2020-10-04 00:00:00+10:00 must be tz-naive

This code

pd.to_datetime(2, unit='h', origin=origin.tz_localize(None))

produces a tz-naive Timestamp: Timestamp('2020-10-04 02:00:00')

Trying to localize it afterwards does not work either, because 02:00 does not exist on that date in Sydney, so an exception is raised:

pd.to_datetime(2, unit='h', origin=origin.tz_localize(None)).tz_localize(tz)

NonExistentTimeError: 2020-10-04 02:00:00
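As a further illustration (a sketch, not something from the original report): the same naive-then-localize approach can also fail silently. Assuming an offset of 3 hours, the naive result 03:00 does exist in Sydney on that date, so tz_localize succeeds, but the localized timestamp is only 2 hours of elapsed time after the origin.

```python
import pandas as pd

tz = "Australia/Sydney"
origin = pd.Timestamp("2020-10-04", tz=tz)

# Add 3 hours to the tz-naive origin, then localize the wall-clock result.
naive = pd.to_datetime(3, unit="h", origin=origin.tz_localize(None))
localized = naive.tz_localize(tz)  # 03:00 exists on this date, so no exception

# Elapsed time between the origin and the localized result, in hours.
elapsed_hours = (localized - origin) / pd.Timedelta(hours=1)
print(localized, elapsed_hours)
```

The wall clock skipped from 02:00 to 03:00, so counting hours on a naive clock and localizing afterwards does not recover the intended instant.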

The question "What time is 2 hours past midnight on the 4th of October 2020 in Sydney, Australia?" is not ambiguous. It has an answer, and I believe pandas should be able to accommodate this.

@venaturum venaturum added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 29, 2020
@Liam3851
Contributor

For your use case ("what time is 2 hours past midnight on the 4th of October 2020 in Sydney, Australia"), why not, instead of using to_datetime with origin:

origin = pd.to_datetime('2020-10-04').tz_localize('Australia/Sydney')
pd.to_datetime(2, unit='h', origin=origin.tz_localize(None))

just use timedelta:

origin = pd.to_datetime('2020-10-04').tz_localize('Australia/Sydney')
pd.to_timedelta(2, 'h') + origin
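A short sketch of how this timedelta approach covers the round trip between floats and tz-aware timestamps asked for in the issue. The sample values in `hours` are chosen here for illustration and are not from the thread.

```python
import pandas as pd

origin = pd.Timestamp("2020-10-04", tz="Australia/Sydney")
hours = [1.0, 2.0, 2.5]  # illustrative offsets from the origin, in hours

# floats -> tz-aware timestamps (arithmetic is in elapsed time)
stamps = pd.to_timedelta(hours, unit="h") + origin

# tz-aware timestamps -> floats
recovered = (stamps - origin) / pd.Timedelta(hours=1)

print(stamps)
print(list(recovered))
```

Adding a timedelta to a tz-aware timestamp moves along absolute time, so the offsets on either side of the daylight-saving transition map to valid local times and back to the original floats.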

@venaturum
Contributor Author

@Liam3851 Thanks mate, that'll solve my problem, and a much better workaround than I was looking at. Sometimes you can't see the forest for the trees!

Dev team, I'll let you decide if you want to pursue this enhancement. No complaints if it is closed.
