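Is your feature request related to a problem?
I wish I could use pandas to map back and forth between tz-aware timestamps and floats.
This has been raised before in #16842; however, it was argued that the existing behaviour did not need altering. I'd like to plead my case below.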
Describe the solution you'd like
Currently, if the origin parameter of pd.to_datetime is tz-aware, an exception is raised: ValueError: origin offset ..... must be tz-naive
I believe if a tz-aware origin is used then the result of pd.to_datetime should be a tz-aware timestamp.
API breaking implications
I believe an implementation is possible that does not break the API: the timezone would be extracted from the origin parameter (it could be None) and used to localize other timestamps in the code where necessary.
Describe alternatives you've considered
I have considered writing my own to_datetime that accepts timezone-aware origins, or perhaps using tz_convert as an intermediary in the mapping; however, I would still argue that the default behaviour of to_datetime is not ideal.
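As a rough illustration of the first alternative, a wrapper could look something like the sketch below. The name to_datetime_tz_origin is hypothetical and not part of pandas. The sketch assumes scalar or list-like numeric input (not a Series): it moves the origin to UTC, strips the timezone so that pd.to_datetime accepts it, and converts the result back to the origin's timezone.

```python
import pandas as pd


def to_datetime_tz_origin(values, unit, origin):
    """Hypothetical helper (not a pandas API): numeric offsets from a
    possibly tz-aware origin -> timestamps in the origin's timezone."""
    origin = pd.Timestamp(origin)
    if origin.tz is None:
        # tz-naive origin: defer to the existing pandas behaviour
        return pd.to_datetime(values, unit=unit, origin=origin)
    # Express the origin as a tz-naive UTC wall time, which to_datetime accepts
    naive_utc_origin = origin.tz_convert("UTC").tz_localize(None)
    result = pd.to_datetime(values, unit=unit, origin=naive_utc_origin)
    # The offsets were applied in UTC, so label the result as UTC and
    # convert it back to the origin's timezone
    return result.tz_localize("UTC").tz_convert(origin.tz)


origin = pd.Timestamp("2020-10-04", tz="Australia/Sydney")
print(to_datetime_tz_origin(2, unit="h", origin=origin))
```

Because the offset is applied in UTC, two hours past a Sydney-midnight origin on 2020-10-04 comes out as 03:00 at +11:00 rather than the non-existent 02:00.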
Additional context
Consider the following code. For context, in the Australia/Sydney timezone clocks were wound forward an hour at 2am on the 4th of October 2020.
```python
import pandas as pd
import pytz

tz = pytz.timezone('Australia/Sydney')
origin = pd.Timestamp('2020-10-04', tz=tz)
test_date_1 = pd.Timestamp('2020-10-04 1:00', tz=tz)
test_date_2 = pd.Timestamp('2020-10-04 3:00', tz=tz)

# Elapsed hours between the origin and each test date
print(f'There is {(test_date_1-origin).total_seconds()/3600} hours from origin to {test_date_1}')
print(f'There is {(test_date_2-origin).total_seconds()/3600} hours from origin to {test_date_2}')
```
The code correctly calculates the time deltas, with the origin set to the start of the day:
There is 1.0 hours from origin to 2020-10-04 01:00:00+10:00
There is 2.0 hours from origin to 2020-10-04 03:00:00+11:00
How can I map the number 2 back to test_date_2?
This code
```python
pd.to_datetime(2, unit='h', origin=origin)
```
produces
ValueError: origin offset 2020-10-04 00:00:00+10:00 must be tz-naive
Passing a tz-naive origin instead produces a tz-naive Timestamp: Timestamp('2020-10-04 02:00:00')
Trying to localize it of course does not work; it raises an exception:
NonExistentTimeError: 2020-10-04 02:00:00
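To make the problem with the tz-naive route concrete, here is a minimal sketch. It assumes the tz-naive result above came from passing the origin with its timezone stripped. The helper names via_naive and via_timedelta are hypothetical. The nonexistent='shift_forward' option of tz_localize avoids the exception, but the naive route still adds the offset in wall-clock time rather than elapsed time, so the two routes disagree once the offset crosses the DST change.

```python
import pandas as pd

tz = "Australia/Sydney"
origin = pd.Timestamp("2020-10-04", tz=tz)


def via_naive(hours):
    # Assumed workaround: tz-naive origin, then localize the result.
    # shift_forward moves a non-existent wall time to the next valid one.
    naive = pd.to_datetime(hours, unit="h", origin=origin.tz_localize(None))
    return naive.tz_localize(tz, nonexistent="shift_forward")


def via_timedelta(hours):
    # Elapsed-time arithmetic directly on the tz-aware origin
    return origin + pd.to_timedelta(hours, unit="h")


for hours in (1, 2, 3):
    print(hours, via_naive(hours), via_timedelta(hours))
```

For 1 and 2 hours the two routes agree, but for 3 hours the naive route gives 03:00 at +11:00 while elapsed-time arithmetic gives 04:00 at +11:00, so localizing after the fact is not a general substitute for a tz-aware origin.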
The answer to "What time is 2 hours past midnight on the 4th of October 2020 in Sydney, Australia?" is not ambiguous. It has an answer, and I believe pandas should be able to accommodate this.