ERR: disallow direct tz setting #3746

Closed
nipunbatra opened this issue Jun 3, 2013 · 9 comments · Fixed by #20510
Labels
Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves Timezones Timezone data dtype
Milestone

Comments

@nipunbatra
Contributor

related to #3714
tz should not be settable

df=pd.DataFrame({"events":np.ones(len(dates))},index=motor["timestamps"].astype("datetime64[s]"))
df.describe()
       events
count    2393
mean        1
std         0
min         1
25%         1
50%         1
75%         1
max         1
df

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2393 entries, 2013-05-30 11:58:36 to 2013-06-03 13:30:43
Data columns (total 1 columns):
events    2393  non-null values
dtypes: float64(1)


df.index.tz='Asia/Kolkata'
df.index

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-30 17:28:36, ..., 2013-06-03 19:00:43]
Length: 2393, Freq: None, Timezone: Asia/Kolkata

df2=df[df.index.day==3]

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-39-23de8e1402be> in <module>()
----> 1 df2=df[df.index.day==3]

/usr/local/lib/python2.7/dist-packages/pandas/tseries/index.pyc in f(self)
 39             utc = _utc()
 40             if self.tz is not utc:
---> 41                 values = self._local_timestamps()
 42         return tslib.get_date_field(values, field)
 43     f.__name__ = name

/usr/local/lib/python2.7/dist-packages/pandas/tseries/index.pyc in     _local_timestamps(self)
399             values = self.asi8
400             indexer = values.argsort()
--> 401             result = tslib.tz_convert(values.take(indexer), utc, self.tz)
402 
403             n = len(indexer)

/usr/local/lib/python2.7/dist-packages/pandas/tslib.so in pandas.tslib.tz_convert     (pandas/tslib.c:20949)()

/usr/local/lib/python2.7/dist-packages/pandas/tslib.so in pandas.tslib._get_deltas (pandas/tslib.c:22754)()

/usr/local/lib/python2.7/dist-packages/pandas/tslib.so in pandas.tslib._get_utcoffset (pandas/tslib.c:11327)()

AttributeError: 'str' object has no attribute 'utcoffset'

As a temporary fix, I added the seconds to the data manually while creating the index.

@jreback
Contributor

jreback commented Jun 3, 2013

Create a new index with the tz set at the beginning.

This kind of assignment is not allowed (and should raise).

In [43]: df = DataFrame(randn(10,2),index=date_range('20130101',periods=10,tz='Asia/Kolkata'),columns=list('AB'))

In [44]: df
Out[44]: 
                                  A         B
2013-01-01 00:00:00+05:30  0.202436  0.409555
2013-01-02 00:00:00+05:30  0.537623  0.304266
2013-01-03 00:00:00+05:30  1.936195 -0.337695
2013-01-04 00:00:00+05:30  1.839265 -0.741898
2013-01-05 00:00:00+05:30 -0.126482 -0.599313
2013-01-06 00:00:00+05:30  0.228828 -0.252229
2013-01-07 00:00:00+05:30  0.909610 -0.728261
2013-01-08 00:00:00+05:30  0.487048 -0.056313
2013-01-09 00:00:00+05:30  0.582852  0.838708
2013-01-10 00:00:00+05:30 -1.220794  0.233096

In [45]: df.index
Out[45]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 00:00:00, ..., 2013-01-10 00:00:00]
Length: 10, Freq: D, Timezone: Asia/Kolkata

In [46]: df[df.index.day==3]
Out[46]: 
                                  A         B
2013-01-03 00:00:00+05:30  1.936195 -0.337695

@nipunbatra
Contributor Author

What would be the best way to create an index from epoch timestamps, as I am doing, and add tz information?

@jreback
Contributor

jreback commented Jun 3, 2013

pd.to_datetime(stamps_in_ns,tz=your_tz)

Make sure that your stamps are in nanoseconds (for epoch seconds, probably just multiply by 1e9).

Timestamp will take the nanoseconds and give you the correct timestamp.

In [6]: Series([Timestamp('20130101')])
Out[6]: 
0   2013-01-01 00:00:00
dtype: datetime64[ns]

In [7]: Series([Timestamp('20130101')]).astype('int')
Out[7]: 
0    1356998400000000000
dtype: int64
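
A hedged sketch of the seconds-to-nanoseconds step, assuming a recent pandas release: the sample values are taken from later in this thread, and the `unit="s"` route is an alternative not mentioned in the comment above. Integer multiplication is used here instead of `1e9` to avoid a float round trip.

```python
import numpy as np
import pandas as pd

# Epoch timestamps in whole seconds (sample values from later in this thread).
t = np.array([1370445222, 1370445223, 1370445224], dtype="int64")

# Option 1: scale to nanoseconds with integer arithmetic; bare integers are
# interpreted by to_datetime as nanoseconds since the epoch.
t_ns = t * 10**9
from_ns = pd.to_datetime(t_ns)

# Option 2: leave the values in seconds and tell pandas the unit.
from_s = pd.to_datetime(t, unit="s")

print(from_ns[0])
print(bool((from_ns == from_s).all()))
```

Both routes produce timezone-naive timestamps; attaching a timezone is a separate step.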

@nipunbatra
Contributor Author

Specifying the timezone failed for me (version 0.11.0).

In [10]: t
Out[10]: 
array([1370445222, 1370445223, 1370445224, 1370445225, 1370445226,
    1370445227, 1370445228, 1370445229, 1370445230, 1370445231])

In [9]: pd.to_datetime((t*1e9).astype(int),tz='Asia/Kolkata')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-ee97c24c5e36> in <module>()
----> 1 pd.to_datetime((t*1e9).astype(int),tz='Asia/Kolkata')

TypeError: to_datetime() got an unexpected keyword argument 'tz'

@jreback
Contributor

jreback commented Jun 5, 2013

@nipunreddevil

From the cross-post:

In [9]: pd.to_datetime((t*1e9).astype(int)).tz_localize('Asia/Kolkata')
Out[9]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-05 15:13:42, ..., 2013-06-05 15:13:51]
Length: 10, Freq: None, Timezone: Asia/Kolkata

@nipunbatra
Contributor Author

Note that the timezone changes, but the time does not.

In [11]: pd.to_datetime((t*1e9).astype(int)).tz_localize('Asia/Kolkata')
Out[11]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-05 15:13:42, ..., 2013-06-05 15:13:51]
Length: 10, Freq: None, Timezone: Asia/Kolkata

In [12]: pd.to_datetime((t*1e9).astype(int)).tz_localize('UTC')
Out[12]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-05 15:13:42, ..., 2013-06-05 15:13:51]
Length: 10, Freq: None, Timezone: UTC

In [13]: pd.to_datetime((t*1e9).astype(int)).tz_localize('EST')
Out[13]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-05 15:13:42, ..., 2013-06-05 15:13:51]
Length: 10, Freq: None, Timezone: EST

Contrast this with

In [15]: datetime.datetime.fromtimestamp(t[0])
Out[15]: datetime.datetime(2013, 6, 5, 20, 43, 42)

@jreback
Contributor

jreback commented Jun 5, 2013

This is why we need a method to do this.
The epoch stamps are in UTC, but after to_datetime they are not in any timezone,
so you first have to put them in UTC (tz_localize) and then convert (tz_convert).

In [5]: pd.to_datetime((t*1e9).astype(int)).tz_localize('UTC').tz_convert('Asia/Kolkata')
Out[5]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-05 20:43:42, ..., 2013-06-05 20:43:51]
Length: 10, Freq: None, Timezone: Asia/Kolkata
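
The difference between the two calls can be sketched on a single naive timestamp, assuming a recent pandas release. `tz_localize` keeps the wall-clock time and attaches a zone, while `tz_convert` keeps the instant and changes the wall-clock time.

```python
import pandas as pd

# A timezone-naive index, as returned by to_datetime on raw epoch values.
naive = pd.DatetimeIndex(["2013-06-05 15:13:42"])

# tz_localize: same wall-clock reading, now labelled as Asia/Kolkata.
localized = naive.tz_localize("Asia/Kolkata")

# tz_localize("UTC") then tz_convert: same instant, shown in Asia/Kolkata time.
converted = naive.tz_localize("UTC").tz_convert("Asia/Kolkata")

print(localized[0])  # 2013-06-05 15:13:42+05:30
print(converted[0])  # 2013-06-05 20:43:42+05:30
```

Only the second form matches `datetime.datetime.fromtimestamp` on a machine set to Indian time, because epoch values count from the UTC epoch.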

@nipunbatra
Contributor Author

Great!

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Mar 30, 2014
@jreback jreback modified the milestones: 0.16.0, 0.17.0 Jan 26, 2015
@jreback jreback modified the milestones: 0.18.1, Next Major Release Mar 1, 2016
@jreback
Contributor

jreback commented Apr 25, 2016

Hmm, we should prevent this from being set. Changing this issue to an ERR one.

In [21]: df = DataFrame(randn(10,2),index=date_range('20130101',periods=10,tz='Asia/Kolkata'),columns=list('AB'))

In [22]: df
Out[22]: 
                                  A         B
2013-01-01 00:00:00+05:30  1.106337  0.257936
2013-01-02 00:00:00+05:30 -0.789945 -0.267006
2013-01-03 00:00:00+05:30 -1.002179 -1.844458
2013-01-04 00:00:00+05:30  0.638277  0.125499
2013-01-05 00:00:00+05:30 -1.733083  0.447643
2013-01-06 00:00:00+05:30  0.782098  0.906288
2013-01-07 00:00:00+05:30 -0.797574  0.418644
2013-01-08 00:00:00+05:30  1.550464 -0.957223
2013-01-09 00:00:00+05:30 -0.303808 -0.239765
2013-01-10 00:00:00+05:30  0.086044  0.485118

In [23]: df.index.tz
Out[23]: <DstTzInfo 'Asia/Kolkata' LMT+5:53:00 STD>

In [24]: df.index.tz = pytz.timezone('US/Eastern')

In [25]: df
Out[25]: 
                                  A         B
2012-12-31 13:30:00-05:00  1.106337  0.257936
2013-01-01 13:30:00-05:00 -0.789945 -0.267006
2013-01-02 13:30:00-05:00 -1.002179 -1.844458
2013-01-03 13:30:00-05:00  0.638277  0.125499
2013-01-04 13:30:00-05:00 -1.733083  0.447643
2013-01-05 13:30:00-05:00  0.782098  0.906288
2013-01-06 13:30:00-05:00 -0.797574  0.418644
2013-01-07 13:30:00-05:00  1.550464 -0.957223
2013-01-08 13:30:00-05:00 -0.303808 -0.239765
2013-01-09 13:30:00-05:00  0.086044  0.485118
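
A minimal sketch of the behaviour this issue asks for, assuming a pandas release that includes the fix linked above (#20510), where direct assignment to `tz` is expected to raise `AttributeError`. On older releases the assignment may succeed silently, so the sketch records which case occurred instead of assuming one.

```python
import pandas as pd

idx = pd.date_range("2013-01-01", periods=3, tz="Asia/Kolkata")

# Direct assignment: expected to be rejected on releases that include the fix.
try:
    idx.tz = "US/Eastern"
    assignment_raised = False
except AttributeError as exc:
    assignment_raised = True
    print(f"direct assignment rejected: {exc}")

# Supported route: tz_convert returns a new index and leaves the original untouched.
eastern = idx.tz_convert("US/Eastern")
print(eastern[0])
```

`tz_convert` yields the same wall-clock values as the silently converted frame shown above (for example `2012-12-31 13:30:00-05:00` for the first row), but does so explicitly and without mutating `idx`.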

@jreback jreback modified the milestones: 0.18.2, 0.18.1 Apr 25, 2016
@jreback jreback added Error Reporting Incorrect or improved errors from pandas Difficulty Intermediate and removed Bug Enhancement labels Apr 25, 2016
@jreback jreback changed the title ENH: allow tz setting, indexing fails after specifying timezone ERR: disallow direct tz setting Apr 25, 2016
@jreback jreback modified the milestones: 0.18.2, Next Major Release Jul 6, 2016
@jreback jreback removed this from the 0.18.2 milestone Jul 6, 2016
@jreback jreback modified the milestones: Next Major Release, 0.23.0 Mar 28, 2018
2 participants