BUG: displayed dtype of series inferred from shown subset instead of series #11594

Closed
2 tasks
jorisvandenbossche opened this issue Nov 13, 2015 · 5 comments
Labels
Bug Master Tracker High level tracker for similar issues Output-Formatting __repr__ of pandas objects, to_string
@jorisvandenbossche
Member

All the same issue

In [1]: import datetime
   ...: import pandas as pd

In [2]: s = pd.Series([datetime.datetime(2012, 1, 1)]*10 + [datetime.datetime(1012,1,2)] + [datetime.datetime(2012, 1, 3)]*10)

In [8]: s
Out[8]:
0     2012-01-01 00:00:00
1     2012-01-01 00:00:00
2     2012-01-01 00:00:00
3     2012-01-01 00:00:00
4     2012-01-01 00:00:00
5     2012-01-01 00:00:00
6     2012-01-01 00:00:00
7     2012-01-01 00:00:00
8     2012-01-01 00:00:00
9     2012-01-01 00:00:00
10    1012-01-02 00:00:00
11    2012-01-03 00:00:00
12    2012-01-03 00:00:00
13    2012-01-03 00:00:00
14    2012-01-03 00:00:00
15    2012-01-03 00:00:00
16    2012-01-03 00:00:00
17    2012-01-03 00:00:00
18    2012-01-03 00:00:00
19    2012-01-03 00:00:00
20    2012-01-03 00:00:00
dtype: object

In [9]: pd.options.display.max_rows = 8

In [10]: s
Out[10]:
0    2012-01-01
1    2012-01-01
2    2012-01-01
3    2012-01-01
        ...
17   2012-01-03
18   2012-01-03
19   2012-01-03
20   2012-01-03
dtype: datetime64[ns]

In [11]: s.dtype
Out[11]: dtype('O')

So in some cases you think you have a datetime64 series when you actually don't, and e.g. the dt properties don't work, which can lead to quite some confusion ... :-)
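A check that does not depend on the repr can make this visible. The sketch below is only an illustration, not pandas' own behaviour: `looks_datetime` and `dt_accessor_works` are hypothetical helper names, and `dtype=object` is passed explicitly (an assumption) so that the result does not depend on how a particular pandas version infers out-of-range datetimes.

```python
import datetime

import pandas as pd

values = (
    [datetime.datetime(2012, 1, 1)] * 10
    + [datetime.datetime(1012, 1, 2)]  # far outside the datetime64[ns] range
    + [datetime.datetime(2012, 1, 3)] * 10
)
# Assumption: force object dtype so the example is independent of
# version-specific dtype inference.
s = pd.Series(values, dtype=object)


def looks_datetime(series):
    """Hypothetical helper: True only if the stored dtype is datetime64-like."""
    return pd.api.types.is_datetime64_any_dtype(series.dtype)


def dt_accessor_works(series):
    """Hypothetical helper: report whether the .dt accessor can be used."""
    try:
        series.dt
    except AttributeError:
        return False
    return True


print(s.dtype, looks_datetime(s), dt_accessor_works(s))
```

Checking `s.dtype` (or a helper like the ones above) right after loading data sidesteps whatever the truncated display happens to show.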

@jorisvandenbossche jorisvandenbossche added Bug Output-Formatting __repr__ of pandas objects, to_string labels Nov 13, 2015
@jreback
Contributor

jreback commented Nov 13, 2015

you have an out-of-range date which forces it back to object.

if you had done this via pd.to_datetime it would raise.

In [4]: s = pd.to_datetime([datetime.datetime(2012, 1, 1)]*10 + [datetime.datetime(1012,1,2)] + [datetime.datetime(2012, 1, 3)]*10)
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1012-01-02 00:00:00
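To find which values would trigger that error before converting, one option is to compare each value against the datetime64[ns] limits. This is a minimal sketch, assuming the inputs are plain `datetime.datetime` objects. `find_out_of_ns_range` is a hypothetical helper, and the bounds are deliberately rounded inward (the exact limits fall on 1677-09-21 and 2262-04-11).

```python
import datetime

# Conservative bounds just inside the datetime64[ns] range (rounded inward).
NS_LOWER = datetime.datetime(1677, 9, 22)
NS_UPPER = datetime.datetime(2262, 4, 11)


def find_out_of_ns_range(values):
    """Hypothetical helper: positions of datetimes outside the conservative bounds."""
    return [
        i for i, value in enumerate(values)
        if not (NS_LOWER <= value <= NS_UPPER)
    ]


values = (
    [datetime.datetime(2012, 1, 1)] * 10
    + [datetime.datetime(1012, 1, 2)]
    + [datetime.datetime(2012, 1, 3)] * 10
)
print(find_out_of_ns_range(values))  # [10]
```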

@jreback
Contributor

jreback commented Nov 13, 2015

you want something like this maybe?

AttributeError: Can only use .dt accessor with datetimelike values, you have [dtype->object]

@jorisvandenbossche
Member Author

Well, yes, that is the point: I don't have a datetime64 series (because of the out-of-range datetime), but the series display indicates that I do.
Which makes it a bit difficult to see that something went wrong in my datetime conversion.

I know that I can convert it, but this is just about the display. The actual case where I ran into it was some Excel data that apparently contained some wrong values, but I didn't spot this directly because the display told me everything was fine.

@jreback
Contributor

jreback commented Nov 13, 2015

@jorisvandenbossche ahh I see, we should be displaying the original dtype, not that of the sliced pieces (and we shouldn't be inferring on those), but I think we may be doing that automatically....
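One way to guard against this kind of mismatch is a small check that compares the dtype printed in the repr footer with the stored dtype while truncation is active. This is only a sketch: `repr_footer_matches_dtype` is a hypothetical helper, and it assumes the repr of a non-empty Series ends with a `dtype: ...` footer.

```python
import pandas as pd


def repr_footer_matches_dtype(series, max_rows=8):
    """Hypothetical helper: does the (possibly truncated) repr report the stored dtype?"""
    # Temporarily lower display.max_rows so that long series are truncated.
    with pd.option_context("display.max_rows", max_rows):
        footer = repr(series).splitlines()[-1]
    return footer.endswith(f"dtype: {series.dtype}")


print(repr_footer_matches_dtype(pd.Series(range(100))))
```

A `False` result for any series would mean the display is reporting a dtype that the data does not actually have.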

@jreback
Contributor

jreback commented Feb 23, 2016

So .iloc with a list is re-inferring the dtype when slicing, while [[]] is not. hmm. I think we should not be re-inferring here. The .concat will follow naturally if we don't reinfer.

In [2]: s = pd.Series([pd.Timedelta(1,unit='s'),'foo'])

In [3]: s
Out[3]: 
0    0 days 00:00:01
1                foo
dtype: object

In [5]: s[[0]]
Out[5]: 
0    0 days 00:00:01
dtype: object

In [6]: s[[1]]
Out[6]: 
1    foo
dtype: object

In [7]: s.iloc[[0]]
Out[7]: 
0   00:00:01
dtype: timedelta64[ns]

In [8]: s.iloc[[1]]
Out[8]: 
1    foo
dtype: object
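The comparison above can be wrapped in a small check that records the dtype each indexing path returns. This is a sketch only: `indexing_dtypes` is a hypothetical helper, it assumes a default integer index so that `[[...]]` and `.iloc[[...]]` select the same rows, and what it reports for `.iloc` depends on the pandas version in use.

```python
import pandas as pd


def indexing_dtypes(series, positions):
    """Hypothetical helper: dtype names returned by [[...]] and .iloc[[...]]."""
    return {
        "original": str(series.dtype),
        "getitem": str(series[positions].dtype),
        "iloc": str(series.iloc[positions].dtype),
    }


s = pd.Series([pd.Timedelta(1, unit="s"), "foo"])
for pos in range(len(s)):
    print(pos, indexing_dtypes(s, [pos]))
```

If no re-inference happens, all three entries agree for every position.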
