Weird Datetime Behaviour #3002

Closed
justinvf-zz opened this issue Mar 10, 2013 · 6 comments

@justinvf-zz

I am seeing odd behavior where dates are converted to datetime64[ns] values when that is inappropriate.

    In [272]: file = StringIO("xxyyzz20100101PIE\nxxyyzz20100101GUM\nxxyyww20090101EGG\nfoofoo20080909PIE")

    In [273]: df = pd.read_fwf(file, widths=[6,8,3], names=["person_id", "dt", "food"], parse_dates=["dt"])

    In [274]: df
    Out[274]: 
      person_id                  dt food
    0    xxyyzz 2010-01-01 00:00:00  PIE
    1    xxyyzz 2010-01-01 00:00:00  GUM
    2    xxyyww 2009-01-01 00:00:00  EGG
    3    foofoo 2008-09-09 00:00:00  PIE

Everything looks good. However:

    In [275]: df.dt.value_counts()
    Out[275]: 
    1970-01-17 00:00:00    2
    1970-01-18 00:00:00    1
    1970-01-25 08:00:00    1

    In [276]: df.dt.dtype
    Out[276]: dtype('datetime64[ns]')

The values are stored internally as epoch nanoseconds but usually display as the dates I want. Everything gets messed up, though, when we drop down to numpy land for things like value_counts. What is going on? Is this a bug, or am I making a mistake?

    In [280]: pd.__version__
    Out[280]: '0.10.1'

    In [281]: np.__version__
    Out[281]: '1.6.1'

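
For readers unfamiliar with the "epoch nanoseconds" storage mentioned above, here is a minimal standard-library sketch of the idea. It does not use pandas or numpy and does not reproduce the bug; the helper names `to_epoch_ns` and `from_epoch_ns` are hypothetical, made up for illustration. It only shows that counting on the raw integers is harmless as long as the keys are decoded with the same unit they were encoded with, which is what one would expect `value_counts` to do.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
NS_PER_SECOND = 1_000_000_000


def to_epoch_ns(moment):
    """Whole nanoseconds from the Unix epoch to a timezone-aware datetime."""
    delta = moment - EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * NS_PER_SECOND + delta.microseconds * 1_000


def from_epoch_ns(ns):
    """Inverse of to_epoch_ns (datetime only keeps microsecond precision)."""
    return EPOCH + timedelta(microseconds=ns // 1_000)


# The four dates from the fixed-width sample in the report.
dates = [
    datetime(2010, 1, 1, tzinfo=timezone.utc),
    datetime(2010, 1, 1, tzinfo=timezone.utc),
    datetime(2009, 1, 1, tzinfo=timezone.utc),
    datetime(2008, 9, 9, tzinfo=timezone.utc),
]

# Integer representation, analogous to what a datetime64[ns] column stores.
raw = [to_epoch_ns(d) for d in dates]

# Count on the raw integers, then decode the keys using the same unit (ns).
raw_counts = Counter(raw)
decoded_counts = {from_epoch_ns(ns).date(): n for ns, n in raw_counts.items()}

for day, n in sorted(decoded_counts.items(), reverse=True):
    print(day, n)
```

If the decoded keys come back as 2010, 2009 and 2008 dates with the right counts, the round trip is consistent. Dates near 1970, as in the `value_counts` output above, suggest the integers were reinterpreted somewhere along the way, though this sketch makes no claim about where that happened in pandas 0.10.1 with numpy 1.6.1.
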
@wesm
Member

wesm commented Mar 10, 2013

This is a bug that still exists in master. Marking for 0.11.

@jreback
Contributor

jreback commented Mar 10, 2013

@wesm see my comments on the PR; if there is enough interest, we should probably create a TimeDelta64Index (just a fancy version of Int64)

@wesm
Member

wesm commented Mar 10, 2013

Or just TimedeltaIndex I guess. Should have a scalar type analogous to Timestamp also. Not urgent but at some point...

@jreback
Contributor

jreback commented Mar 10, 2013

yep, i had that problem in #2990, ideally return the scalar for min/max (at the end)

@jreback
Contributor

jreback commented Mar 10, 2013

@justinvf this is merged in - pls update to master and try out
thanks for the report

@justinvf-zz
Author

You guys are quick! Thanks so much for the bugfix.
