int64 Index in 0.17 typcasts '0' string to integer #11836

Closed
AlbertDeFusco opened this issue Dec 13, 2015 · 2 comments
Labels
Bug · Dtype Conversions (unexpected or buggy dtype conversions) · Indexing (related to indexing on series/frames, not to indexes themselves)
Milestone

Comments

@AlbertDeFusco

Here's what I got. Is this expected behavior? I could not find a reference for this functionality in the release notes.

0.16

conda create -n pd pandas=0.16 python=3.4 ipython

In 0.16 the dtype of the index changes to object when adding a row with '0'.

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: data = np.random.random(10)
In [4]: m=pd.Series(data)
In [5]: m[0]=0.4444
In [6]: m.index
Out[6]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')
In [7]: m['0']=0.5555
In [8]: m.index
Out[8]: Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, '0'], dtype='object')

0.17.1

conda create -n pd pandas=0.17 python=3.4 ipython

In 0.17.1 the index dtype does not change; instead, the string '0' is typecast to the integer 0. Repeated assignment with the '0' key appends more 0s to the index.

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: data = np.random.random(10)
In [4]: m=pd.Series(data)
In [5]: m[0]=0.4444
In [6]: m.index
Out[6]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')
In [7]: m['0']=0.5555
In [8]: m.index
Out[8]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0], dtype='int64')
In [9]: m['0']=0.6666
In [10]: m.index
Out[10]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0], dtype='int64')
In [11]: m[0]
Out[11]: 
0    0.4444
0    0.5555
0    0.6666
dtype: float64
@jreback
Contributor

jreback commented Dec 13, 2015

I suppose this should work, to be consistent. It might be a bit tricky, though. This is a failing of Index.insert, where the dtype of the inserted element is inferred from the dtype of the current index. You almost always want to do this, because it will raise if the element is not compatible. The exception is when a string version happens to be DIRECTLY convertible (in this case to a numpy array).
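The failure mode described here can be modelled without pandas. The sketch below is only a toy illustration, not the actual Index.insert code: `insert_with_coercion`, `insert_with_fallback`, and `dtype_of` are hypothetical helpers, plain Python lists stand in for the index, and `int()` stands in for the NumPy conversion.

```python
def insert_with_coercion(labels, label):
    """Toy model of the 0.17.1 behaviour: convert the new label to the
    index's integer type before appending (hypothetical helper)."""
    return labels + [int(label)]


def insert_with_fallback(labels, label):
    """Toy model of the 0.16 behaviour: append the label unchanged
    (hypothetical helper)."""
    return labels + [label]


def dtype_of(labels):
    """Report 'int64' if every label is a plain int, otherwise 'object'."""
    all_ints = all(isinstance(x, int) and not isinstance(x, bool) for x in labels)
    return "int64" if all_ints else "object"


index = [0, 1, 2]

# An incompatible label raises, which is usually what you want.
try:
    insert_with_coercion(index, "abc")
except ValueError as exc:
    print("rejected:", exc)

# An integer-like string is converted silently: the index gains a second 0.
coerced = insert_with_coercion(index, "0")
print(coerced, dtype_of(coerced))

# Keeping the label as given changes the dtype to object instead.
kept = insert_with_fallback(index, "0")
print(kept, dtype_of(kept))
```

The point of the sketch is that coercing to the existing type doubles as validation, so it cannot tell a deliberately string-typed key apart from an integer written as text.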

@jreback added the Bug, Indexing, Dtype Conversions, and Difficulty Intermediate labels Dec 13, 2015
@jreback added this to the Next Major Release milestone Dec 13, 2015
@DavidMertz

It's worse still though, because m[0]=123 will modify an existing row, but m["0"]=123 will add more rows. So even in the crazy world of PHP-style type casting, the behavior is different depending on the type of the thing that gets cast.
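A rough way to see why the two keys behave differently: the lookup compares the raw key, while the insertion path coerces it. The class below is a hypothetical minimal model written for this thread, not pandas code, and it assumes that missing keys are appended after an `int()` conversion.

```python
class ToySeries:
    """Hypothetical minimal model of a Series with an integer index."""

    def __init__(self, values):
        self.values = list(values)
        self.index = list(range(len(self.values)))

    def __setitem__(self, key, value):
        # Lookup uses the raw key, so "0" does not match the integer 0.
        positions = [i for i, label in enumerate(self.index) if label == key]
        if positions:
            for i in positions:
                self.values[i] = value
        else:
            # Insertion coerces the key to the index's integer type.
            self.index.append(int(key))
            self.values.append(value)

    def __getitem__(self, key):
        # Return every value stored under a matching label.
        return [v for label, v in zip(self.index, self.values) if label == key]


m = ToySeries([0.1, 0.2, 0.3])
m[0] = 0.4444      # integer key is found: existing row is modified
m["0"] = 0.5555    # string key is not found: a new row labelled 0 is appended
m["0"] = 0.6666    # still not found as a string: another row labelled 0
print(m.index)
print(m[0])
```

In this model the string key never matches on lookup, so every assignment with it takes the insertion path, which mirrors the duplicated 0 labels in the 0.17.1 transcript.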

@jreback modified the milestones: 0.18.0, Next Major Release Feb 6, 2016
jreback added a commit to jreback/pandas that referenced this issue Feb 12, 2016
…nal setting,

and raise a TypeError, xref pandas-dev#4892

BUG: index type coercion when setting with an integer-like

closes pandas-dev#11836