BUG?: .at not working on object indexes containing some integers #19860

c-thiel · 2018-02-23T08:45:01Z

Version 0.22.0

Problem description

Using the .at - Method on an Index which contains Integers as well as str/objects raises an Error. This used to be possible using the .get_value()-Method. As .at is the designated successor (#15269) the same behaviour should be supported.
I also noticed that .get_value is approx. twice as fast as .at. Is there a specific reason to stick with .at? (see again #15269 for a speed comparison)

Code Sample

import pandas as pd
import numpy as np

data = np.random.randn(10, 5)
df = pd.DataFrame(data, columns=['a', 'b', 'c', 1, 2])
df.at[0, 1]

Raises:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/thielc/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1868, in __getitem__
    key = self._convert_key(key)
  File "/home/thielc/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1915, in _convert_key
    raise ValueError("At based indexing on an non-integer "
ValueError: At based indexing on an non-integer index can only have non-integer indexers

The text was updated successfully, but these errors were encountered:

gfyoung · 2018-02-23T09:11:19Z

@c-thiel : Thanks for reporting this! I'm a little unclear as to what's being reported. Is this a regression from a previous version (you said this used to be possible), or is this a general API inconsistency that you're referring to?

That aside, I do agree that the behavior does look strange.

jorisvandenbossche · 2018-02-23T10:29:13Z

I don't think it is a regression from a previous version (test with 0.16, already broken there), but it is possible with get_value. However, this is deprecated, and we say people should use .at instead. So then .at should work correctly on this as well.

The problem comes from here:

pandas/pandas/core/indexing.py

Lines 1907 to 1916 in 572476f

    
           for ax, i in zip(self.obj.axes, key): 
        
               if ax.is_integer(): 
        
                   if not is_integer(i): 
        
                       raise ValueError("At based indexing on an integer index " 
        
                                        "can only have integer indexers") 
        
               else: 
        
                   if is_integer(i): 
        
                       raise ValueError("At based indexing on an non-integer " 
        
                                        "index can only have non-integer " 
        
                                        "indexers")

where we do this check. But I agree the check looks to strict, as a mixed object index can indeed contain integers as well.

Welcome to try to fix this (eg try removing this check, and see if some tests fail due to that).

jreback · 2018-02-23T11:08:42Z

yeah I think prob ok to remove the else check; this ultimately goes thru .loc so indexing verfication can occur there

jreback · 2018-02-23T11:09:12Z

@c-thiel indexing with mixed dtype indexes is simply going to be slow generally.

c-thiel · 2018-02-23T12:21:11Z

@jorisvandenbossche Yes, this is what I was reffering to.

@jreback : Regarding Performance, at is still much faster than loc, especially when setting values. The setting-performance for single values is the main reason for me using the set_value and get_value functions. But also the get_value, set_value functions are twice as fast as at :

import pandas as pd
import numpy as np
import time

c = ['a', 'b', 'c', 'd', 'e']
data = np.random.rand(10000, 5)
df = pd.DataFrame(data, columns=c)
rows = np.random.randint(0, 9999, (100000,))
columns = np.random.choice(c, (100000,))

t = time.time()
for row, column in zip(rows, columns):
    a = df.get_value(row, column)    
print(f'get_value: {time.time()-t}')

t = time.time()
for row, column in zip(rows, columns):
    a = df.at[row, column]    
print(f'at: {time.time()-t}')

t = time.time()
for row, column in zip(rows, columns):
    a = df.loc[row, column]    
print(f'loc: {time.time()-t}')

t = time.time()
for row, column in zip(rows, columns):
    df.at[row, column] = 4
print(f'set at: {time.time()-t}')

t = time.time()
for row, column in zip(rows, columns):
    df.loc[row, column] = 5
print(f'set loc: {time.time()-t}')

t = time.time()
for row, column in zip(rows, columns):
    df.set_value(row, column, 4)
print(f'set_value: {time.time()-t}')

get_value: 0.257692813873291
at: 0.52744460105896
loc: 0.7349758148193359
set at: 0.687880277633667
set loc: 11.664336204528809
set_value: 0.3008086681365967

jreback · 2018-02-23T12:26:18Z

@c-thiel setting individual values in a loop is non-idiomatic. set_value/get_value were deprecated because they didn't properly handle any edge cases nor had any type safetly whatsoever. Correct is much much better then wrong but slightly faster.

.at incorrectly disallowed the use of integer indexes when a mixed index was used disabled fallback indexing when a mixed index is used

…dev#22436)

gfyoung added the Indexing Related to indexing on series/frames, not to indexes themselves label Feb 23, 2018

jreback added Dtype Conversions Unexpected or buggy dtype conversions Difficulty Intermediate labels Feb 23, 2018

jreback added this to the Next Major Release milestone Feb 23, 2018

mariuspot added a commit to mariuspot/pandas that referenced this issue Aug 21, 2018

BUG pandas-dev#19860 Corrected use of mixed indexes with .at

f47ba8e

.at incorrectly disallowed the use of integer indexes when a mixed index was used disabled fallback indexing when a mixed index is used

mariuspot mentioned this issue Aug 21, 2018

BUG #19860 Corrected use of mixed indexes with .at #22436

Merged

4 tasks

jreback modified the milestones: Contributions Welcome, 0.24.0 Aug 21, 2018

mariuspot added a commit to mariuspot/pandas that referenced this issue Aug 22, 2018

BUG pandas-dev#19860 Corrected use of mixed indexes with .at

1b8579a

.at incorrectly disallowed the use of integer indexes when a mixed index was used disabled fallback indexing when a mixed index is used

mariuspot added a commit to mariuspot/pandas that referenced this issue Aug 22, 2018

BUG pandas-dev#19860 Corrected use of mixed indexes with .at

3f89c39

.at incorrectly disallowed the use of integer indexes when a mixed index was used disabled fallback indexing when a mixed index is used

jreback closed this as completed in #22436 Aug 29, 2018

jreback pushed a commit that referenced this issue Aug 29, 2018

BUG #19860 Corrected use of mixed indexes with .at (#22436)

bf67634

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018

BUG pandas-dev#19860 Corrected use of mixed indexes with .at (pandas-…

5d1af2f

…dev#22436)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG?: .at not working on object indexes containing some integers #19860

BUG?: .at not working on object indexes containing some integers #19860

c-thiel commented Feb 23, 2018 •

edited

Loading

gfyoung commented Feb 23, 2018

jorisvandenbossche commented Feb 23, 2018

jreback commented Feb 23, 2018

jreback commented Feb 23, 2018

c-thiel commented Feb 23, 2018

jreback commented Feb 23, 2018

BUG?: .at not working on object indexes containing some integers #19860

BUG?: .at not working on object indexes containing some integers #19860

Comments

c-thiel commented Feb 23, 2018 • edited Loading

Problem description

Code Sample

gfyoung commented Feb 23, 2018

jorisvandenbossche commented Feb 23, 2018

jreback commented Feb 23, 2018

jreback commented Feb 23, 2018

c-thiel commented Feb 23, 2018

jreback commented Feb 23, 2018

c-thiel commented Feb 23, 2018 •

edited

Loading