BUG: Cannot index frozenset elements from a pd.Series or pd.DataFrame #35747

Closed
3 tasks
jolespin opened this issue Aug 15, 2020 · 6 comments · Fixed by #36147
Labels
Indexing Related to indexing on series/frames, not to indexes themselves Regression Functionality that used to work in a prior pandas version


jolespin commented Aug 15, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pandas as pd

print(pd.__version__)
# 1.1.0

# Create DataFrame
data = {0: {frozenset({'Otu000010', 'Otu000505'}): 'white',
  frozenset({'Otu000067', 'Otu000073'}): 'white',
  frozenset({'Otu000151', 'molar'}): 'white',
  frozenset({'Otu000380', 'etec_stp'}): 'white',
  frozenset({'Otu000281', 'ghrp'}): 'white'},
 14: {frozenset({'Otu000010', 'Otu000505'}): 'white',
  frozenset({'Otu000067', 'Otu000073'}): 'white',
  frozenset({'Otu000151', 'molar'}): 'white',
  frozenset({'Otu000380', 'etec_stp'}): 'white',
  frozenset({'Otu000281', 'ghrp'}): 'white'},
 28: {frozenset({'Otu000010', 'Otu000505'}): 'red',
  frozenset({'Otu000067', 'Otu000073'}): 'white',
  frozenset({'Otu000151', 'molar'}): 'blue',
  frozenset({'Otu000380', 'etec_stp'}): 'white',
  frozenset({'Otu000281', 'ghrp'}): 'white'}}
df = pd.DataFrame(data)

# Get first item in index
id_query = df.index[0]

# Grab a column
vector = df[14]

# Index the vector using query
vector[id_query]


# ---------------------------------------------------------------------------
# KeyError                                  Traceback (most recent call last)
# <ipython-input-145-493fc06aeb40> in <module>
#      24 
#      25 # Index the vector using query
# ---> 26 vector[id_query]

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/series.py in __getitem__(self, key)
#     906             return self._get_values(key)
#     907 
# --> 908         return self._get_with(key)
#     909 
#     910     def _get_with(self, key):

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/series.py in _get_with(self, key)
#     946 
#     947         # handle the dup indexing case GH#4246
# --> 948         return self.loc[key]
#     949 
#     950     def _get_values_tuple(self, key):

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in __getitem__(self, key)
#     877 
#     878             maybe_callable = com.apply_if_callable(key, self.obj)
# --> 879             return self._getitem_axis(maybe_callable, axis=axis)
#     880 
#     881     def _is_scalar_access(self, key: Tuple):

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
#    1097                     raise ValueError("Cannot index with multidimensional key")
#    1098 
# -> 1099                 return self._getitem_iterable(key, axis=axis)
#    1100 
#    1101             # nested tuple slicing

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
#    1035 
#    1036         # A collection of keys
# -> 1037         keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
#    1038         return self.obj._reindex_with_indexers(
#    1039             {axis: [keyarr, indexer]}, copy=True, allow_dups=True

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
#    1252             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
#    1253 
# -> 1254         self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
#    1255         return keyarr, indexer
#    1256 

# ~/anaconda3/envs/soothsayer5_env/lib/python3.8/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
#    1296             if missing == len(indexer):
#    1297                 axis_name = self.obj._get_axis_name(axis)
# -> 1298                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
#    1299 
#    1300             # We (temporarily) allow for some missing keys with .loc, except in

# KeyError: "None of [Index(['Otu000010', 'Otu000505'], dtype='object')] are in the [index]"
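The KeyError message lists the two strings inside the frozenset rather than the frozenset itself, and the traceback passes through `_getitem_iterable` ("A collection of keys"). That suggests the key was treated as a collection of labels instead of a single label. A minimal standard-library sketch of why a frozenset is ambiguous in this way (it does not reproduce pandas internals, and the variable names are illustrative only):

```python
from collections.abc import Hashable, Iterable

key = frozenset({"Otu000010", "Otu000505"})

# A frozenset is hashable, so it can act as one dictionary-style label ...
is_label = isinstance(key, Hashable)

# ... but it is also iterable, so it can be unpacked into its members.
is_collection = isinstance(key, Iterable)
unpacked = sorted(key)

print(is_label, is_collection, unpacked)
```

Unpacking the key yields exactly the two strings shown in the error message, neither of which is an index label on its own.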

Problem description

In previous versions, I was able to use frozenset objects as the elements of the index. These are great objects to have for network analysis, where I use them as edges in my pd.Series and pd.DataFrame objects.

Expected Output

I should be able to index using these objects.
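As a rough illustration of the expected result, the sketch below looks up the label's position with `Index.get_loc` and then selects by position with `.iloc`, which avoids passing the frozenset to `[]` or `.loc`. The two-element Series and the names `edge_a`, `edge_b`, and `position` are made up for this example, and it has not been verified against pandas 1.1.0 specifically.

```python
import pandas as pd

# Two hypothetical edges, each stored as a frozenset label
edge_a = frozenset({"Otu000010", "Otu000505"})
edge_b = frozenset({"Otu000067", "Otu000073"})
vector = pd.Series(["red", "white"], index=[edge_a, edge_b])

# Treat the frozenset as a single hashable label: find its position,
# then select the value at that position.
position = vector.index.get_loc(edge_a)
value = vector.iloc[position]
print(value)
```

This only covers scalar lookups on a unique index; with duplicate labels `get_loc` can return a slice or boolean mask instead of an integer.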

Output of pd.show_versions()

pandas v1.1.0

@jolespin jolespin added the Bug and Needs Triage labels Aug 15, 2020
@dsaxton
Member

dsaxton commented Aug 16, 2020

Thanks @jolespin, might be the same underlying issue as #35534

@dsaxton dsaxton added the Indexing and Regression labels and removed the Bug and Needs Triage labels Aug 16, 2020
@jolespin
Author

jolespin commented Aug 16, 2020 via email

Is there an install I can use with the update in the referenced Issue? If not, I can just downgrade my version.

@dsaxton
Member

dsaxton commented Aug 16, 2020

From the discussion it looks like the plan is to revert the change that caused this regression and perhaps backport to 1.1.0, but I don't know when that will happen.

cc @simonjayhawkins

@simonjayhawkins
Member

I think that there is agreement on restoring the 1.0.5 behaviour, but that is not the same as reverting the PR. cc @jbrockmendel

@simonjayhawkins
Member

> Thanks @jolespin, might be the same underlying issue as #35534

The same PR, #31112, is responsible for this regression:

f19035d is the first bad commit
commit f19035d
Author: jbrockmendel
Date: Sat Jan 18 07:49:03 2020 -0800

REF: fix calls to Index.get_value (#31112)
