Multiple aggregations with DatetimeIndexResamplerGroupby raises #18970

Closed
TomAugspurger opened this issue Dec 28, 2017 · 7 comments
Labels
Datetime Datetime data dtype Groupby Resample resample method Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@TomAugspurger
Contributor

On master, calling groupby().resample().agg() with multiple aggfuncs fails.

In [13]: df = pd.DataFrame({"A": pd.to_datetime(['2015', '2017']), "B": [1, 1]})

In [14]: df
Out[14]:
           A  B
0 2015-01-01  1
1 2017-01-01  1

In [15]: df.set_index("A").groupby([0, 0]).resample("AS")
Out[15]: DatetimeIndexResamplerGroupby [freq=<YearBegin: month=1>, axis=0, closed=left, label=left, convention=e, base=0]

In [16]: df.set_index("A").groupby([0, 0]).resample("AS").agg(['sum', 'count'])
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-16-5f1c18a8d4ac> in <module>()
----> 1 df.set_index("A").groupby([0, 0]).resample("AS").agg(['sum', 'count'])

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/resample.py in aggregate(self, arg, *args, **kwargs)
    339
    340         self._set_binner()
--> 341         result, how = self._aggregate(arg, *args, **kwargs)
    342         if result is None:
    343             result = self._groupby_and_aggregate(arg,

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
    538             return self._aggregate_multiple_funcs(arg,
    539                                                   _level=_level,
--> 540                                                   _axis=_axis), None
    541         else:
    542             result = None

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate_multiple_funcs(self, arg, _level, _axis)
    583                 try:
    584                     colg = self._gotitem(col, ndim=1, subset=obj[col])
--> 585                     results.append(colg.aggregate(arg))
    586                     keys.append(col)
    587                 except (TypeError, DataError):

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/resample.py in aggregate(self, arg, *args, **kwargs)
    339
    340         self._set_binner()
--> 341         result, how = self._aggregate(arg, *args, **kwargs)
    342         if result is None:
    343             result = self._groupby_and_aggregate(arg,

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
    538             return self._aggregate_multiple_funcs(arg,
    539                                                   _level=_level,
--> 540                                                   _axis=_axis), None
    541         else:
    542             result = None

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate_multiple_funcs(self, arg, _level, _axis)
    582             for col in obj:
    583                 try:
--> 584                     colg = self._gotitem(col, ndim=1, subset=obj[col])
    585                     results.append(colg.aggregate(arg))
    586                     keys.append(col)

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _gotitem(self, key, ndim, subset)
    675                        for attr in self._attributes])
    676         self = self.__class__(subset,
--> 677                               groupby=self._groupby[key],
    678                               parent=self,
    679                               **kwargs)

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in __getitem__(self, key)
    241         if self._selection is not None:
    242             raise Exception('Column(s) {selection} already selected'
--> 243                             .format(selection=self._selection))
    244
    245         if isinstance(key, (list, tuple, ABCSeries, ABCIndexClass,

Exception: Column(s) B already selected

A single aggfunc is OK.
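
Since a single aggfunc works, one possible workaround is to run each aggregation on its own and concatenate the results. This is only a sketch under assumptions, not something proposed by the maintainers in this thread: the helper name `resample_multi_agg` is hypothetical, and `"YS"` is used instead of `"AS"` because recent pandas versions deprecate the `"AS"` alias for the same year-start frequency.

```python
import pandas as pd

df = pd.DataFrame({"A": pd.to_datetime(["2015", "2017"]), "B": [1, 1]})


def resample_multi_agg(frame, keys, freq, funcs):
    """Hypothetical helper: apply each aggfunc separately, then combine."""
    # Each single-function call avoids the list-of-aggfuncs code path.
    parts = {
        func: frame.groupby(keys).resample(freq).agg(func) for func in funcs
    }
    # The dict keys become the outer column level: (func, column).
    combined = pd.concat(parts, axis=1)
    # Reorder to (column, func) so the functions are grouped under each column.
    return combined.swaplevel(axis=1).sort_index(axis=1)


result = resample_multi_agg(df.set_index("A"), [0, 0], "YS", ["sum", "count"])
print(result)
```

Note that the columns come out sorted alphabetically by function name within each column, which may differ from the order given in `funcs`.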

@TomAugspurger TomAugspurger added this to the Next Major Release milestone Dec 28, 2017
@jreback
Contributor

jreback commented Dec 28, 2017

This is the same as #15072, but with slightly different end effects.

@jreback jreback added Resample resample method Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Dec 28, 2017
@TomAugspurger
Contributor Author

Whoops, yep. I'll add this example to that issue.

@SeedyL

SeedyL commented Jan 16, 2019

A reference to this issue was added to #15072 as noted above, i.e. "this is the same as #15072, but slightly different end effects".
#15072 was fixed in #21323, but this issue still seems to exist. I'm using 0.23.4 and still have this problem.

df.groupby('City').resample('D').agg('mean') # works fine

df.groupby('City').resample('D').agg(['mean', 'count']) # doesn't
...
Exception: Column(s) City already selected

Any chance someone can have a look at this?

@jreback
Contributor

jreback commented Jan 16, 2019

The fix is in 0.24.0, which is not yet released.

@SeedyL

SeedyL commented Jan 16, 2019

Thanks. I got my dates and major releases confused.

@TrentonBush

When I run the same code as the OP, I get a RecursionError. I'm on pandas 1.4.2, Python 3.9.

In [40]: pd.__version__
Out[40]: '1.4.2'

In [41]: df = pd.DataFrame({"A": pd.to_datetime(['2015', '2017']), "B": [1, 1]})

In [42]: df
Out[42]:
           A  B
0 2015-01-01  1
1 2017-01-01  1

In [43]: df.set_index("A").groupby([0, 0]).resample("AS")
Out[43]: <pandas.core.resample.DatetimeIndexResamplerGroupby object at 0x7feef4038640>

In [44]: df.set_index("A").groupby([0, 0]).resample("AS").agg(['sum', 'count'])
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
Input In [44], in <cell line: 1>()
----> 1 df.set_index("A").groupby([0, 0]).resample("AS").agg(['sum', 'count'])

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/resample.py:347, in Resampler.aggregate(self, func, *args, **kwargs)
    338 @doc(
    339     _shared_docs["aggregate"],
    340     see_also=_agg_see_also_doc,
   (...)
    344 )
    345 def aggregate(self, func=None, *args, **kwargs):
--> 347     result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
    348     if result is None:
    349         how = func

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:171, in Apply.agg(self)
    168     return self.agg_dict_like()
    169 elif is_list_like(arg):
    170     # we require a list, but not a 'str'
--> 171     return self.agg_list_like()
    173 if callable(arg):
    174     f = com.get_cython_func(arg)

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:375, in Apply.agg_list_like(self)
    368 try:
    369     # Capture and suppress any warnings emitted by us in the call
    370     # to agg below, but pass through any warnings that were
    371     # generated otherwise.
    372     # This is necessary because of https://bugs.python.org/issue29672
    373     # See GH #43741 for more details
    374     with warnings.catch_warnings(record=True) as record:
--> 375         new_res = colg.aggregate(arg)
    376     if len(record) > 0:
    377         match = re.compile(depr_nuisance_columns_msg.format(".*"))

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/resample.py:347, in Resampler.aggregate(self, func, *args, **kwargs)
    338 @doc(
    339     _shared_docs["aggregate"],
    340     see_also=_agg_see_also_doc,
   (...)
    344 )
    345 def aggregate(self, func=None, *args, **kwargs):
--> 347     result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
    348     if result is None:
    349         how = func

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:171, in Apply.agg(self)
    168     return self.agg_dict_like()
    169 elif is_list_like(arg):
    170     # we require a list, but not a 'str'
--> 171     return self.agg_list_like()
    173 if callable(arg):
    174     f = com.get_cython_func(arg)

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:375, in Apply.agg_list_like(self)
    368 try:
    369     # Capture and suppress any warnings emitted by us in the call
    370     # to agg below, but pass through any warnings that were
    371     # generated otherwise.
    372     # This is necessary because of https://bugs.python.org/issue29672
    373     # See GH #43741 for more details
    374     with warnings.catch_warnings(record=True) as record:
--> 375         new_res = colg.aggregate(arg)
    376     if len(record) > 0:
    377         match = re.compile(depr_nuisance_columns_msg.format(".*"))

    [... skipping similar frames: Apply.agg at line 171 (985 times), Resampler.aggregate at line 347 (985 times), Apply.agg_list_like at line 375 (984 times)]

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:375, in Apply.agg_list_like(self)
    368 try:
    369     # Capture and suppress any warnings emitted by us in the call
    370     # to agg below, but pass through any warnings that were
    371     # generated otherwise.
    372     # This is necessary because of https://bugs.python.org/issue29672
    373     # See GH #43741 for more details
    374     with warnings.catch_warnings(record=True) as record:
--> 375         new_res = colg.aggregate(arg)
    376     if len(record) > 0:
    377         match = re.compile(depr_nuisance_columns_msg.format(".*"))

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/resample.py:347, in Resampler.aggregate(self, func, *args, **kwargs)
    338 @doc(
    339     _shared_docs["aggregate"],
    340     see_also=_agg_see_also_doc,
   (...)
    344 )
    345 def aggregate(self, func=None, *args, **kwargs):
--> 347     result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
    348     if result is None:
    349         how = func

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:171, in Apply.agg(self)
    168     return self.agg_dict_like()
    169 elif is_list_like(arg):
    170     # we require a list, but not a 'str'
--> 171     return self.agg_list_like()
    173 if callable(arg):
    174     f = com.get_cython_func(arg)

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/apply.py:367, in Apply.agg_list_like(self)
    365 indices = []
    366 for index, col in enumerate(selected_obj):
--> 367     colg = obj._gotitem(col, ndim=1, subset=selected_obj.iloc[:, index])
    368     try:
    369         # Capture and suppress any warnings emitted by us in the call
    370         # to agg below, but pass through any warnings that were
    371         # generated otherwise.
    372         # This is necessary because of https://bugs.python.org/issue29672
    373         # See GH #43741 for more details
    374         with warnings.catch_warnings(record=True) as record:

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/indexing.py:961, in _LocationIndexer.__getitem__(self, key)
    959     if self._is_scalar_access(key):
    960         return self.obj._get_value(*key, takeable=self._takeable)
--> 961     return self._getitem_tuple(key)
    962 else:
    963     # we by definition only have the 0th axis
    964     axis = self.axis or 0

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/indexing.py:1458, in _iLocIndexer._getitem_tuple(self, tup)
   1456 def _getitem_tuple(self, tup: tuple):
-> 1458     tup = self._validate_tuple_indexer(tup)
   1459     with suppress(IndexingError):
   1460         return self._getitem_lowerdim(tup)

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/indexing.py:769, in _LocationIndexer._validate_tuple_indexer(self, key)
    767 for i, k in enumerate(key):
    768     try:
--> 769         self._validate_key(k, i)
    770     except ValueError as err:
    771         raise ValueError(
    772             "Location based indexing can only have "
    773             f"[{self._valid_types}] types"
    774         ) from err

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/indexing.py:1344, in _iLocIndexer._validate_key(self, key, axis)
   1343 def _validate_key(self, key, axis: int):
-> 1344     if com.is_bool_indexer(key):
   1345         if hasattr(key, "index") and isinstance(key.index, Index):
   1346             if key.index.inferred_type == "integer":

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/common.py:133, in is_bool_indexer(key)
    105 def is_bool_indexer(key: Any) -> bool:
    106     """
    107     Check whether `key` is a valid boolean indexer.
    108 
   (...)
    131         and convert to an ndarray.
    132     """
--> 133     if isinstance(key, (ABCSeries, np.ndarray, ABCIndex)) or (
    134         is_array_like(key) and is_extension_array_dtype(key.dtype)
    135     ):
    136         if key.dtype == np.object_:
    137             key = np.asarray(key)

File ~/miniconda3/envs/gsod/lib/python3.9/site-packages/pandas/core/dtypes/generic.py:45, in create_pandas_abc_type.<locals>._check(cls, inst)
     43 @classmethod  # type: ignore[misc]
     44 def _check(cls, inst) -> bool:
---> 45     return getattr(inst, attr, "_typ") in comp

RecursionError: maximum recursion depth exceeded while calling a Python object

@alexlokhov

Same here, getting a recursion error.
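
Because the reports above differ by version (an `Exception` on 2017 master, a `RecursionError` on 1.4.2), a small probe can show how a given install behaves. This is a minimal sketch: the function name `probe_multi_agg` is hypothetical, and `"YS"` stands in for the `"AS"` alias used in the original report, which recent pandas versions deprecate.

```python
import pandas as pd


def probe_multi_agg():
    """Hypothetical helper: report how list-style agg behaves on this install."""
    df = pd.DataFrame({"A": pd.to_datetime(["2015", "2017"]), "B": [1, 1]})
    resampler = df.set_index("A").groupby([0, 0]).resample("YS")
    try:
        resampler.agg(["sum", "count"])
    except Exception as exc:
        # RecursionError is a subclass of Exception, so it is reported here too.
        return type(exc).__name__
    return "ok"


print(pd.__version__, probe_multi_agg())
```

Including the printed line in a follow-up comment makes it clear which failure mode, if any, applies to the version in use.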

5 participants