Should Groupby.sum modify _selected_obj? #32468

Closed
MarcoGorelli opened this issue Mar 5, 2020 · 2 comments
Labels
API Design Groupby Needs Discussion Requires discussion from core team before further action

Comments

@MarcoGorelli
Member

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"A": ["foo"] * 3 + ["bar"] * 3, "B": [1] * 6})

In [3]: g = df.groupby("A")

In [4]: g._selected_obj
Out[4]: 
     A  B
0  foo  1
1  foo  1
2  foo  1
3  bar  1
4  bar  1
5  bar  1

In [5]: g.sum()
Out[5]: 
     B
A     
bar  3
foo  3

In [6]: g._selected_obj
Out[6]: 
   B
0  1
1  1
2  1
3  1
4  1
5  1

Problem description

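Calling g.sum() changes g._selected_obj: before the call it holds both columns A and B (Out[4]), and afterwards only B (Out[6]). Noticed this while working on #32332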

Expected Output

I wasn't expecting g.sum() to have side effects; I expected g._selected_obj to be the same before and after the call
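
The expectation above can be sketched as a small before/after check. This is only an illustration, not code from pandas or from the fix: `selected_columns` is a hypothetical helper, and `_selected_obj` is a private attribute, so it is looked up defensively in case it is renamed or removed. On a pandas build that includes the fix referenced later in this thread, the two snapshots would be expected to match.

```python
import pandas as pd


def selected_columns(grouped):
    """Return the column labels of the private ``_selected_obj``, or None.

    Hypothetical helper for illustration. ``_selected_obj`` is private
    pandas API, so a missing attribute is tolerated rather than assumed.
    """
    obj = getattr(grouped, "_selected_obj", None)
    cols = getattr(obj, "columns", None)
    return None if cols is None else list(cols)


# Same data as in the code sample above.
df = pd.DataFrame({"A": ["foo"] * 3 + ["bar"] * 3, "B": [1] * 6})
g = df.groupby("A")

before = selected_columns(g)  # snapshot taken before aggregating
result = g.sum()              # the call suspected of having a side effect
after = selected_columns(g)   # snapshot taken after aggregating

# True when the aggregation left the selected object's columns alone.
unchanged = before == after

print("before:   ", before)
print("after:    ", after)
print("unchanged:", unchanged)
```

On the build reported in this issue, the `before` and `after` snapshots would differ, which is the side effect being described.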

Output of pd.show_versions()

In [7]: pd.show_versions()
/home/SERILOCAL/m.gorelli/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/fastparquet/dataframe.py:5: FutureWarning: pandas.core.index is deprecated and will be removed in a future version. The public classes are available in the top-level namespace.
from pandas.core.index import CategoricalIndex, RangeIndex, Index, MultiIndex

INSTALLED VERSIONS

commit : f7bed05
python : 3.7.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-88-generic
Version : #88-Ubuntu SMP Tue Feb 11 20:11:34 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.1.0.dev0+694.gf7bed0583
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 45.1.0.post20200119
Cython : 0.29.14
pytest : 5.3.4
hypothesis : 5.3.0
sphinx : 2.3.1
blosc : 1.8.3
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.1
fastparquet : 0.3.2
gcsfs : None
matplotlib : 3.1.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.1
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
pyxlsb : None
s3fs : 0.4.0
scipy : 1.4.1
sqlalchemy : 1.3.12
tables : 3.6.1
tabulate : 0.8.6
xarray : 0.14.1
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.47.0

@jreback
Contributor

jreback commented Mar 5, 2020

There are several open issues about this already, and a possible PR.

Feel free to link this as a duplicate.

@jbrockmendel jbrockmendel added API Design Groupby Needs Discussion Requires discussion from core team before further action labels Jun 7, 2020
@MarcoGorelli
Member Author

Sorry for letting this linger. Indeed, it was fixed by #35314
