_metadata items of subclassed pd.Series are not propagated into corresponding SubclassedDataFrame #32860
Comments
Maybe related to #24685
@johannes-mueller the metadata item lives on the Series object. But when putting a Series in a DataFrame, pandas does not actually store the columns as Series objects, but as arrays in an internal data structure (the BlockManager). So when accessing a column of SubclassedDataFrame, a new SubclassedSeries is created (using the So what you want is right now not possible. Some options:
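The behaviour described in the comment above can be reproduced with a short sketch. The class names and the attribute name `myprop` follow the issue; the column name `a` and the sample values are arbitrary choices for illustration. The column returned by `df["a"]` is a freshly built `SubclassedSeries`, so the value set on the original object is not carried over.

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    # Names listed here are treated as instance metadata by pandas.
    _metadata = ["myprop"]

    @property
    def _constructor(self):
        return SubclassedSeries

    @property
    def _constructor_expanddim(self):
        return SubclassedDataFrame


class SubclassedDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return SubclassedDataFrame

    @property
    def _constructor_sliced(self):
        return SubclassedSeries


s = SubclassedSeries([1, 2, 3])
s.myprop = "foo"

# The frame keeps only the values of `s`, not the Series object itself.
df = SubclassedDataFrame({"a": s})

# Column access builds a new SubclassedSeries from the stored values.
col = df["a"]
print(type(col).__name__)
print(hasattr(col, "myprop"))
```

The original object `s` still carries `myprop`; only the object rebuilt from the frame lacks it.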
attrs doesn't currently propagate through, since
Issue #19850 seems related (thanks for that metadata label), which is about keeping around the metadata when going the other way, from a SubclassedDataFrame to a SubclassedSeries. Adapting the workaround posted there #19850 (comment) solved this problem for me, although I created my SubclassedDataFrame using the First, add the
Second, call
Then use:
to obtain what you expected:
However, for me, combining this workaround with initializing with a dict as you did still behaves unexpectedly. It stopped the AttributeError, but sets the property to None:
gives:
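The code from the comment above did not survive on this page, so the following is only a sketch of how a workaround in that spirit might look, not the commenter's exact code. It assumes that the frame subclass declares the same `_metadata` name as the series subclass and that `_constructor_sliced` returns a function calling `__finalize__` with the parent frame. The helper name `build_series` and the sample data are hypothetical.

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    _metadata = ["myprop"]

    @property
    def _constructor(self):
        return SubclassedSeries


class SubclassedDataFrame(pd.DataFrame):
    # Assumption: the frame declares the same metadata name as the series,
    # so that __finalize__ has something to copy from.
    _metadata = ["myprop"]

    @property
    def _constructor(self):
        return SubclassedDataFrame

    @property
    def _constructor_sliced(self):
        # Hypothetical helper: build the column, then copy metadata
        # from this frame onto it.
        def build_series(*args, **kwargs):
            return SubclassedSeries(*args, **kwargs).__finalize__(self)

        return build_series


# Metadata set on the frame is visible on its columns.
df = SubclassedDataFrame({"a": [1, 2, 3]})
df.myprop = "foo"
print(df["a"].myprop)

# Metadata set only on the input series is still not stored in the frame,
# so the column reports None instead of raising AttributeError.
s = SubclassedSeries([1, 2, 3])
s.myprop = "bar"
df2 = SubclassedDataFrame({"a": s})
print(df2["a"].myprop)
```

This copies metadata from the frame to its columns, which is the direction discussed in #19850. It does not make the frame remember per-column values, which matches the `None` result reported for the dict-initialized case.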
@Flix6x Somehow, returning a function instead of a class fails at line 396 of pandas/pandas/core/reshape/concat.py (lines 394 to 398 in 8f6ec1e) with the error AttributeError: 'function' object has no attribute '_get_axis_number'. Any workarounds?
The issue happens on repr when the number of lines is high enough and concat gets called. One workaround is to set the missing attribute on the function
It seems that this is the only place where a function is called from
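Putting the attribute that `concat` looks up onto the returned function might look like the sketch below. The commenter's own code is not preserved here, so the helper name `build_series` and the choice to reuse the classmethod `SubclassedSeries._get_axis_number` are assumptions. Whether this is needed at all depends on the pandas version, since the thread treats the lookup as a bug to be removed.

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    _metadata = ["myprop"]

    @property
    def _constructor(self):
        return SubclassedSeries


class SubclassedDataFrame(pd.DataFrame):
    _metadata = ["myprop"]

    @property
    def _constructor(self):
        return SubclassedDataFrame

    @property
    def _constructor_sliced(self):
        def build_series(*args, **kwargs):
            return SubclassedSeries(*args, **kwargs).__finalize__(self)

        # Assumption: expose the classmethod that the failing code path
        # expected to find on a class, so the lookup no longer raises.
        build_series._get_axis_number = SubclassedSeries._get_axis_number
        return build_series


df = SubclassedDataFrame({"a": range(100)})
df.myprop = "foo"

sliced = df._constructor_sliced
print(sliced._get_axis_number(0))  # resolves the axis like the class would
print(df["a"].myprop)
```

The function now answers the same attribute lookup a class would, while column access keeps copying `myprop` from the frame.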
@samuelduchesne I would say that is a bug: we generally should not assume that It seems this was introduced as part of trying to avoid a DataFrame import in #34837. But so a PR to remove this usage of
@jorisvandenbossche Not sure what the usage should be replaced with in an eventual PR. Not familiar with
I opened a PR to fix this.
Code Sample
Problem description
_metadata items of pd.Series subclasses are not propagated when the SubclassedSeries object is put into a SubclassedDataFrame. I would expect myprop to be available in the new SubclassedDataFrame.
Expected Output
Output of pd.show_versions()
pandas : 1.0.2
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.0.0.post20200309
Cython : 0.29.15
pytest : 5.4.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : 0.15.0
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
numba : None