-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep subclassing in apply #19823
Keep subclassing in apply #19823
Conversation
When generating new objects from apply, it calls the self.obj._constructor, self.obj._constructior_sliced instead of DataFrame, Series; keeping subclassing.
Hello @jaumebonet! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on February 23, 2018 at 08:49 Hours UTC |
Thanks - needs some tests, and it looks like it broke something for sparse, otherwise your approach seems reasonable at a glance. |
Hi @chris-b1, |
can you add some tests in pandas/tests/frame/test_sub_class (and for Series as well) |
also pls add a release note, other enhancements section is ok |
Codecov Report
@@ Coverage Diff @@
## master #19823 +/- ##
==========================================
+ Coverage 91.58% 91.61% +0.02%
==========================================
Files 150 150
Lines 48908 48906 -2
==========================================
+ Hits 44792 44804 +12
+ Misses 4116 4102 -14
Continue to review full report at Codecov.
|
Hi @jreback, As per your suggestion, I also added a bullet point in the other enhancement section. |
doc/source/whatsnew/v0.23.0.txt
Outdated
@@ -295,6 +295,7 @@ Other Enhancements | |||
- ``IntervalIndex.astype`` now supports conversions between subtypes when passed an ``IntervalDtype`` (:issue:`19197`) | |||
- :class:`IntervalIndex` and its associated constructor methods (``from_arrays``, ``from_breaks``, ``from_tuples``) have gained a ``dtype`` parameter (:issue:`19262`) | |||
- Added :func:`SeriesGroupBy.is_monotonic_increasing` and :func:`SeriesGroupBy.is_monotonic_decreasing` (:issue:`17015`) | |||
- :func:``DataFrame.apply`` keeps provides the specified ``Series`` subclass when `DataFrame._constructor_sliced`` is defined (:issue:`19822`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use a private property here, just say when Series
and ``DataFrame``` are subclassed
['Mary', 'Bo', 'weight', 150]], | ||
columns=['first', 'last', 'variable', 'value']) | ||
|
||
df.apply(lambda x: check_row_subclass(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to check the result of this using tm.assert_frame_equal
, just use a simple example of an applied function, testing both returning a list, e.g. lambda x: [1, 2, 3]
and a sub-classed series
pandas/tests/frame/test_subclass.py
Outdated
[1, 2, 3], | ||
[1, 2, 3]]) | ||
|
||
expected3 = DesignSeries([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this needs to be updated.
pandas/tests/frame/test_subclass.py
Outdated
expected3 = tm.SubclassedSeries([ | ||
[1, 2, 3], | ||
[1, 2, 3], | ||
[1, 2, 3], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than numbering these, can you just do one after the other and use
result =
expected =
doc/source/whatsnew/v0.23.0.txt
Outdated
@@ -295,8 +295,10 @@ Other Enhancements | |||
- ``IntervalIndex.astype`` now supports conversions between subtypes when passed an ``IntervalDtype`` (:issue:`19197`) | |||
- :class:`IntervalIndex` and its associated constructor methods (``from_arrays``, ``from_breaks``, ``from_tuples``) have gained a ``dtype`` parameter (:issue:`19262`) | |||
- Added :func:`SeriesGroupBy.is_monotonic_increasing` and :func:`SeriesGroupBy.is_monotonic_decreasing` (:issue:`17015`) | |||
- :func:``DataFrame.apply`` keeps the specified ``Series`` subclass when ``Series`` and ``DataFrame`` subclasses are defined (:issue:`19822`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this not so clear. It is about the Series type that is passed to the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
When apply
ing a function to a subclassed DataFrame
I realised I could not call the specific methods of the subclassed Series
that this DataFrame
was supossed to generate through _constructor_sliced
. As far as I understood, the subclassing of the Series
was lost during apply
.
With these changes the subclass is now kept inside apply
, so method of the expected Series
subclass can be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, can you try to clarify the text a bit? Eg something like "For subclassed DataFrames, DataFrame.apply will now preserve the Series subclass (if defined) when passing the data to the applied function" (but adapt as you like)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! I like it as this! :)
will change it!
thanks! |
When generating new objects on apply, it calls
self.obj._constructor
andself.obj._constructior_sliced
instead of:DataFrame
andSeries
; keeping subclassing.git diff upstream/master -u -- "*.py" | flake8 --diff
DataFrame.apply