POC: 2D EAs via composition #27015
Hello @jbrockmendel! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-06-27 15:47:41 UTC
So high-level, this makes Block.values a ReshapeableArray, which is an ExtensionArray implementing the 2-D interface. Then a DataFrame is made up of a collection of Blocks whose values are all reshapeable, either by being an ndarray, or an ExtensionArray with _allows_2d = True?
Is this your preferred approach for fixing Block.shape == Block.values.shape going forward?
@@ -105,6 +105,9 @@ def _ensure_data(values, dtype=None):
else:
# Datetime
from pandas import DatetimeIndex
from pandas.core.arrays import unwrap_reshapeable
Is this for something like factorize(EA)? Shouldn't the EA (or your wrapper) do this in _values_for_factorize?
_values_for_factorize might do it, I'll check. But at this point we don't necessarily have an EA, so we need a conditional un-wrapper.
Making DTA validate that its inputs are 1D can be done separately from the rest of this, which should resolve this particular part of the diff.
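For readers following along, a conditional un-wrapper of the kind described above could look roughly like the sketch below. This is an illustration only, not the PR's code: `Wrapper2D` and `maybe_unwrap` are hypothetical names, and the actual `unwrap_reshapeable` helper may behave differently.

```python
class Wrapper2D:
    """Hypothetical wrapper: holds a 1D array plus the 2D shape it presents."""

    def __init__(self, array, shape):
        self.array = array
        self.shape = shape


def maybe_unwrap(values):
    """Return the underlying 1D array if `values` is wrapped, else `values` unchanged.

    The isinstance check is what makes the un-wrapping conditional: callers
    can pass an ndarray, a plain 1D array, or a wrapped one without branching.
    """
    if isinstance(values, Wrapper2D):
        return values.array
    return values


wrapped = Wrapper2D([1, 2, 3], shape=(1, 3))
plain = [4, 5, 6]

inner = maybe_unwrap(wrapped)      # the wrapped 1D data
untouched = maybe_unwrap(plain)    # passed through as-is
```

The point of the pass-through branch is that code such as `_ensure_data` can call the helper unconditionally, before it knows whether it is holding a wrapped array.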
Yes.
No, this is my second-best. First-best would be to require EAs to handle the (1, N) case themselves, so we wouldn't need this extra layer. But I definitely prefer this to the metaclass approach, which I wasn't able to get working at all (MRO issues).
Codecov Report
@@ Coverage Diff @@
## master #27015 +/- ##
==========================================
- Coverage 91.99% 41.9% -50.1%
==========================================
Files 180 181 +1
Lines 50774 51124 +350
==========================================
- Hits 46711 21422 -25289
- Misses 4063 29702 +25639
Codecov Report
@@ Coverage Diff @@
## master #27015 +/- ##
===========================================
- Coverage 92.03% 41.91% -50.13%
===========================================
Files 180 181 +1
Lines 50714 51086 +372
===========================================
- Hits 46675 21412 -25263
- Misses 4039 29674 +25635
Plenty of kludges and linting errors in here; I just want to push it to add composition to the discussion.
Instead of patching existing EAs, this introduces ReshapeableArray, which just wraps other EAs and implements reshape methods. EAs that do natively support 2D can set _allows_2d = True and avoid being wrapped.
In the process of getting this passing, I found a handful of new issues/bugs. I will try to push fixes for those independently.
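To make the composition idea concrete, here is a minimal, pure-Python sketch of a wrapper that stores a 1D array and tracks a 2D shape next to it. It is not the PR's implementation: `Toy1DArray`, `ReshapeableWrapper`, and `maybe_wrap` are hypothetical names, and the restriction to shapes `(N,)`, `(1, N)`, and `(N, 1)` is an assumption made for the example.

```python
class Toy1DArray:
    """Hypothetical stand-in for a 1D-only ExtensionArray."""

    _allows_2d = False  # flag name taken from the PR description

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        return self._data[i]


class ReshapeableWrapper:
    """Composition: hold the 1D array untouched and carry the shape separately."""

    def __init__(self, array, shape=None):
        n = len(array)
        shape = (n,) if shape is None else tuple(shape)
        # Assumption for this sketch: only 1D, single-row, or single-column shapes.
        if shape not in {(n,), (1, n), (n, 1)}:
            raise ValueError(f"cannot reshape array of size {n} into shape {shape}")
        self._1d = array
        self.shape = shape

    @property
    def ndim(self):
        return len(self.shape)

    def reshape(self, *shape):
        # Accept both reshape(1, n) and reshape((1, n)).
        if len(shape) == 1 and isinstance(shape[0], tuple):
            shape = shape[0]
        return ReshapeableWrapper(self._1d, shape)

    @property
    def T(self):
        return ReshapeableWrapper(self._1d, self.shape[::-1])

    def ravel(self):
        # The wrapped array is already 1D, so no copy is needed.
        return self._1d


def maybe_wrap(array):
    """Wrap only arrays that do not declare native 2D support."""
    if getattr(array, "_allows_2d", False):
        return array
    return ReshapeableWrapper(array)


arr = Toy1DArray([10, 20, 30])
row = maybe_wrap(arr).reshape(1, 3)
```

Because the wrapper never touches the wrapped array's data, existing 1D arrays need no changes; the cost is the extra layer that has to be unwrapped before calling into code expecting the original array.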