POC: 2D EAs via composition #27015

jbrockmendel · 2019-06-24T03:57:16Z

Plenty of kludges and linting errors in here; I just want to push it to add composition to the discussion.

Instead of patching existing EAs, this introduces ReshapeableArray, which just wraps other EAs and implements reshape methods. EAs that natively support 2D can set _allows_2d = True and avoid being wrapped.
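To make the composition idea concrete, here is a minimal sketch. It is not the code from this PR: `Simple1DArray`, `maybe_wrap`, and the `_wrapped` attribute are hypothetical names, and plain lists stand in for real ExtensionArrays. Only `ReshapeableArray` and `_allows_2d` come from the description above.

```python
class Simple1DArray:
    """Hypothetical stand-in for an ExtensionArray that only supports 1D."""

    _allows_2d = False

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)


class ReshapeableArray:
    """Wrap a 1D array and track a (possibly 2D) shape on top of it."""

    def __init__(self, values, shape=None):
        self._wrapped = values  # hypothetical attribute name
        shape = (len(values),) if shape is None else tuple(shape)
        size = 1
        for dim in shape:
            size *= dim
        if size != len(values):
            raise ValueError(f"cannot reshape {len(values)} elements into {shape}")
        self.shape = shape

    @property
    def ndim(self):
        return len(self.shape)

    def reshape(self, *shape):
        # Accept both reshape(1, 3) and reshape((1, 3)).
        if len(shape) == 1 and isinstance(shape[0], tuple):
            shape = shape[0]
        return type(self)(self._wrapped, shape)

    def ravel(self):
        # The wrapped array is already the flat 1D view of the data.
        return self._wrapped


def maybe_wrap(values):
    """Wrap only arrays that do not declare native 2D support."""
    if getattr(values, "_allows_2d", False):
        return values
    return ReshapeableArray(values)


arr = Simple1DArray([1, 2, 3])
wrapped = maybe_wrap(arr)
two_d = wrapped.reshape(1, 3)
```

The wrapped array never learns about the second dimension; only the wrapper carries the shape, which is why existing EAs would not need patching under this approach.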

In the process of getting this passing, I found a handful of new issues/bugs. I will try to push fixes for those independently.

pep8speaks · 2019-06-24T03:57:35Z

Hello @jbrockmendel! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-06-27 15:47:41 UTC

pandas/core/arrays/categorical.py

TomAugspurger

So high-level, this makes Block.values a ReshapeableArray, which is an ExtensionArray implementing the 2-D interface. Then a DataFrame is made up of a collection of Blocks whose values are all reshapeable, either by being an ndarray, or an ExtensionArray with _allows_2d = True?

Is this your preferred approach for fixing Block.shape == Block.values.shape going forward?

TomAugspurger · 2019-06-24T13:12:42Z

pandas/core/algorithms.py

@@ -105,6 +105,9 @@ def _ensure_data(values, dtype=None):
        else:
            # Datetime
            from pandas import DatetimeIndex
+            from pandas.core.arrays import unwrap_reshapeable


Is this for something like factorize(EA)? Shouldn't the EA (or your wrapper) do this in _values_for_factorize?

_values_for_factorize might do it, I'll check. But at this point we don't necessarily have an EA, so we need a conditional un-wrapper.

Making DTA validate that inputs are 1D can be done separately from the rest of this, which should resolve this particular part of the diff.
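A rough sketch of what a conditional un-wrapper could look like follows. The name `unwrap_reshapeable` appears in the diff above, but its body is not shown in this thread, so the implementation here is an assumption, as is the minimal `ReshapeableArray` stand-in with its `_wrapped` attribute.

```python
class ReshapeableArray:
    """Minimal stand-in for the wrapper; the real class is not shown here."""

    def __init__(self, values):
        self._wrapped = values  # hypothetical attribute name


def unwrap_reshapeable(values):
    """Return the wrapped array if `values` is a wrapper, else `values` unchanged.

    Being conditional matters because the caller may hold an ndarray, a list,
    or an unwrapped EA at this point.
    """
    if isinstance(values, ReshapeableArray):
        return values._wrapped
    return values


inner = ["2019-06-24", "2019-06-25"]
from_wrapper = unwrap_reshapeable(ReshapeableArray(inner))
passthrough = unwrap_reshapeable(inner)
```

Both calls return the same underlying object, so code such as `_ensure_data` could call the helper without first checking what kind of input it received.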

pandas/core/algorithms.py

jbrockmendel · 2019-06-24T15:37:32Z

> So high-level, this makes Block.values a ReshapeableArray, which is an ExtensionArray implementing the 2-D interface. Then a DataFrame is made up of a collection of Blocks whose values are all reshapeable, either by being an ndarray, or an ExtensionArray with _allows_2d = True?

Yes.

> Is this your preferred approach for fixing Block.shape == Block.values.shape going forward?

No, this is my second-best. First-best would be to require EAs to handle the (1, N) case themselves, so we wouldn't need this extra layer. But I definitely prefer this to the metaclass approach, which I wasn't able to get working at all (MRO issues).
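For comparison, the "first-best" option could look roughly like the sketch below, where the array itself accepts a (1, N) shape and no wrapper layer is needed. `Native2DArray` and the toy `Block` are hypothetical illustrations, not pandas classes; only the `_allows_2d` flag and the `Block.shape == Block.values.shape` invariant come from the discussion.

```python
class Native2DArray:
    """Hypothetical array that stores 1D data but accepts a (1, N) shape."""

    _allows_2d = True  # flag name taken from the PR description

    def __init__(self, data, shape=None):
        self._data = list(data)
        n = len(self._data)
        shape = (n,) if shape is None else tuple(shape)
        if shape not in ((n,), (1, n)):
            raise ValueError(f"only ({n},) and (1, {n}) are supported, got {shape}")
        self.shape = shape

    def reshape(self, shape):
        # Data is shared; only the reported shape changes.
        return type(self)(self._data, shape)


class Block:
    """Toy stand-in for an internals Block that defers to its values' shape."""

    def __init__(self, values):
        self.values = values

    @property
    def shape(self):
        return self.values.shape


values = Native2DArray(["a", "b", "c"]).reshape((1, 3))
block = Block(values)
```

With this design the invariant holds by construction, because the block has no shape of its own to drift out of sync with its values.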

codecov · 2019-06-25T04:07:41Z

Codecov Report

Merging #27015 into master will decrease coverage by 50.09%.
The diff coverage is 51.26%.

@@            Coverage Diff             @@
##           master   #27015      +/-   ##
==========================================
- Coverage   91.99%    41.9%   -50.1%     
==========================================
  Files         180      181       +1     
  Lines       50774    51124     +350     
==========================================
- Hits        46711    21422   -25289     
- Misses       4063    29702   +25639

Flag	Coverage Δ
#multiple	`?`
#single	`41.9% <51.26%> (-0.02%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexing.py	`53.64% <ø> (-39.85%)`	⬇️
pandas/core/groupby/ops.py	`19.67% <0%> (-76.33%)`	⬇️
pandas/core/groupby/generic.py	`14.74% <0%> (-74.59%)`	⬇️
pandas/core/generic.py	`38.18% <0%> (-56.03%)`	⬇️
pandas/core/arrays/base.py	`59.89% <100%> (-39.55%)`	⬇️
pandas/core/dtypes/concat.py	`53.55% <100%> (-43.04%)`	⬇️
pandas/core/arrays/categorical.py	`42.09% <100%> (-53.84%)`	⬇️
pandas/core/internals/concat.py	`72.48% <100%> (-24.01%)`	⬇️
pandas/io/formats/format.py	`50.63% <100%> (-47.28%)`	⬇️
pandas/core/arrays/datetimelike.py	`41.49% <100%> (-56.44%)`	⬇️
... and 154 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8ea2d08...6e4f207.

codecov · 2019-06-25T04:07:41Z

Codecov Report

Merging #27015 into master will decrease coverage by 50.12%.
The diff coverage is 53.89%.

@@             Coverage Diff             @@
##           master   #27015       +/-   ##
===========================================
- Coverage   92.03%   41.91%   -50.13%     
===========================================
  Files         180      181        +1     
  Lines       50714    51086      +372     
===========================================
- Hits        46675    21412    -25263     
- Misses       4039    29674    +25635

Flag	Coverage Δ
#multiple	`?`
#single	`41.91% <53.89%> (+0.04%)`	⬆️

Impacted Files	Coverage Δ
pandas/core/indexing.py	`53.64% <ø> (-39.66%)`	⬇️
pandas/core/groupby/ops.py	`19.67% <0%> (-76.33%)`	⬇️
pandas/core/groupby/generic.py	`14.74% <0%> (-74.59%)`	⬇️
pandas/core/generic.py	`38.18% <0%> (-56.03%)`	⬇️
pandas/core/arrays/base.py	`59.89% <100%> (-39.55%)`	⬇️
pandas/core/dtypes/concat.py	`53.58% <100%> (-43.46%)`	⬇️
pandas/core/arrays/categorical.py	`42.09% <100%> (-53.84%)`	⬇️
pandas/core/internals/concat.py	`73.04% <100%> (-23.81%)`	⬇️
pandas/io/formats/format.py	`50.63% <100%> (-47.28%)`	⬇️
pandas/core/arrays/datetimelike.py	`41.36% <100%> (-56.57%)`	⬇️
... and 154 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1452e71...f6b8d23.

jbrockmendel added 3 commits June 23, 2019 20:46

one more approach to 2d EA

a2aa121

Cleanup

6a48654

cleanups

a28a9b2

Cleanup; tests passing

9fb3edb

jreback reviewed Jun 24, 2019

pandas/core/arrays/categorical.py

TomAugspurger reviewed Jun 24, 2019

jbrockmendel mentioned this pull request Jun 24, 2019

TST: dont break ABCPandasArray checks #27014

Closed

cleanup [ci skip]

721be31

jbrockmendel commented Jun 24, 2019

pandas/core/algorithms.py

jbrockmendel added 3 commits June 24, 2019 14:21

cleanup remove unnecessary

162ad63

Clean up unreachable cases

5f09070

Merge branch 'master' of https://github.com/pandas-dev/pandas into ea2d

462edab

jbrockmendel mentioned this pull request Jun 25, 2019

BUG: Restrict DTA to 1D #27027

Merged

4 tasks

jbrockmendel added 3 commits June 24, 2019 18:59

tests passing, including pytables

3e6dca3

Merge branch 'master' of https://github.com/pandas-dev/pandas into ea2d

1d6b0e0

remove unnecessary

6e4f207

jbrockmendel added 6 commits June 25, 2019 08:34

Merge branch 'master' of https://github.com/pandas-dev/pandas into ea2d

4c1d493

flake8 fixup, parquet kludge

8515dc6

Merge branch 'master' of https://github.com/pandas-dev/pandas into ea2d

1ca81bf

add unrelated config

ef51d9a

Cleanup

d4d0dbd

use templates for pass-through methods

00516b8

gfyoung added the Enhancement and ExtensionArray (Extending pandas with custom dtypes or arrays) labels Jun 26, 2019

jbrockmendel mentioned this pull request Jun 26, 2019

BUG: Categorical.copy deep kwarg #27024

Closed

4 tasks

cleanup

3bc559a

Merge branch 'master' of https://github.com/pandas-dev/pandas into ea2d

f6b8d23

jbrockmendel closed this Jul 1, 2019

jbrockmendel deleted the ea2d branch July 1, 2019 21:25

jbrockmendel mentioned this pull request Jul 3, 2019

EA: support basic 2D operations #27142

Closed
