[ArrayManager] TST: run (+fix/skip) pandas/tests/indexing tests #40325

jorisvandenbossche · 2021-03-09T14:37:02Z

This extracts part of the test changes from #39578. To keep the size manageable, I am splitting the PRs; this one covers pandas/tests/indexing. The pandas/tests/frame/indexing tests are handled in #40323.

jorisvandenbossche · 2021-03-09T16:59:19Z

One of the main causes of behaviour differences between BlockManager and ArrayManager is that ArrayManager always takes the "split" path in the indexing code, and the split path behaves differently from the single-block path in several cases. That is true for both BlockManager and ArrayManager, but because many of the tests use simple dataframes, the default tests with BlockManager don't catch this.

For example, setting values with .loc[:, col] still overwrites the column instead of updating inplace when possible.
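To make the "simple dataframes" point concrete, here is a minimal sketch of the two kinds of frames involved. The helper `has_mixed_dtypes` is a hypothetical name introduced only for illustration (it is not part of pandas), and mixed dtypes are only a rough proxy for "stored in more than one block"; the actual path taken is decided inside pandas' indexing code.

```python
import pandas as pd


def has_mixed_dtypes(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True when the columns do not all share one dtype.

    Assumption: a mixed-dtype frame cannot be held in a single block, so it is
    the kind of frame that exercises the "split" path with BlockManager.
    """
    return len({str(dtype) for dtype in df.dtypes}) > 1


# All-integer frame: the kind of "simple" frame many existing tests use.
homogeneous = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Integer and float columns: needs more than one block.
mixed = pd.DataFrame({"A": [1, 2, 3], "B": [0.1, 0.2, 0.3]})

print(has_mixed_dtypes(homogeneous), has_mixed_dtypes(mixed))
```

Adding a mixed-dtype variant of a test is one way to cover the split path with the default manager as well.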

I still need to open issues for the cases that also fail for BlockManager.

jbrockmendel · 2021-03-09T18:59:24Z

> So one of the main causes for differences in behaviour between BlockManager and ArrayManager, is that ArrayManager always takes the "split" path in the indexing code

FWIW I've got a branch nearing readiness that makes setitem_with_indexer take the split_path iff it is 2D, which will hopefully make it somewhat easier to keep these behaviors in sync.

pandas/tests/indexing/test_indexing.py

jbrockmendel · 2021-03-11T16:33:00Z

pandas/tests/indexing/test_loc.py

        expected = DataFrame({"A": ser})
        tm.assert_frame_equal(df, expected)

+        # with mixed dataframe


separate test

jbrockmendel · 2021-03-12T17:25:30Z

pandas/tests/indexing/test_iloc.py

    @pytest.mark.parametrize("box", [array, Series])
-    def test_iloc_setitem_ea_inplace(self, frame_or_series, box):
+    def test_iloc_setitem_ea_inplace(self, frame_or_series, box, using_array_manager):


Why both the skip and the fixture? I'd expect one or the other.

Because I already rewrote the test so that it could pass for ArrayManager (accessing .values on a DataFrame will never return a view of the data with AM, so that part needed to be rewritten); it is still failing only for other reasons.
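A small sketch of why `.values` on a DataFrame is not a reliable way to reach the underlying data. It uses a mixed-dtype frame under the default manager as a stand-in, on the assumption that ArrayManager, which stores every column separately, behaves similarly for any DataFrame; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Two columns with different dtypes, so they cannot live in one 2D array.
mixed = pd.DataFrame(
    {
        "a": np.array([1, 2, 3], dtype="int64"),
        "b": np.array([0.5, 1.5, 2.5], dtype="float64"),
    }
)

# .values has to build a new 2D float64 array from the separate columns ...
as_2d = mixed.values

# ... so it cannot share memory with the int64 data backing column "a".
column_data = mixed["a"].to_numpy()
shares = np.shares_memory(as_2d, column_data)
print(as_2d.dtype, shares)
```

A test that wants to check whether a setitem happened inplace would therefore need to look at per-column data rather than at the 2D `.values` result.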

jbrockmendel · 2021-03-16T21:44:26Z

pandas/tests/indexing/test_loc.py

+        df.loc[[4, 3, 5], "A"] = np.array([1, 2, 3], dtype="int64")
+        # TODO with "split" path we still actually overwrite the column
+        # and therefore don't take the order of the indexer into account
+        ser = Series([1, 2, 3], index=[3, 5, 4], dtype="int64")


This is weird, right? Why wouldn't we expect ser = Series([2, 3, 1], index=[3, 5, 4])?

(actual question: is my intuition wrong, or is this a bug?)

Yes, your intuition is correct; this is clearly a bug. It wasn't caught because the original test (still above) used a homogeneous dataframe. That's what the TODO was meant to explain.
I have now opened an issue for it (#40480; apparently it's actually a regression on master) and will update the comment with a link to that issue.

And I made it an xfail instead of testing the wrong result.
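The ordering discussed in this thread can be worked out without relying on the setitem behaviour itself. The sketch below pairs each value with the label it was assigned to and then reads the values back in the frame's row order; the frame index `[3, 5, 4]` is taken from the expected Series in the diff, and the variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Labels and values as used in df.loc[[4, 3, 5], "A"] = ...
indexer = [4, 3, 5]
values = np.array([1, 2, 3], dtype="int64")

# Row order of the frame, taken from the expected Series in the diff.
frame_index = [3, 5, 4]

# Pair each value with its target label: 4 -> 1, 3 -> 2, 5 -> 3.
by_label = pd.Series(values, index=indexer)

# Read the values back in the frame's own row order.
expected = by_label.reindex(frame_index)
print(expected.tolist())  # [2, 3, 1]
```

This matches the result the reviewer expected, Series([2, 3, 1], index=[3, 5, 4]), rather than the positional [1, 2, 3] that the overwritten column produced.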

jbrockmendel · 2021-03-16T21:45:11Z

one question, LGTM pending green

[ArrayManager] TST: run (+fix/skip) pandas/tests/indexing tests

a7a9de2

jorisvandenbossche added the Indexing and Internals labels Mar 9, 2021

jorisvandenbossche requested a review from jbrockmendel March 9, 2021 14:37

jorisvandenbossche mentioned this pull request Mar 9, 2021

[ArrayManager] TST: run (+fix/skip) pandas/tests/series/indexing tests #40326

Merged

jbrockmendel reviewed Mar 9, 2021

pandas/tests/indexing/test_indexing.py (resolved)

jorisvandenbossche added 3 commits March 10, 2021 10:55

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

9cebd40

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

439da6f

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

1031367

jbrockmendel reviewed Mar 11, 2021

jorisvandenbossche added 4 commits March 12, 2021 10:16

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

1bc23bc

split test

0882d54

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

e5f7f2f

typo

7c56048

jbrockmendel reviewed Mar 12, 2021

jorisvandenbossche added 3 commits March 15, 2021 08:35

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

2a2259b

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

89c59f1

wrong dimensional indexer

afc4371

jbrockmendel reviewed Mar 16, 2021

jorisvandenbossche added 3 commits March 17, 2021 11:00

Merge remote-tracking branch 'upstream/master' into am-indexing-tests-2

8a8e897

update comment

700407a

change into xfail

70a3e25

jorisvandenbossche mentioned this pull request Mar 17, 2021

BUG: setting values with array in mixed DataFrame disregards order of indexer #40480

Closed

jorisvandenbossche merged commit 8945a42 into pandas-dev:master Mar 17, 2021

jorisvandenbossche deleted the am-indexing-tests-2 branch March 17, 2021 12:02

This was referenced Apr 22, 2021

WIP [ArrayManager] API: setitem to set new columns / loc+iloc to update inplace #39578

Closed

REF: move check for disallowed bool arithmetic ops out of numexpr-related expressions.py #41161

Merged

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021

[ArrayManager] TST: run (+fix/skip) pandas/tests/indexing tests (pand…

05461ec

…as-dev#40325)
