BUG Merge not behaving correctly when having MultiIndex with a single level (pandas-dev#53215)

* fix merge when MultiIndex with single level

* resolved conversations

* fixed code style
Charlie-XIAO authored and Yi Wei committed May 19, 2023
1 parent 50d5796 commit 0e6803a
Showing 3 changed files with 18 additions and 13 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -425,6 +425,7 @@ Reshaping
 ^^^^^^^^^
 - Bug in :func:`crosstab` when ``dropna=False`` would not keep ``np.nan`` in the result (:issue:`10772`)
 - Bug in :meth:`DataFrame.agg` and :meth:`Series.agg` on non-unique columns would return incorrect type when dist-like argument passed in (:issue:`51099`)
+- Bug in :meth:`DataFrame.merge` not merging correctly when having ``MultiIndex`` with single level (:issue:`52331`)
 - Bug in :meth:`DataFrame.stack` losing extension dtypes when columns is a :class:`MultiIndex` and frame contains mixed dtypes (:issue:`45740`)
 - Bug in :meth:`DataFrame.transpose` inferring dtype for object column (:issue:`51546`)
 - Bug in :meth:`Series.combine_first` converting ``int64`` dtype to ``float64`` and losing precision on very large integers (:issue:`51764`)
17 changes: 4 additions & 13 deletions pandas/core/reshape/merge.py
@@ -2267,23 +2267,14 @@ def _get_no_sort_one_missing_indexer(
 def _left_join_on_index(
     left_ax: Index, right_ax: Index, join_keys, sort: bool = False
 ) -> tuple[Index, npt.NDArray[np.intp] | None, npt.NDArray[np.intp]]:
-    if len(join_keys) > 1:
-        if not (
-            isinstance(right_ax, MultiIndex) and len(join_keys) == right_ax.nlevels
-        ):
-            raise AssertionError(
-                "If more than one join key is given then "
-                "'right_ax' must be a MultiIndex and the "
-                "number of join keys must be the number of levels in right_ax"
-            )
-
+    if isinstance(right_ax, MultiIndex):
         left_indexer, right_indexer = _get_multiindex_indexer(
             join_keys, right_ax, sort=sort
         )
     else:
-        jkey = join_keys[0]
-
-        left_indexer, right_indexer = _get_single_indexer(jkey, right_ax, sort=sort)
+        left_indexer, right_indexer = _get_single_indexer(
+            join_keys[0], right_ax, sort=sort
+        )

     if sort or len(left_ax) != len(left_indexer):
         # if asked to sort or there are 1-to-many matches
13 changes: 13 additions & 0 deletions pandas/tests/reshape/merge/test_merge.py
@@ -2797,3 +2797,16 @@ def test_merge_datetime_different_resolution(tzinfo):
     )
     result = df1.merge(df2, on="t")
     tm.assert_frame_equal(result, expected)
+
+
+def test_merge_multiindex_single_level():
+    # GH #52331
+    df = DataFrame({"col": ["A", "B"]})
+    df2 = DataFrame(
+        data={"b": [100]},
+        index=MultiIndex.from_tuples([("A",), ("C",)], names=["col"]),
+    )
+    expected = DataFrame({"col": ["A", "B"], "b": [100, np.nan]})
+
+    result = df.merge(df2, left_on=["col"], right_index=True, how="left")
+    tm.assert_frame_equal(result, expected)
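
Note on the `_left_join_on_index` change: a minimal, pandas-free sketch of the branch selection before and after this commit, as far as it can be read from the diff. `Index` and `MultiIndex` here are hypothetical stand-ins, not the real pandas classes, and `old_dispatch` / `new_dispatch` are illustrative names that only report which indexer path would be taken.

```python
# Hypothetical stand-ins for pandas' Index and MultiIndex (assumption:
# only the type and the number of levels matter for choosing the branch).
class Index:
    nlevels = 1


class MultiIndex(Index):
    def __init__(self, nlevels: int) -> None:
        self.nlevels = nlevels


def old_dispatch(right_ax: Index, join_keys: list) -> str:
    """Pre-commit logic: the branch is chosen by the number of join keys."""
    if len(join_keys) > 1:
        if not (
            isinstance(right_ax, MultiIndex) and len(join_keys) == right_ax.nlevels
        ):
            raise AssertionError("number of join keys must match levels in right_ax")
        return "multiindex"
    return "single"


def new_dispatch(right_ax: Index, join_keys: list) -> str:
    """Post-commit logic: the branch is chosen by the type of right_ax."""
    if isinstance(right_ax, MultiIndex):
        return "multiindex"
    return "single"


# A MultiIndex with exactly one level, joined on exactly one key.
single_level = MultiIndex(nlevels=1)
keys = ["col"]

old_path = old_dispatch(single_level, keys)
new_path = new_dispatch(single_level, keys)
print(old_path, new_path)  # prints "single multiindex"
```

With one join key, the old check never reached the MultiIndex branch, so a single-level `MultiIndex` was handled as a plain index. The new check routes it to the MultiIndex branch regardless of the key count. In this reading of the diff, the key-count `AssertionError` is also no longer raised inside this function.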
