[BUG] Joining MultiIndex with Int64Index does not find common columns #9823

Closed
Tracked by #9815
bdice opened this issue Dec 2, 2021 · 0 comments · Fixed by #9830
Labels: bug (Something isn't working), Python (Affects Python cuDF API.)

bdice commented Dec 2, 2021

Describe the bug
Joining MultiIndex with Int64Index does not find common columns.

Steps/Code to reproduce bug
This doctest (#9815) fails:

>>> import cudf
>>> lhs = cudf.DataFrame(
... {"a":[2, 3, 1], "b":[3, 4, 2]}).set_index(['a', 'b']
... ).index
>>> lhs
MultiIndex([(2, 3),
            (3, 4),
            (1, 2)],
           names=['a', 'b'])
>>> rhs = cudf.DataFrame({"a":[1, 4, 3]}).set_index('a').index
>>> rhs
Int64Index([1, 4, 3], dtype='int64', name='a')
>>> lhs.join(rhs, how='inner')
MultiIndex([(3, 4),
            (1, 2)],
           names=['a', 'b'])

The join does not detect the name 'a' that the MultiIndex lhs and the Int64Index rhs have in common, and the following exception is raised:

Failed example:
    lhs.join(rhs, how='inner')
Exception raised:
    Traceback (most recent call last):
      File "/home/bdice/code/compose/etc/conda/cuda_11.5/envs/rapids/lib/python3.8/doctest.py", line 1336, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest BaseIndex.join[5]>", line 1, in <module>
        lhs.join(rhs, how='inner')
      File "/home/bdice/code/cudf/python/cudf/cudf/core/_base_index.py", line 1166, in join
        output = lhs._merge(rhs, how=how, on=on, sort=sort)
      File "/home/bdice/code/cudf/python/cudf/cudf/core/frame.py", line 3819, in _merge
        return merge_cls(
      File "/home/bdice/code/cudf/python/cudf/cudf/core/join/join.py", line 83, in __init__
        self._validate_merge_params(
      File "/home/bdice/code/cudf/python/cudf/cudf/core/join/join.py", line 413, in _validate_merge_params
        raise ValueError("No common columns to perform merge on")
    ValueError: No common columns to perform merge on
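
To make the failing check concrete, here is a minimal pure-Python sketch of what "finding common columns" means for this example. The helper `common_level_names` is hypothetical and is not cuDF's implementation; it only illustrates that the names of the MultiIndex (`['a', 'b']`) and the name of the single-level index (`'a'`) do overlap, so a name-based lookup should not come back empty.

```python
# Hypothetical helper for illustration only; not cuDF's actual code.
def common_level_names(left_names, right_names):
    """Return the names present in both indexes, in left-hand order."""
    right = set(right_names)
    return [name for name in left_names if name is not None and name in right]


lhs_names = ["a", "b"]  # level names of the MultiIndex in the report
rhs_names = ["a"]       # a single-level index carries exactly one name

shared = common_level_names(lhs_names, rhs_names)
if not shared:
    # Same message as in the traceback above; not reached for this input.
    raise ValueError("No common columns to perform merge on")
print(shared)
```

With the inputs from the report, the overlap is the single name `a`, which is the key the join is expected to use.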

Expected behavior
pandas produces the output that the docstring expects:

>>> import pandas as pd
>>> lhs = pd.DataFrame({"a": [2, 3, 1], "b": [3, 4, 2]}).set_index(["a", "b"]).index
>>> lhs
MultiIndex([(2, 3),
            (3, 4),
            (1, 2)],
           names=['a', 'b'])
>>> rhs = pd.DataFrame({"a": [1, 4, 3]}).set_index("a").index
>>> rhs
Int64Index([1, 4, 3], dtype='int64', name='a')
>>> lhs.join(rhs)
MultiIndex([(2, 3),
            (3, 4),
            (1, 2)],
           names=['a', 'b'])
>>> lhs.join(rhs, how="inner")
MultiIndex([(3, 4),
            (1, 2)],
           names=['a', 'b'])
bdice added the bug and Python labels Dec 2, 2021
bdice mentioned this issue Dec 2, 2021
shwina self-assigned this Dec 2, 2021
shwina removed their assignment Dec 2, 2021
rapids-bot closed this as completed in #9830 Dec 3, 2021
rapids-bot pushed a commit that referenced this issue Dec 3, 2021
rapids-bot pushed a commit that referenced this issue Jan 15, 2022
This PR adds doctests and resolves #9513. Several issues were found by running doctests that have now been resolved:

- [x] #9821
- [x] #9822
- [x] #9823
- [x] #9824
- [x] #9825
- [x] #9826
- [x] #9827
- [x] #9828 (workaround by deleting doctests)
- [x] #9829

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #9815