[BUG] Constructing a DataFrame with a multi-level column does not work #13741

Closed
shwina opened this issue Jul 24, 2023 · 1 comment · Fixed by #13772

shwina commented Jul 24, 2023

import cudf
import numpy as np

columns = cudf.MultiIndex.from_tuples([('A', 'one'), ('A', 'two')])
df = cudf.DataFrame(np.random.randn(2, 2), columns=columns)
df

Output:

   (A, one)  (A, two)
0 -1.004870  1.253556
1  0.176658  0.856762

Expected output:

          A
        one       two
0 -0.926630  0.718061
1 -0.920203 -0.074759
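
For comparison, the same construction in pandas keeps both column levels. This is a minimal sketch, assuming pandas is the reference behavior that cuDF is expected to match here; the variable name `pdf` is only illustrative and the random values will differ from the output above.

```python
import numpy as np
import pandas as pd

# Same two-level column labels as in the cuDF reproducer above.
columns = pd.MultiIndex.from_tuples([("A", "one"), ("A", "two")])
pdf = pd.DataFrame(np.random.randn(2, 2), columns=columns)

# pandas keeps the columns as a two-level MultiIndex
# instead of flattening them into tuple labels.
assert isinstance(pdf.columns, pd.MultiIndex)
assert pdf.columns.nlevels == 2

# Selecting the outer level "A" leaves the inner labels.
print(pdf["A"].columns.tolist())  # ['one', 'two']
```

Checking `columns.nlevels` avoids comparing the random values themselves, so an equivalent check against the cuDF frame would only depend on the column structure.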
shwina added the bug (Something isn't working) and Python (Affects Python cuDF API.) labels Jul 24, 2023

wence- commented Jul 26, 2023

Maybe this?

diff --git a/python/cudf/cudf/core/dataframe.py b/python/cudf/cudf/core/dataframe.py
index 0fe8949090..cc57091539 100644
--- a/python/cudf/cudf/core/dataframe.py
+++ b/python/cudf/cudf/core/dataframe.py
@@ -722,6 +722,8 @@ class DataFrame(IndexedFrame, Serializable, GetAttrGetItemMixin):
 
         if dtype:
             self._data = self.astype(dtype)._data
+        # Fix up
+        self._data.multiindex = isinstance(columns, pd.MultiIndex)
 
     @_cudf_nvtx_annotate
     def _init_from_series_list(self, data, columns, index):

galipremsagar self-assigned this Jul 26, 2023
rapids-bot bot pushed a commit that referenced this issue Jul 28, 2023
This PR preserves column names in various APIs by retaining `self._data._level_names` and also calculating when to preserve the column names.
Fixes: #13741, #13740

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ashwin Srinath (https://github.com/shwina)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #13772