[BUG] when using id_vars in .melt(), the string of the column name is broken into characters
#15758
Comments
Looks like the issue here is that the melt API expects As a workaround passing in the
Thanks for the report. I have a PR to fix this issue (#15765), and it should hopefully be fixed in 24.06.
closes #15758
Also fixes an inconsistency with pandas where `var_name` data was always a `Categorical` unlike pandas
Authors:
- Matthew Roeschke (https://github.com/mroeschke)
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #15765
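To make the idea behind the fix concrete, here is a minimal sketch of how a scalar `id_vars` could be normalized to a list before the column check runs. The helper name `normalize_id_vars` is hypothetical and chosen only for illustration; this is not the code from #15765, just one plausible way to accept both a single column name and a list of names.

```python
from collections.abc import Iterable


def normalize_id_vars(id_vars):
    """Hypothetical helper (not cuDF's implementation): return id_vars as a list."""
    if id_vars is None:
        return []
    # A string is iterable, but it names ONE column, so wrap it instead of
    # iterating over its characters. Non-iterable scalars (e.g. an integer
    # column label) are wrapped the same way.
    if isinstance(id_vars, (str, bytes)) or not isinstance(id_vars, Iterable):
        return [id_vars]
    return list(id_vars)


print(normalize_id_vars("index"))     # ['index']
print(normalize_id_vars(["a", "b"]))  # ['a', 'b']
print(normalize_id_vars(0))           # [0]
```

Checking for `str` before checking for `Iterable` is the important design choice here, since a plain iterability test would treat `"index"` as five separate names.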
from pandas import DataFrame
data = yf.download(tickers, start=start_date, end=end_date, progress=False)
# reset index to bring Date into the columns for the melt function
data2 = DataFrame(data).reset_index()
print(data2.columns)
data_types = data2.dtypes
data_melted = data2.melt(id_vars='Date')
data_melted

Cell output:
File c:\Users\Oluwanifemi.Amao\Documents\demo-project\.venv\Lib\site-packages\pandas\core\frame.py:9942, in DataFrame.melt(self, id_vars, value_vars, var_name, value_name, col_level, ignore_index)
File c:\Users\Oluwanifemi.Amao\Documents\demo-project\.venv\Lib\site-packages\pandas\core\reshape\melt.py:74, in melt(frame, id_vars, value_vars, var_name, value_name, col_level, ignore_index)
KeyError: "The following id_vars or value_vars are not present in the DataFrame: ['Date']"
Describe the bug
When trying to create a melted dataframe with id_vars set to a single column name, for example "index", I get the following error:
KeyError: "The following 'id_vars' are not present in the DataFrame: ['e', 'x', 'n', 'd', 'i']"
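The characters in that error message match what happens when a bare string is iterated as if it were a list of column names. The sketch below reproduces the effect in plain Python, without cuDF; `missing_id_vars` is a hypothetical stand-in for the column check and is an assumption for illustration, not the library's actual code.

```python
def missing_id_vars(id_vars, columns):
    """Hypothetical check: list every requested name that is not a column."""
    # Iterating a str yields its characters, so "index" is treated as the
    # five names 'i', 'n', 'd', 'e', 'x' rather than one column name.
    return [name for name in id_vars if name not in columns]


columns = ["index", "A", "B"]

as_string = missing_id_vars("index", columns)    # bare string
as_list = missing_id_vars(["index"], columns)    # string wrapped in a list

print(as_string)  # ['i', 'n', 'd', 'e', 'x']
print(as_list)    # []
```

Wrapping the column name in a list, as in the second call, avoids the per-character lookup.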
Steps/Code to reproduce bug
Outputs:
Expected behavior
Outputs:
Environment overview (please complete the following information)
Environment details
Please run and paste the output of the cudf/print_env.sh script here, to gather any other relevant environment details
Additional context
I also tried giving it a numerical column id, a single character, and a dataframe column for kicks. All failed with expected or similar errors. While it doesn't fail when using cudf.pandas, the fallback to pandas dramatically slows down cudf.pandas, to the point where it negates many of the speedups in your workflow.