Describe the bug
In pandas, assigning a value through `loc` with a column name that does not exist yet inserts a new column into the DataFrame. In cuDF, the same assignment raises a `KeyError`.
Steps/Code to reproduce bug
>>> import pandas as pd
>>> x = pd.DataFrame()
>>> x.loc[:, "a"] = [1,2,3]
>>> x
   a
0  1
1  2
2  3
>>> import cudf
>>> x = cudf.DataFrame()
>>> x.loc[:, "a"] = [1,2,3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/site-packages/cudf/core/indexing.py", line 186, in __setitem__
    return self._setitem_tuple_arg(key, value)
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/site-packages/cudf/core/indexing.py", line 370, in _setitem_tuple_arg
    columns = self._get_column_selection(key[1])
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/site-packages/cudf/core/indexing.py", line 376, in _get_column_selection
    return self._df._get_columns_by_label(arg)
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/site-packages/cudf/core/frame.py", line 475, in _get_columns_by_label
    new_data = self._data.select_by_label(labels)
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/site-packages/cudf/core/column_accessor.py", line 219, in select_by_label
    return self._select_by_label_grouped(key)
  File "/home/nfs/wonchanl/anaconda3/envs/rapids-tpcx-bb/lib/python3.7/site-packages/cudf/core/column_accessor.py", line 267, in _select_by_label_grouped
    result = self._grouped_data[key]
KeyError: 'a'
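A possible workaround on affected versions is to create the column with plain column assignment instead of `loc`. The sketch below is an illustration, not part of the original report; the `xdf` alias and the fallback to pandas are assumptions added so the snippet still runs on a machine without cuDF or a usable GPU.

```python
# Assumption: fall back to pandas when cuDF cannot be imported,
# so this sketch stays runnable without a GPU.
try:
    import cudf as xdf
except Exception:
    import pandas as xdf

x = xdf.DataFrame()

# Plain column assignment creates the column; it does not go through
# the loc indexer that raised the KeyError above.
x["a"] = [1, 2, 3]

print(x)
```

This only covers assigning a whole column. Setting a subset of rows in a new column still needs the `loc` support requested here.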
Closes #7628
This PR adds support for setting a column through `loc` when the given column name does not yet exist in the dataframe. The row selector can be a single row label, a collection of row labels, or a slice. The value to set can be a column-like object or a scalar. For example, you can now do this:
```
>>> import cudf
>>> x = cudf.DataFrame()
>>> x.loc[:, "a"] = [1, 2, 3]  # set a new column from a list
>>> x
   a
0  1
1  2
2  3
>>> x.loc[[1, 2], "b"] = ["abc", "cba"]  # set part of a new column from a list
>>> x
   a     b
0  1  <NA>
1  2   abc
2  3   cba
>>> x.loc[:, "c"] = 5  # set a new column to a scalar
>>> x
   a     b  c
0  1  <NA>  5
1  2   abc  5
2  3   cba  5
```
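For comparison, the pandas behaviour that the issue uses as its reference can be checked with a short script. This is a minimal sketch using pandas only, with numeric values chosen for illustration. It shows that rows left out of the selector end up missing in the new column, and it makes no claim about how cuDF implements the feature.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# Assign to a column that does not exist yet, for a subset of rows only.
# Row 0 is not selected, so it is left as a missing value.
df.loc[[1, 2], "b"] = [10, 20]

# Assign a scalar to another new column for all rows.
df.loc[:, "c"] = 5

print(df)
```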
Authors:
- Michael Wang (https://github.com/isVoid)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #8012