[BUG] Getting error when jointly encoding single-hot and multi-hot categ columns #1639
Comments
This issue is blocking the addition of a filtering step to the multi-stage example notebook/RecSys demo.
I am currently getting the following behavior:

In [1]: import cudf
   ...: import nvtabular as nvt
   ...: train = cudf.DataFrame(
   ...:     {
   ...:         "C1": [1, 3, 3, 4, 3, 1] * 2,
   ...:         "C2": [10, 11, 12, 10, 11, 12] * 2,
   ...:         "C3": [[1, 3], [1, 5], [4, 2, 1], [1, 2, 3], [1], [3, 4]] * 2,
   ...:     }
   ...: )
   ...:
   ...: cat_features = [["C1", "C3"], "C2"] >> nvt.ops.Categorify()
   ...:
   ...: train_dataset = nvt.Dataset(train)
   ...:
   ...: workflow = nvt.Workflow(cat_features)
   ...: workflow.fit_transform(train_dataset).compute()
Out[1]:
    C2  C1         C3
0    1   1     [1, 2]
1    2   2     [1, 5]
2    3   2  [3, 4, 1]
3    1   3  [1, 4, 2]
4    2   2        [1]
5    3   1     [2, 3]
6    1   1     [1, 2]
7    2   2     [1, 5]
8    3   2  [3, 4, 1]
9    1   3  [1, 4, 2]
10   2   2        [1]
11   3   1     [2, 3]

I will close the issue, because this result looks correct to me. However, feel free to re-open if there is still a bug that I am missing (cc @rnyak @karlhigley)
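For anyone checking why this output looks correct: the ids in C1 and C3 are consistent with a single vocabulary shared by both columns. Below is a minimal pure-Python sketch of that idea, not NVTabular's actual implementation. The helper names `fit_joint_vocab` and `encode` are hypothetical, and ranking values by descending combined frequency (ties broken by value, ids starting at 1) is an assumption that happens to reproduce the ids shown above.

```python
from collections import Counter


def fit_joint_vocab(columns):
    """Build one value -> id mapping shared by several columns.

    Assumption: ids are assigned by descending combined frequency,
    ties broken by ascending value, starting at 1.
    """
    counts = Counter()
    for col in columns:
        for item in col:
            if isinstance(item, list):
                counts.update(item)  # multi-hot row: count every element
            else:
                counts[item] += 1    # single-hot row: count the scalar
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    return {value: idx for idx, value in enumerate(ranked, start=1)}


def encode(col, vocab):
    """Map scalars or lists of scalars through the shared vocabulary."""
    return [
        [vocab[v] for v in item] if isinstance(item, list) else vocab[item]
        for item in col
    ]


c1 = [1, 3, 3, 4, 3, 1] * 2
c3 = [[1, 3], [1, 5], [4, 2, 1], [1, 2, 3], [1], [3, 4]] * 2

vocab = fit_joint_vocab([c1, c3])
c1_encoded = encode(c1, vocab)
c3_encoded = encode(c3, vocab)

print(vocab)           # {1: 1, 3: 2, 4: 3, 2: 4, 5: 5}
print(c1_encoded[:6])  # [1, 2, 2, 3, 2, 1]
print(c3_encoded[:6])  # [[1, 2], [1, 5], [3, 4, 1], [1, 4, 2], [1], [2, 3]]
```

Under that assumption the raw value 3 maps to id 2 in both C1 and C3, which is the point of grouping the two columns as `["C1", "C3"]`.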
For future reference, this was fixed by #1685 and is available in the 22.10 Merlin release.
I've opened an issue in cudf here rapidsai/cudf#12083 in case the
Describe the bug
I would like to jointly encode single and multi-hot categorical columns but I am getting the following error:
Steps/Code to reproduce bug
You can run the code below to repro the error:
Expected behavior
A clear and concise description of what you expected to happen.
Environment details (please complete the following information):
docker pull & docker run commands used: I am using the merlin-tensorflow:22.06 container with the latest branches pulled from all libraries.