[BUG] Missing numRows in the ColumnarBatch created in GpuBringBackToHost #11364

Closed
jihoonson opened this issue Aug 19, 2024 · 4 comments · Fixed by #11365
Labels: bug (Something isn't working)

Comments

@jihoonson
Collaborator

Describe the bug
GpuBringBackToHost creates the ColumnarBatch without setting numRows, so numRows stays at its default of 0. This can confuse any logic that relies on the numRows metric during query processing, such as the adaptive query optimizer. The returned ColumnarBatch is always treated as if it were an empty batch.

Expected behavior
The right number of rows should be retrieved from the hostColumns and then set for the ColumnarBatch before it is returned.
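
A minimal sketch of the bug and the expected fix, for illustration only. `HostColumn` and `Batch` are hypothetical stand-ins, not the real Spark or spark-rapids classes; they only mimic the behavior described above, where a batch built without a row count reports 0 rows. The helper names `buggy` and `fixed` are also made up for this example.

```java
public class NumRowsSketch {
    // Hypothetical stand-in for a host-side column; it only knows its row count.
    static final class HostColumn {
        private final int[] values;
        HostColumn(int... values) { this.values = values; }
        int getRowCount() { return values.length; }
    }

    // Hypothetical stand-in for a columnar batch. As described in the issue,
    // a batch created without a row count reports 0 rows.
    static final class Batch {
        private final HostColumn[] columns;
        private int numRows; // defaults to 0

        Batch(HostColumn[] columns) { this.columns = columns; }
        Batch(HostColumn[] columns, int numRows) {
            this.columns = columns;
            this.numRows = numRows;
        }
        void setNumRows(int numRows) { this.numRows = numRows; }
        int numRows() { return numRows; }
        int numCols() { return columns.length; }
    }

    // Buggy pattern: the row count is never set, so the batch looks empty.
    static Batch buggy(HostColumn[] hostColumns) {
        return new Batch(hostColumns);
    }

    // Expected pattern: read the row count from the host columns and pass it along.
    // With zero columns there is nothing to read it from, so this sketch falls
    // back to 0; a real fix would need the count from another source in that case.
    static Batch fixed(HostColumn[] hostColumns) {
        int numRows = hostColumns.length == 0 ? 0 : hostColumns[0].getRowCount();
        return new Batch(hostColumns, numRows);
    }

    public static void main(String[] args) {
        HostColumn[] cols = { new HostColumn(1, 2, 3), new HostColumn(4, 5, 6) };
        System.out.println(buggy(cols).numRows()); // prints 0
        System.out.println(fixed(cols).numRows()); // prints 3
    }
}
```

Any consumer that checks `numRows()` to decide whether a batch is empty would skip the data in the first case even though both columns hold three rows.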

@jihoonson added the bug (Something isn't working) and ? - Needs Triage (Need team to review and classify) labels on Aug 19, 2024
@jihoonson self-assigned this on Aug 19, 2024
@revans2
Collaborator

revans2 commented Aug 19, 2024

@jihoonson great catch. Could you also check the other places where we create a ColumnarBatch to be sure there are no other issues like this? The API is kind of bad because you can create a ColumnarBatch and then set the number of rows afterwards.

@jihoonson
Collaborator Author

@revans2 I have checked other places, and did not see any. I agree that this API does not have the best design. I wonder if we should do something about it, such as adding safe wrappers, e.g., rapids.ColumnarBatches.empty() and rapids.ColumnarBatches.create(ColumnVector[] columns, int numRows), and forbidding unsafe APIs.
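
A rough sketch of what such safe wrappers could look like, under stated assumptions. The `ColumnVector` and `ColumnarBatch` types below are hypothetical stand-ins so the example compiles on its own, and the `ColumnarBatches` factory only follows the two method shapes named above; none of this is the actual spark-rapids code.

```java
import java.util.Objects;

public class SafeBatchSketch {
    // Hypothetical stand-in for a column vector; contents are irrelevant here.
    static final class ColumnVector { }

    // Hypothetical stand-in batch. Its only constructor is private and requires
    // a row count, so callers must go through the factory below.
    static final class ColumnarBatch {
        private final ColumnVector[] columns;
        private final int numRows;

        private ColumnarBatch(ColumnVector[] columns, int numRows) {
            this.columns = columns;
            this.numRows = numRows;
        }
        int numCols() { return columns.length; }
        int numRows() { return numRows; }
    }

    // Factory in the shape proposed in the comment: every batch is created with
    // an explicit row count, or is explicitly empty.
    static final class ColumnarBatches {
        private ColumnarBatches() { }

        static ColumnarBatch empty() {
            return new ColumnarBatch(new ColumnVector[0], 0);
        }

        static ColumnarBatch create(ColumnVector[] columns, int numRows) {
            Objects.requireNonNull(columns, "columns");
            if (numRows < 0) {
                throw new IllegalArgumentException("numRows must be >= 0, got " + numRows);
            }
            return new ColumnarBatch(columns, numRows);
        }
    }

    public static void main(String[] args) {
        ColumnVector[] cols = { new ColumnVector() };
        System.out.println(ColumnarBatches.empty().numRows());         // prints 0
        System.out.println(ColumnarBatches.create(cols, 5).numRows()); // prints 5
    }
}
```

In this stand-in the private constructor is what keeps callers from skipping the row count. A wrapper around an existing class with public constructors cannot hide them, so some separate check on direct constructor calls would be needed to get the same guarantee.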

@jihoonson
Collaborator Author

Regarding forbidding the use of unsafe APIs, I meant that we could use the forbidden API checker to catch them during the build.

@jihoonson
Collaborator Author

Filed #11369.
