Patch for nightly test&bench #4840
viclafargue commented Jul 29, 2022 (edited)
- Fix for MNMG TSVD (similar issue to cudaErrorContextIsDestroyed in RandomForest)
- Update the NVTX bench helper for the new nsys utility #4826
- MNMG KMeans testing issue: modified the accuracy threshold
- MNMG KNNRegressor testing issue: modified the input used for testing
- LabelEncoder documentation test issue: modified the pandas/cuDF display configuration
- RandomForest testing issue: adjusted the number of estimators to the number of workers
is this PR going into
Just changed it to 22.10
@@ -263,6 +266,7 @@ cdef class ForestInference_impl():
         self.handle = handle
         self.forest_data = forest_variant(<forest32_t> NULL)
         self.shape_str = NULL
+        self.mr = get_current_device_resource()
When is self.mr used?
Some Python models store rmm::device_vector and other RMM objects, or structs containing these objects. These objects only hold a pointer to their respective memory resources. Depending on the order of garbage collection, the memory resource (see memory_resource.pyx) can then be released before the objects that depend on it for CUDA deallocation. This sometimes results in a segfault (visible when benchmarking, for instance). Keeping a reference to the memory resource inside the model prevents its premature release (e.g. in DeviceBuffer).
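The lifetime problem described above can be sketched in pure Python, without RMM or a GPU. This is only an illustration under stated assumptions: `FakeMemoryResource`, `FakeDeviceVector`, `ModelWithoutRef`, `ModelWithRef` and `release_outcome` are hypothetical names, not part of RMM or cuML, and a weak reference stands in for the raw C++ pointer that does not keep the resource alive.

```python
import gc
import weakref


class FakeMemoryResource:
    """Hypothetical stand-in for an RMM memory resource (not the real API)."""

    def deallocate(self, name):
        return f"freed {name}"


class FakeDeviceVector:
    """Stand-in for a C++ object that only holds a raw pointer to its resource.

    The weak reference models that raw pointer: it does not keep the
    resource alive.
    """

    def __init__(self, name, resource):
        self.name = name
        self._resource_ptr = weakref.ref(resource)

    def free(self):
        resource = self._resource_ptr()
        if resource is None:
            # In C++ this would be a dangling pointer and a likely segfault.
            raise RuntimeError("memory resource released before deallocation")
        return resource.deallocate(self.name)


class ModelWithoutRef:
    """Model that stores the vector but no reference to its resource."""

    def __init__(self, resource):
        self.data = FakeDeviceVector("forest", resource)


class ModelWithRef:
    """Model that also keeps the resource alive, as in the diff above."""

    def __init__(self, resource):
        # Mirrors `self.mr = get_current_device_resource()`: the strong
        # reference lives as long as the model does.
        self.mr = resource
        self.data = FakeDeviceVector("forest", resource)


def release_outcome(model_cls):
    """Drop the caller's reference to the resource, then free the vector."""
    resource = FakeMemoryResource()
    model = model_cls(resource)
    del resource  # the caller's last reference goes away
    gc.collect()
    try:
        return model.data.free()
    except RuntimeError as exc:
        return str(exc)
```

With these stand-ins, `release_outcome(ModelWithRef)` frees the vector normally, while `release_outcome(ModelWithoutRef)` reports that the resource was already released, which is the situation that shows up as a segfault with the real objects.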
rerun tests
codeowner approval
@gpucibot merge
- Fix for MNMG TSVD (similar issue to [cudaErrorContextIsDestroyed in RandomForest](rapidsai#2632 (comment)))
- rapidsai#4826
- MNMG Kmeans testing issue : modification of accuracy threshold
- MNMG KNNRegressor testing issue : modification of input for testing
- LabelEncoder documentation test issue : modification of pandas/cuDF display configuration
- RandomForest testing issue : adjust number of estimators to the number of workers

Authors:
- Victor Lafargue (https://github.com/viclafargue)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#4840