[BUG] Sporadic OLS pytest fail in test_linear_regression_model_default #1739
dantegd added the "bug" and "? - Needs Triage" labels on Feb 24, 2020
Have not seen this in ages, believed to be closed out.
wphicks changed the title from "[BUG] Sporadic OLS pytest fail in CUDA 10.2" to "[BUG] Sporadic OLS pytest fail" on Feb 4, 2021
Observed on at least CUDA 10.1 and 10.2. @dantegd believes this may be an issue on Volta/Pascal but not on Turing/Ampere.
wphicks changed the title from "[BUG] Sporadic OLS pytest fail" to "[BUG] Sporadic OLS pytest fail in test_linear_regression_model_default" on Feb 4, 2021
Now reproduced on Volta/CUDA 11 and with both
Describe the bug
I have seen it only twice so far, both times in CI environments as far as I remember, but the single-GPU OLS test test_linear_regression_model_default has had a very odd failure in CUDA 10.2.
Steps/Code to reproduce bug
Run the OLS pytests (python/cuml/test/test_linear_regression.py). It may be necessary to run them multiple times, potentially in Docker containers that reproduce the CI environment. The error looks like this:
Environment details (please complete the following information):
Additional context
Link to example log: https://gpuci.gpuopenanalytics.com/job/docker/job/tests/job/docker-test-cuml/283/CUDA_VERSION=10.2,LINUX_VERSION=centos7,PYTHON_VERSION=3.7/testReport/junit/cuml.test/test_linear_model/test_linear_regression_model_default_float64_/
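Since the failure is sporadic, the "run multiple times" step above can be scripted. The following is only a rough sketch, not part of the cuML test tooling: the helper names run_until_failure and ols_pytest_command and the max_runs default are hypothetical, and the test path and test name are the ones quoted in this issue. It reruns the test until the first non-zero exit code and returns the captured output of that run.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=50):
    """Run cmd repeatedly and stop at the first failing run.

    Returns (attempt_number, combined_output) for the first run that exits
    with a non-zero code, or None if all max_runs runs succeed.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return attempt, result.stdout + result.stderr
    return None


def ols_pytest_command():
    """Build the pytest invocation for the test named in this issue.

    Assumption: the command is run from the repository root, so that the
    relative test path quoted in the issue resolves.
    """
    return [
        sys.executable, "-m", "pytest",
        "python/cuml/test/test_linear_regression.py",
        # -k selects every parametrization whose name contains this string
        "-k", "test_linear_regression_model_default",
        # -x stops pytest at the first failing test case
        "-x",
    ]
```

Calling run_until_failure(ols_pytest_command()) inside a container that mirrors the CI environment should return either None or the attempt number and log of the first failing run, which can then be attached here.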