[BUG] Linear Regression: Estimator predictions deviate from scikit-learn results for some inputs #4963

csadorf · 2022-11-01T11:46:27Z

Describe the bug

The linear regression predictions of the cuML and scikit-learn implementations deviate for certain input values.

Steps/Code to reproduce bug

Download lg_cuml_sk_deviation_minimal_example_input.zip
Execute:

import numpy as np
from sklearn.linear_model import LinearRegression
from cuml import LinearRegression as cuLinearRegression

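# np.load recognizes the zip container and returns an NpzFile
npzfile = np.load("lg_cuml_sk_deviation_minimal_example_input.zip")
# The arrays are assumed to be stored in this order
X_train, X_test, y_train, y_test = [npzfile[f] for f in npzfile.files]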

estimator = LinearRegression()
estimator.fit(X_train, y_train)
sk_predict = estimator.predict(X_test)

estimator = cuLinearRegression()
estimator.fit(X_train, y_train)
cu_predict = estimator.predict(X_test)

# Sum of absolute differences between the two sets of predictions
absolute_deviation = np.sum(np.abs(cu_predict - sk_predict))
print(absolute_deviation)

Expected behavior

The cuML and scikit-learn predictions should agree to within the expected numerical tolerance.
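To make "below the expected tolerance" concrete, here is a minimal sketch of a per-element comparison. The helper name `prediction_deviation` and the `rtol`/`atol` defaults are assumptions for illustration; the issue does not state which tolerance the test suite uses.

```python
import numpy as np


def prediction_deviation(reference, candidate, rtol=1e-3, atol=1e-3):
    """Summarize how far `candidate` predictions are from `reference`.

    rtol and atol are illustrative assumptions, not values from this issue.
    """
    # Host (NumPy) arrays are assumed; convert device arrays before calling.
    reference = np.asarray(reference, dtype=np.float64).ravel()
    candidate = np.asarray(candidate, dtype=np.float64).ravel()
    if reference.shape != candidate.shape:
        raise ValueError("prediction arrays must have the same number of elements")

    abs_diff = np.abs(candidate - reference)
    # Same criterion as np.isclose: |cand - ref| <= atol + rtol * |ref|
    within = abs_diff <= atol + rtol * np.abs(reference)
    return {
        "sum_abs": float(abs_diff.sum()),
        "max_abs": float(abs_diff.max()) if abs_diff.size else 0.0,
        "n_outside": int((~within).sum()),
        "all_close": bool(within.all()),
    }


if __name__ == "__main__":
    ref = np.array([1.0, 2.0, 3.0])
    cand = np.array([1.0, 2.0005, 3.5])
    print(prediction_deviation(ref, cand))
```

With the reproduction script above, this would be called as `prediction_deviation(sk_predict, cu_predict)`. Reporting the maximum deviation and the number of out-of-tolerance elements alongside the sum shows whether the mismatch comes from a few outliers or is spread across all predictions.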

Environment details (please complete the following information):

Environment location: Docker
Linux Distro/Architecture: Ubuntu 20.04 amd64
GPU Model/Driver: V100-SXM2 495.29.05
CUDA: 11.5
Method of cuDF & cuML install: from source

Additional context

The issue was discovered as part of the extension of tests with hypothesis (#4952).


csadorf added the "bug", "? - Needs Triage", and "Linear Models" labels Nov 1, 2022

csadorf added a commit to csadorf/cuml that referenced this issue Nov 7, 2022

af41aa2: Implement test_linear_regression_model_default_generalized as stop-gap. Until rapidsai#4963 is resolved.

csadorf self-assigned this Dec 14, 2022

csadorf mentioned this issue Dec 15, 2022

Expand hypothesis testing to all linear models #4974

Draft
