[REVIEW] Fix Stochastic Gradient Descent Example #3136

tylerjthomas9 · 2020-11-13T18:22:28Z

The example currently in the docs does not run: dtype, penalty, lrate, and loss are never defined. This new version sets the cumlSGD parameters to their default values explicitly, and uses the same dtype as the Mini Batch SGD Regression example for pred_data['col1'] and pred_data['col2']. Running the example also gave slightly different output values, so those were updated as well.

Old example:

import numpy as np
import cudf
from cuml.solvers import SGD as cumlSGD
X = cudf.DataFrame()
X['col1'] = np.array([1,1,2,2], dtype = np.float32)
X['col2'] = np.array([1,2,2,3], dtype = np.float32)
y = cudf.Series(np.array([1, 1, 2, 2], dtype=np.float32))
pred_data = cudf.DataFrame()
pred_data['col1'] = np.asarray([3, 2], dtype=dtype)
pred_data['col2'] = np.asarray([5, 5], dtype=dtype)
cu_sgd = cumlSGD(learning_rate=lrate, eta0=0.005, epochs=2000,
                fit_intercept=True, batch_size=2,
                tol=0.0, penalty=penalty, loss=loss)
cu_sgd.fit(X, y)
cu_pred = cu_sgd.predict(pred_data).to_array()
print(" cuML intercept : ", cu_sgd.intercept_)
print(" cuML coef : ", cu_sgd.coef_)
print("cuML predictions : ", cu_pred)

New example:

import numpy as np
import cudf
from cuml.solvers import SGD as cumlSGD
X = cudf.DataFrame()
X['col1'] = np.array([1,1,2,2], dtype=np.float32)
X['col2'] = np.array([1,2,2,3], dtype=np.float32)
y = cudf.Series(np.array([1, 1, 2, 2], dtype=np.float32))
pred_data = cudf.DataFrame()
pred_data['col1'] = np.asarray([3, 2], dtype=np.float32)
pred_data['col2'] = np.asarray([5, 5], dtype=np.float32)
cu_sgd = cumlSGD(learning_rate='constant', eta0=0.005, epochs=2000,
                fit_intercept=True, batch_size=2,
                tol=0.0, penalty='none', loss='squared_loss')
cu_sgd.fit(X, y)
cu_pred = cu_sgd.predict(pred_data).to_array()
print(" cuML intercept : ", cu_sgd.intercept_)
print(" cuML coef : ", cu_sgd.coef_)
print("cuML predictions : ", cu_pred)
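For readers without a GPU, the following is a rough NumPy sketch of what the settings in the new example ask for: mini-batch SGD on a squared loss with a constant learning rate and no penalty. It is not cuML's implementation. The function name minibatch_sgd, the per-epoch shuffling, the zero initialization, and the averaging of the gradient over each batch are all assumptions made for illustration, so the numbers it produces will not match cuML's output exactly.

```python
import numpy as np


def minibatch_sgd(X, y, eta0=0.005, epochs=2000, batch_size=2, seed=0):
    """Illustrative mini-batch SGD for least squares (hypothetical helper).

    Uses a constant learning rate (eta0), no penalty, and fits an intercept.
    """
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    coef = np.zeros(n_features)
    intercept = 0.0

    for _ in range(epochs):
        # Assumption: visit the samples in a fresh random order each epoch.
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            # Residual of the current linear model on this batch.
            residual = X[idx] @ coef + intercept - y[idx]
            # Gradient of 0.5 * mean(residual**2) w.r.t. coef and intercept.
            coef -= eta0 * (X[idx].T @ residual) / len(idx)
            intercept -= eta0 * residual.mean()

    return coef, intercept


# Same toy data as the docs example, as plain NumPy arrays.
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=np.float32)
y = np.array([1, 1, 2, 2], dtype=np.float32)
pred_data = np.array([[3, 5], [2, 5]], dtype=np.float32)

coef, intercept = minibatch_sgd(X, y)
pred = pred_data @ coef + intercept

print("sketch intercept   :", intercept)
print("sketch coef        :", coef)
print("sketch predictions :", pred)
```

Because y equals col1 exactly in this toy data, the least-squares optimum is a coefficient of 1 on col1, 0 on col2, and an intercept of 0. With only 2000 epochs at eta0=0.005 the sketch approaches that optimum but is not expected to reach it exactly.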

GPUtester · 2020-11-13T18:22:30Z

Can one of the admins verify this patch?

JohnZed

Looks good! Thank you for the fix!

JohnZed · 2020-11-14T00:04:59Z

ok to test

GPUtester · 2020-11-14T00:05:24Z

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

codecov-io · 2020-11-14T02:58:17Z

Codecov Report

Merging #3136 (742f071) into branch-0.17 (77da916) will decrease coverage by 0.39%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.17    #3136      +/-   ##
===============================================
- Coverage        70.57%   70.17%   -0.40%     
===============================================
  Files              197      197              
  Lines            15461    15670     +209     
===============================================
+ Hits             10911    10997      +86     
- Misses            4550     4673     +123

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cuml/solvers/sgd.pyx | `92.93% <ø> (ø)` | |
| ...on/cuml/_thirdparty/sklearn/preprocessing/_data.py | `58.48% <0.00%> (-4.97%)` | ⬇️ |
| python/cuml/thirdparty_adapters/adapters.py | `88.88% <0.00%> (+0.44%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

dantegd · 2020-11-15T19:24:44Z

@tylerjthomas9 thanks for the PR! The only thing missing is an entry in the changelog so that CI runs and we can merge it. Thanks!

tylerjthomas9 · 2020-11-15T21:10:11Z

> @tylerjthomas9 thanks for the PR! The only thing missing is an entry in the changelog so that CI runs and we can merge it. Thanks!

Oops, completely forgot about that. Just added the changelog entry.

dantegd · 2020-11-16T16:32:48Z

@tylerjthomas9 the changelog entry is not showing up in the GitHub diff, so CI wasn't triggered. I think you merged the changelog entry into the patch-2 branch of your cuML fork, while this PR is based on patch-1, so the entry needs to be on the patch-1 branch.

dantegd · 2020-11-17T17:33:15Z

rerun tests

tylerjthomas9 requested a review from a team as a code owner November 13, 2020 18:22

JohnZed approved these changes Nov 14, 2020

JohnZed changed the title ~~Fix Stochastic Gradient Descent Example~~ [REVIEW] Fix Stochastic Gradient Descent Example Nov 14, 2020

JohnZed added the "5 - Ready to Merge" label (Testing and reviews complete, ready to merge) Nov 14, 2020

Added PR #3136 to 0.17 Bug Fixes

2536cd1

Merge pull request #1 from tylerjthomas9/patch-2

742f071

cjnolet merged commit 238a8de into rapidsai:branch-0.17 Nov 17, 2020

tylerjthomas9 deleted the patch-1 branch November 18, 2020 03:12
