[REVIEW] Fix Stochastic Gradient Descent Example #3136

Merged
merged 3 commits into from
Nov 17, 2020

Conversation

@tylerjthomas9
Contributor

tylerjthomas9 commented Nov 13, 2020

The example currently in the docs does not run: dtype, penalty, lrate, and loss are not defined. This new version sets the default values for the parameters of cumlSGD, and copies Mini Batch SGD Regression's dtype for pred_data['col1'] and pred_data['col2']. When running this example, I also got slightly different output values, so those were updated as well.

Old example:

import numpy as np
import cudf
from cuml.solvers import SGD as cumlSGD
X = cudf.DataFrame()
X['col1'] = np.array([1,1,2,2], dtype = np.float32)
X['col2'] = np.array([1,2,2,3], dtype = np.float32)
y = cudf.Series(np.array([1, 1, 2, 2], dtype=np.float32))
pred_data = cudf.DataFrame()
pred_data['col1'] = np.asarray([3, 2], dtype=dtype)
pred_data['col2'] = np.asarray([5, 5], dtype=dtype)
cu_sgd = cumlSGD(learning_rate=lrate, eta0=0.005, epochs=2000,
                fit_intercept=True, batch_size=2,
                tol=0.0, penalty=penalty, loss=loss)
cu_sgd.fit(X, y)
cu_pred = cu_sgd.predict(pred_data).to_array()
print(" cuML intercept : ", cu_sgd.intercept_)
print(" cuML coef : ", cu_sgd.coef_)
print("cuML predictions : ", cu_pred)

New example:

import numpy as np
import cudf
from cuml.solvers import SGD as cumlSGD
X = cudf.DataFrame()
X['col1'] = np.array([1,1,2,2], dtype=np.float32)
X['col2'] = np.array([1,2,2,3], dtype=np.float32)
y = cudf.Series(np.array([1, 1, 2, 2], dtype=np.float32))
pred_data = cudf.DataFrame()
pred_data['col1'] = np.asarray([3, 2], dtype=np.float32)
pred_data['col2'] = np.asarray([5, 5], dtype=np.float32)
cu_sgd = cumlSGD(learning_rate='constant', eta0=0.005, epochs=2000,
                fit_intercept=True, batch_size=2,
                tol=0.0, penalty='none', loss='squared_loss')
cu_sgd.fit(X, y)
cu_pred = cu_sgd.predict(pred_data).to_array()
print(" cuML intercept : ", cu_sgd.intercept_)
print(" cuML coef : ", cu_sgd.coef_)
print("cuML predictions : ", cu_pred)
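
For anyone without a GPU at hand, here is a rough CPU-only sketch of what the example's settings describe: mini-batch gradient descent on squared loss with a constant learning rate and no penalty. This is not cuML's implementation. The function name minibatch_sgd is hypothetical, and the details (zero initialization, no shuffling, float64 arithmetic) are assumptions of mine, so the numbers it prints should not be expected to match the cuML output.

```python
import numpy as np


def minibatch_sgd(X, y, eta0=0.005, epochs=2000, batch_size=2):
    """Hypothetical helper: mini-batch SGD on squared loss, constant step size.

    Assumptions (not taken from cuML): weights start at zero, batches are
    visited in order without shuffling, and everything runs in float64.
    """
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    n_samples, n_features = X.shape
    coef = np.zeros(n_features)
    intercept = 0.0
    for _ in range(epochs):
        for start in range(0, n_samples, batch_size):
            Xb = X[start:start + batch_size]
            yb = y[start:start + batch_size]
            residual = Xb @ coef + intercept - yb
            # Gradient of the batch mean squared error w.r.t. coef and intercept
            coef -= eta0 * 2.0 * (Xb.T @ residual) / len(yb)
            intercept -= eta0 * 2.0 * residual.mean()
    return coef, intercept


# Same toy data as the docs example, as plain NumPy arrays
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=np.float32)
y = np.array([1, 1, 2, 2], dtype=np.float32)
pred_data = np.array([[3, 5], [2, 5]], dtype=np.float32)

coef, intercept = minibatch_sgd(X, y)
pred = pred_data @ coef + intercept

print("intercept   :", intercept)
print("coef        :", coef)
print("predictions :", pred)
```

Since y equals col1 exactly in this toy data, the least-squares solution is coef = [1, 0] with intercept 0. With a step size of 0.005 the sketch only moves part of the way there in 2000 epochs, which may be one reason the printed values in the docs are approximate rather than exact.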

@tylerjthomas9 tylerjthomas9 requested a review from a team as a code owner November 13, 2020 18:22
@GPUtester
Contributor

Can one of the admins verify this patch?

Contributor

@JohnZed JohnZed left a comment

Looks good! Thank you for the fix!

@JohnZed
Contributor

JohnZed commented Nov 14, 2020

ok to test

@JohnZed JohnZed changed the title from "Fix Stochastic Gradient Descent Example" to "[REVIEW] Fix Stochastic Gradient Descent Example" Nov 14, 2020
@JohnZed JohnZed added the "5 - Ready to Merge" label (Testing and reviews complete, ready to merge) Nov 14, 2020
@GPUtester
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@codecov-io

codecov-io commented Nov 14, 2020

Codecov Report

Merging #3136 (742f071) into branch-0.17 (77da916) will decrease coverage by 0.39%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.17    #3136      +/-   ##
===============================================
- Coverage        70.57%   70.17%   -0.40%     
===============================================
  Files              197      197              
  Lines            15461    15670     +209     
===============================================
+ Hits             10911    10997      +86     
- Misses            4550     4673     +123     
Impacted Files                                           Coverage Δ
python/cuml/solvers/sgd.pyx                              92.93% <ø> (ø)
...on/cuml/_thirdparty/sklearn/preprocessing/_data.py    58.48% <0.00%> (-4.97%) ⬇️
python/cuml/thirdparty_adapters/adapters.py              88.88% <0.00%> (+0.44%) ⬆️
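
A note on the numbers in this report: the summary line says coverage decreases by 0.39%, while the diff table shows -0.40%. The sketch below recomputes both from the Hits and Lines rows. It assumes the table truncates each percentage to two decimals before subtracting. That is a guess which happens to reproduce the displayed figures, not documented Codecov behavior.

```python
import math

# Hits and Lines rows from the coverage diff above
hits_before, lines_before = 10911, 15461
hits_after, lines_after = 10997, 15670

cov_before = 100.0 * hits_before / lines_before
cov_after = 100.0 * hits_after / lines_after

# Drop computed from the unrounded percentages
exact_drop = cov_before - cov_after

# Assumption: the table truncates (not rounds) to hundredths of a percent.
# Working in integer hundredths avoids floating-point noise in the subtraction.
shown_before = math.floor(cov_before * 100)
shown_after = math.floor(cov_after * 100)
shown_drop = (shown_before - shown_after) / 100

print(f"coverage before / after   : {shown_before / 100:.2f}% / {shown_after / 100:.2f}%")
print(f"drop from exact values    : {exact_drop:.2f}%")
print(f"drop from truncated values: {shown_drop:.2f}%")
```

Under that assumption the two figures are consistent with each other. The drop comes from the 209 added lines, of which only 86 are hit.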

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 77da916...742f071. Read the comment docs.

@dantegd
Member

dantegd commented Nov 15, 2020

@tylerjthomas9 thanks for the PR! The only thing missing is an entry in the changelog so that CI runs and we can merge it. Thanks!

@tylerjthomas9
Contributor Author

> @tylerjthomas9 thanks for the PR! The only thing missing is an entry in the changelog so that CI runs and we can merge it. Thanks!

Oops, completely forgot about that. Just added the changelog entry.

@dantegd
Member

dantegd commented Nov 16, 2020

@tylerjthomas9 the changelog entry is not showing in the GitHub diff (and therefore CI wasn't triggered). I think what happened is that you merged the changelog entry into the patch-2 branch of your cuML fork, while this PR is based on patch-1, so the changelog entry needs to be on the patch-1 branch.

@dantegd
Member

dantegd commented Nov 17, 2020

rerun tests

@cjnolet cjnolet merged commit 238a8de into rapidsai:branch-0.17 Nov 17, 2020
@tylerjthomas9 tylerjthomas9 deleted the patch-1 branch November 18, 2020 03:12