[DOC] Convert all remaining Python docstrings to pydoc and examples to doctest #2415

cjnolet · 2020-06-12T23:28:41Z

Several places in the codebase do not yet use the proper docstring format [1].

It would be worth scraping through the codebase and updating these.

[1] https://numpydoc.readthedocs.io/en/latest/format.html

yuqli · 2020-08-13T22:08:48Z

This seems like something I can handle. If nobody else is working on it, can I claim this issue?

dantegd · 2020-08-14T15:56:38Z

@yuqli I think some of these might have been fixed by #2649, and others will soon be handled automatically by a decorator in #2635 (which generates docstrings for fit/predict/transform/etc. methods). That decorator doesn't touch the cuml.dask docstrings, though, so checking the current status using #2635 as a base would be a very good idea and very much welcomed!
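For readers unfamiliar with the idea, here is a minimal sketch of a decorator that fills in a numpydoc-style docstring for a method that has none. It is not the decorator from #2635; `generate_docstring`, `MeanCenterer`, and the template text are all hypothetical names chosen for illustration.

```python
def generate_docstring(summary, returns):
    """Hypothetical decorator factory: attach a numpydoc-style docstring."""

    def decorator(func):
        # Only fill in a docstring when the method does not already have one.
        if func.__doc__ is None:
            func.__doc__ = (
                f"{summary}\n\n"
                "Parameters\n"
                "----------\n"
                "X : array-like of shape (n_samples,)\n"
                "    Input data.\n\n"
                "Returns\n"
                "-------\n"
                f"{returns}\n"
            )
        return func

    return decorator


class MeanCenterer:
    """Toy estimator (an assumption for this sketch, not a cuML class)."""

    @generate_docstring(
        summary="Fit the model to X.",
        returns="self : MeanCenterer\n    The fitted estimator.",
    )
    def fit(self, X):
        # Store the mean of the input so the generated docs describe real behavior.
        self.mean_ = sum(X) / len(X)
        return self


print(MeanCenterer.fit.__doc__)
```

A shared template like this keeps the common methods consistent, but anything the decorator does not wrap still has to be checked by hand.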

mdemoret-nv · 2020-08-19T18:42:49Z

It would be great to update some of the Python docstrings to use the doctest format as part of this issue. Right now most of our examples look like:

Examples
--------
.. code-block:: python

    import cupy as cp
    from cuml.metrics import pairwise_distances

    X = cp.array([[2.0, 3.0], [3.0, 5.0], [5.0, 8.0]])
    Y = cp.array([[1.0, 0.0], [2.0, 1.0]])

    # Euclidean Pairwise Distance, Single Input:
    pairwise_distances(X, metric='euclidean')

    # Cosine Pairwise Distance, Multi-Input:
    pairwise_distances(X, Y, metric='cosine')

    # Manhattan Pairwise Distance, Multi-Input:
    pairwise_distances(X, Y, metric='manhattan')

Output:

.. code-block:: python

    array([[0.        , 2.23606798, 5.83095189],
           [2.23606798, 0.        , 3.60555128],
           [5.83095189, 3.60555128, 0.        ]])

    array([[0.4452998 , 0.13175686],
           [0.48550424, 0.15633851],
           [0.47000106, 0.14671817]])

    array([[ 4.,  2.],
           [ 7.,  5.],
           [12., 10.]])

Instead, the doctest format would be (generated by literally copying the example section into a Python interactive session):

Examples
--------

>>> import cupy as cp
>>> from cuml.metrics import pairwise_distances
>>>
>>> X = cp.array([[2.0, 3.0], [3.0, 5.0], [5.0, 8.0]])
>>> Y = cp.array([[1.0, 0.0], [2.0, 1.0]])
>>>
>>> # Euclidean Pairwise Distance, Single Input:
>>> pairwise_distances(X, metric='euclidean')
array([[0.        , 2.23606798, 5.83095189],
       [2.23606798, 0.        , 3.60555128],
       [5.83095189, 3.60555128, 0.        ]])
>>>
>>> # Cosine Pairwise Distance, Multi-Input:
>>> pairwise_distances(X, Y, metric='cosine')
array([[0.4452998 , 0.13175686],
       [0.48550424, 0.15633851],
       [0.47000106, 0.14671817]])
>>>
>>> # Manhattan Pairwise Distance, Multi-Input:
>>> pairwise_distances(X, Y, metric='manhattan')
array([[ 4.,  2.],
       [ 7.,  5.],
       [12., 10.]])

This doesn't look better in GitHub markdown, but it is a big improvement in Sphinx. The user sees each output inline with the code that generated it, and the >>> prompts and output can be hidden with the toggle button for easy copy/paste. See one of scikit-learn's examples here: https://scikit-learn.org/stable/modules/metrics.html#metrics
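Another benefit is that doctest-format examples can be executed and checked automatically. Below is a minimal sketch using only the standard-library `doctest` module. `manhattan_distances` is a hypothetical pure-Python stand-in (not the cuML function), so the sketch runs without a GPU, and `run_examples` is a helper name invented for this illustration.

```python
import doctest


def manhattan_distances(X, Y):
    """Compute pairwise Manhattan (L1) distances between rows of X and Y.

    Hypothetical pure-Python helper used only to illustrate the format.

    Parameters
    ----------
    X : list of list of float
        First set of points, one point per row.
    Y : list of list of float
        Second set of points, one point per row.

    Returns
    -------
    list of list of float
        ``result[i][j]`` is the L1 distance between ``X[i]`` and ``Y[j]``.

    Examples
    --------
    >>> X = [[2.0, 3.0], [3.0, 5.0], [5.0, 8.0]]
    >>> Y = [[1.0, 0.0], [2.0, 1.0]]
    >>> manhattan_distances(X, Y)
    [[4.0, 2.0], [7.0, 5.0], [12.0, 10.0]]
    """
    return [[sum(abs(a - b) for a, b in zip(x, y)) for y in Y] for x in X]


def run_examples(func):
    """Run the doctest examples found in ``func.__doc__``."""
    parser = doctest.DocTestParser()
    # The examples only need the function itself in their namespace.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, "<docstring>", 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults(failed=..., attempted=...)


results = run_examples(manhattan_distances)
print(results)
```

Because doctest compares the printed output as text, examples that print arrays are sensitive to whitespace; the `doctest.NORMALIZE_WHITESPACE` option flag is one way to relax that comparison.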

yuqli · 2020-08-19T18:45:29Z

Thanks for the reply and the instructions. Sure, I will take care of them.

yuqli · 2020-10-14T06:46:29Z

Sorry for the delay. I have converted the "Examples" section to doctest format for some modules. How big should a pull request be? Should I submit one PR after finishing all the changes, or a PR for, say, every ~200 lines changed?

Thanks!

github-actions · 2021-02-17T21:17:30Z

This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions · 2021-02-17T21:17:32Z

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

beckernick · 2022-02-23T13:59:26Z

We should revive this issue. cuGraph has an open issue for this and cuDF just did this and found quite a few issues.

It will fix broken documentation examples like the following, which will definitely fail as written (gpu_model is an invalid parameter): https://docs.rapids.ai/api/cuml/nightly/api.html?highlight=kernelexplainer#cuml.explainer.KernelExplainer

from cuml import SVR
from cuml import make_regression
from cuml import train_test_split

from cuml.explainer import KernelExplainer

X, y = make_regression(
    n_samples=102,
    n_features=10,
    noise=0.1,
    random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=2,
    random_state=42)

model = SVR().fit(X_train, y_train)

cu_explainer = KernelExplainer(
    model=model.predict,
    data=X_train,
    gpu_model=True)

cu_shap_values = cu_explainer.shap_values(X_test)
cu_shap_values
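To illustrate how a doctest run would surface this kind of error, here is a minimal sketch. `Explainer` is a hypothetical stand-in whose constructor does not accept `gpu_model`; it is not the cuML `KernelExplainer`, and its signature is an assumption made for this example.

```python
import doctest


class Explainer:
    """Hypothetical stand-in for an explainer class.

    Examples
    --------
    >>> explainer = Explainer(model=len, data=[1, 2, 3], gpu_model=True)
    """

    def __init__(self, model, data):
        # Note: no ``gpu_model`` parameter, so the documented example is stale.
        self.model = model
        self.data = data


parser = doctest.DocTestParser()
test = parser.get_doctest(
    Explainer.__doc__, {"Explainer": Explainer}, "Explainer", "<docstring>", 0
)
runner = doctest.DocTestRunner(verbose=False)

report = []  # collect the failure report instead of printing it
results = runner.run(test, out=report.append)

print(results)
print("".join(report))
```

The run reports one failed example, and the collected report contains the `TypeError` naming the unexpected `gpu_model` keyword, which is the signal a CI job would need to flag the stale documentation.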

dantegd · 2022-03-07T16:11:32Z

Linking PR #4618

cjnolet added the "? - Needs Triage" and "doc" labels Jun 12, 2020

dantegd added the "good first issue" label and removed the "? - Needs Triage" label Jun 13, 2020

mdemoret-nv mentioned this issue Aug 26, 2020

[REVIEW] Improve Documentation Examples and Source Linking #2541

Merged

dantegd changed the title ~~[DOC] Convert all remaining Python docstrings to pydoc~~ [DOC] Convert all remaining Python docstrings to pydoc and examples to doctest Aug 27, 2020

yuqli mentioned this issue Oct 14, 2020

[REVIEW] Change docstring to doctest format #2975

Closed

mdemoret-nv mentioned this issue Dec 2, 2020

[REVIEW] Multiclass meta estimator wrappers and multiclass SVC #3092

Merged

github-actions bot added the rotten label Feb 17, 2021

github-actions bot added the stale label Feb 17, 2021

mike-wendt removed stale labels Feb 22, 2021

dantegd assigned lowener Mar 7, 2022

lowener closed this as completed Apr 4, 2022
