Revert "Fix sphinx doc warnings"
This reverts commit 80b34af.
RAMitchell committed Feb 18, 2022
1 parent 1477bc0 commit 29e10bf
Showing 12 changed files with 21 additions and 37 deletions.
25 changes: 11 additions & 14 deletions python/cuml/cluster/agglomerative.pyx
@@ -106,24 +106,21 @@ class AgglomerativeClustering(Base, ClusterMixin, CMajorInputTagMixin):
Which linkage criterion to use. The linkage criterion determines
which distance to use between sets of observations. The algorithm
will merge the pairs of clusters that minimize this criterion.
* 'single' uses the minimum of the distances between all
observations of the two sets.
- 'single' uses the minimum of the distances between all
observations of the two sets.
n_neighbors : int (default = 15)
The number of neighbors to compute when connectivity = "knn"
connectivity : {"pairwise", "knn"}, (default = "knn")
The type of connectivity matrix to compute.
* 'pairwise' will compute the entire fully-connected graph of
pairwise distances between each set of points. This is the
fastest to compute and can be very fast for smaller datasets
but requires O(n^2) space.
* 'knn' will sparsify the fully-connected connectivity matrix to
save memory and enable much larger inputs. "n_neighbors" will
control the amount of memory used and the graph will be connected
automatically in the event "n_neighbors" was not large enough
to connect it.
- 'pairwise' will compute the entire fully-connected graph of
pairwise distances between each set of points. This is the
fastest to compute and can be very fast for smaller datasets
but requires O(n^2) space.
- 'knn' will sparsify the fully-connected connectivity matrix to
save memory and enable much larger inputs. "n_neighbors" will
control the amount of memory used and the graph will be connected
automatically in the event "n_neighbors" was not large enough
to connect it.
output_type : {'input', 'cudf', 'cupy', 'numpy', 'numba'}, default=None
Variable to control output type of the results and attributes of
the estimator. If None, it'll inherit the output type set at the
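As a rough illustration of the 'single' linkage criterion described in the docstring above, the sketch below computes the minimum pairwise distance between two sets of observations in plain Python. It does not use cuML; the helper name `single_linkage_distance` and the choice of Euclidean distance are assumptions made only for this example.

```python
import math
from itertools import product


def single_linkage_distance(set_a, set_b):
    """Return the smallest Euclidean distance between any point of set_a
    and any point of set_b (hypothetical helper, not a cuML function)."""
    return min(math.dist(p, q) for p, q in product(set_a, set_b))


# Two small clusters on a line; the closest pair is (1, 0) and (4, 0).
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (9.0, 0.0)]
d = single_linkage_distance(cluster_a, cluster_b)
```

Under this criterion, the pair of clusters with the smallest such distance would be merged first.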
10 changes: 5 additions & 5 deletions python/cuml/cluster/hdbscan.pyx
@@ -291,6 +291,7 @@ class HDBSCAN(Base, ClusterMixin, CMajorInputTagMixin):
alpha : float, optional (default=1.0)
A distance scaling parameter as used in robust single linkage.
See [2]_ for more information.
verbose : int or boolean, default=False
Sets logging level. It must be one of `cuml.common.logger.level_*`.
@@ -308,7 +309,7 @@ class HDBSCAN(Base, ClusterMixin, CMajorInputTagMixin):
cluster_selection_epsilon : float, optional (default=0.0)
A distance threshold. Clusters below this value will be merged.
Note that this should not be used
See [3]_ for more information. Note that this should not be used
if we want to predict the cluster labels for new points in future
(e.g. using approximate_predict), as the approximate_predict function
is not aware of this argument.
@@ -339,7 +340,6 @@ class HDBSCAN(Base, ClusterMixin, CMajorInputTagMixin):
to find the most persistent clusters. Alternatively you can instead
select the clusters at the leaves of the tree -- this provides the
most fine grained and homogeneous clusters. Options are:
* ``eom``
* ``leaf``
@@ -349,17 +349,17 @@ class HDBSCAN(Base, ClusterMixin, CMajorInputTagMixin):
the case that you feel this is a valid result for your dataset.
gen_min_span_tree : bool, optional (default=False)
Whether to populate the `minimum_spanning_tree_` member for
Whether to populate the minimum_spanning_tree_ member for
utilizing plotting tools. This requires the `hdbscan` CPU Python
package to be installed.
gen_condensed_tree : bool, optional (default=False)
Whether to populate the `condensed_tree_` member for
Whether to populate the condensed_tree_ member for
utilizing plotting tools. This requires the `hdbscan` CPU
Python package to be installed.
gen_single_linkage_tree_ : bool, optinal (default=False)
Whether to populate the `single_linkage_tree_` member for
Whether to populate the single_linkage_tree_ member for
utilizing plotting tools. This requires the `hdbscan` CPU
Python package t be installed.
6 changes: 2 additions & 4 deletions python/cuml/dask/cluster/kmeans.py
@@ -141,14 +141,12 @@ def fit(self, X, sample_weight=None):
X : Dask cuDF DataFrame or CuPy backed Dask Array
Training data to cluster.
sample_weight : Dask cuDF DataFrame or CuPy backed Dask Array \
shape = (n_samples,), default=None # noqa
sample_weight : Dask cuDF DataFrame or CuPy backed Dask Array
shape = (n_samples,), default=None # noqa
The weights for each observation in X. If None, all observations
are assigned equal weight.
Acceptable formats: cuDF DataFrame, NumPy ndarray, Numba device
ndarray, cuda array interface compliant array like CuPy
"""

sample_weight = self._check_normalize_sample_weight(sample_weight)
4 changes: 0 additions & 4 deletions python/cuml/dask/ensemble/randomforestclassifier.py
@@ -82,7 +82,6 @@ class RandomForestClassifier(BaseRandomForestModel, DelayedPredictionMixin,
* ``4`` or ``'poisson'`` for poisson half deviance
* ``5`` or ``'gamma'`` for gamma half deviance
* ``6`` or ``'inverse_gaussian'`` for inverse gaussian deviance
``2``, ``'mse'``, ``4``, ``'poisson'``, ``5``, ``'gamma'``, ``6``,
``'inverse_gaussian'`` not valid for classification
bootstrap : boolean (default = True)
@@ -106,7 +105,6 @@ class RandomForestClassifier(BaseRandomForestModel, DelayedPredictionMixin,
* If ``'sqrt'`` then ``max_features=1/sqrt(n_features)``.
* If ``'log2'`` then ``max_features=log2(n_features)/n_features``.
* If ``None``, then ``max_features = 1.0``.
n_bins : int (default = 128)
Maximum number of bins used by the split algorithm per feature.
min_samples_leaf : int or float (default = 1)
@@ -116,7 +114,6 @@ class RandomForestClassifier(BaseRandomForestModel, DelayedPredictionMixin,
* If ``float``, then ``min_samples_leaf`` represents a fraction
and ``ceil(min_samples_leaf * n_rows)`` is the minimum number of
samples for each leaf node.
min_samples_split : int or float (default = 2)
The minimum number of samples required to split an internal
node.\n
@@ -125,7 +122,6 @@ class RandomForestClassifier(BaseRandomForestModel, DelayedPredictionMixin,
* If type ``float``, then ``min_samples_split`` represents a fraction
and ``ceil(min_samples_split * n_rows)`` is the minimum number of
samples for each split.
n_streams : int (default = 4 )
Number of parallel streams used for forest building
workers : optional, list of strings
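The rules quoted in the docstring above for `max_features`, `min_samples_leaf` and `min_samples_split` can be sketched in plain Python. This is only an illustration of the documented arithmetic, not cuML's implementation; the helper names `resolve_max_features` and `resolve_min_samples` are hypothetical, and treating an `int` as an absolute sample count is an assumption, since that case is not visible in this diff.

```python
import math


def resolve_max_features(max_features, n_features):
    """Map the documented string/None options to a fraction of features
    (hypothetical helper mirroring the docstring formulas)."""
    if max_features is None:
        return 1.0
    if max_features == "sqrt":
        return 1.0 / math.sqrt(n_features)
    if max_features == "log2":
        return math.log2(n_features) / n_features
    raise ValueError(f"unsupported max_features: {max_features!r}")


def resolve_min_samples(value, n_rows):
    """A float is a fraction of n_rows, rounded up as in the docstring;
    an int is assumed to be an absolute count."""
    if isinstance(value, float):
        return math.ceil(value * n_rows)
    return int(value)


frac_sqrt = resolve_max_features("sqrt", 16)   # 1 / sqrt(16)
frac_log2 = resolve_max_features("log2", 16)   # log2(16) / 16
leaf_min = resolve_min_samples(0.25, 10)       # ceil(0.25 * 10)
```

With 16 features both string options happen to select the same fraction, which makes the formulas easy to check by hand.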
1 change: 0 additions & 1 deletion python/cuml/dask/ensemble/randomforestregressor.py
@@ -75,7 +75,6 @@ class RandomForestRegressor(BaseRandomForestModel, DelayedPredictionMixin,
* ``4`` or ``'poisson'`` for poisson half deviance
* ``5`` or ``'gamma'`` for gamma half deviance
* ``6`` or ``'inverse_gaussian'`` for inverse gaussian deviance
``0``, ``'gini'``, ``1``, ``'entropy'`` not valid for regression
bootstrap : boolean (default = True)
Control bootstrapping.\n
1 change: 0 additions & 1 deletion python/cuml/ensemble/randomforestclassifier.pyx
@@ -159,7 +159,6 @@ class RandomForestClassifier(BaseRandomForestModel,
* ``4`` or ``'poisson'`` for poisson half deviance
* ``5`` or ``'gamma'`` for gamma half deviance
* ``6`` or ``'inverse_gaussian'`` for inverse gaussian deviance
only ``0``/``'gini'`` and ``1``/``'entropy'`` valid for classification
bootstrap : boolean (default = True)
Control bootstrapping.\n
1 change: 0 additions & 1 deletion python/cuml/ensemble/randomforestregressor.pyx
@@ -158,7 +158,6 @@ class RandomForestRegressor(BaseRandomForestModel,
* ``4`` or ``'poisson'`` for poisson half deviance
* ``5`` or ``'gamma'`` for gamma half deviance
* ``6`` or ``'inverse_gaussian'`` for inverse gaussian deviance
``0``, ``'gini'``, ``1`` and ``'entropy'`` not valid for regression.
bootstrap : boolean (default = True)
Control bootstrapping.\n
1 change: 0 additions & 1 deletion python/cuml/feature_extraction/_tfidf_vectorizer.py
@@ -260,7 +260,6 @@ def transform(self, raw_documents):
def get_feature_names(self):
"""
Array mapping from feature integer indices to feature name.
Returns
-------
feature_names : Series
2 changes: 2 additions & 0 deletions python/cuml/fil/fil.pyx
@@ -578,6 +578,7 @@ class ForestInference(Base,

Parameters
----------
{}
preds : gpuarray or cudf.Series, shape = (n_samples,)
Optional 'out' location to store inference results

@@ -606,6 +607,7 @@ class ForestInference(Base,

Parameters
----------
{}
preds : gpuarray or cudf.Series, shape = (n_samples,2)
Binary probability output
Optional 'out' location to store inference results
1 change: 0 additions & 1 deletion python/cuml/metrics/pairwise_distances.pyx
@@ -341,7 +341,6 @@ def sparse_pairwise_distances(X, Y=None, metric="euclidean", handle=None,
See the documentation for scipy.spatial.distance for details on these
metrics.
- ['inner_product', 'hellinger']
Parameters
----------
X : array-like (device or host) of shape (n_samples_x, n_features)
4 changes: 0 additions & 4 deletions python/cuml/naive_bayes/naive_bayes.py
@@ -1378,7 +1378,6 @@ def _check_X(self, X):

def fit(self, X, y, sample_weight=None) -> "CategoricalNB":
"""Fit Naive Bayes classifier according to X, y
Parameters
----------
X : array-like of shape (n_samples, n_features)
@@ -1394,7 +1393,6 @@ def fit(self, X, y, sample_weight=None) -> "CategoricalNB":
sample_weight : array-like of shape (n_samples), default=None
Weights applied to individual samples (1. for unweighted).
Currently sample weight is ignored.
Returns
-------
self : object
@@ -1412,7 +1410,6 @@ def partial_fit(self, X, y, classes=None,
This method has some performance overhead hence it is better to call
partial_fit on chunks of data that are as large as possible
(as long as fitting in the memory budget) to hide the overhead.
Parameters
----------
X : array-like of shape (n_samples, n_features)
@@ -1432,7 +1429,6 @@ def partial_fit(self, X, y, classes=None,
sample_weight : array-like of shape (n_samples), default=None
Weights applied to individual samples (1. for unweighted).
Currently sample weight is ignored.
Returns
-------
self : object
2 changes: 1 addition & 1 deletion python/cuml/preprocessing/TargetEncoder.py
@@ -49,7 +49,7 @@ class TargetEncoder:
'continuous': consecutive samples are grouped into one folds.
'interleaved': samples are assign to each fold in a round robin way.
'customize': customize splitting by providing a `fold_ids` array
in `fit()` or `fit_transform()` functions.
in `fit()` or `fit_transform()` functions.
output_type: {'cupy', 'numpy', 'auto'}, default = 'auto'
The data type of output. If 'auto', it matches input data.
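The 'continuous' and 'interleaved' split methods described in the TargetEncoder docstring above can be pictured with a small plain-Python sketch. This is not cuML code: the helper names are hypothetical, and the equal-sized block rule used for 'continuous' is an assumption, so the exact fold boundaries in the library may differ.

```python
def interleaved_folds(n_samples, n_folds):
    """Assign samples to folds in a round-robin way: 0, 1, ..., n_folds - 1, 0, 1, ..."""
    return [i % n_folds for i in range(n_samples)]


def continuous_folds(n_samples, n_folds):
    """Group consecutive samples into the same fold
    (assumed rule: roughly equal-sized consecutive blocks)."""
    return [i * n_folds // n_samples for i in range(n_samples)]


interleaved = interleaved_folds(8, 4)
continuous = continuous_folds(8, 4)
```

For 8 samples and 4 folds, the interleaved ids cycle through the folds twice, while the continuous ids place each adjacent pair of samples in the same fold. The 'customize' option would instead take such an array directly as `fold_ids`.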
