Changes after review
shwina committed May 4, 2022
1 parent f8bc555 commit 52fc1bf
Showing 4 changed files with 36 additions and 37 deletions.
2 changes: 1 addition & 1 deletion docs/cudf/source/user_guide/dask-cudf.md
@@ -39,7 +39,7 @@ The following is tested and expected to work:
- Support for reductions on full dataframes
- `std`
- Custom reductions with
-  [dask.dataframe.reduction](http://docs.dask.org/en/latest/generated/dask.dataframe.Series.reduction.html)
+  [dask.dataframe.reduction](https://docs.dask.org/en/latest/generated/dask.dataframe.Series.reduction.html)

- Groupby aggregations

30 changes: 13 additions & 17 deletions docs/cudf/source/user_guide/data-types.md
Expand Up @@ -5,7 +5,7 @@ numeric, datetime, timedelta, categorical and string data types. We
also provide special data types for working with decimals, list-like,
and dictionary-like data.

-All data types in cuDF are [nullable](/user_guide/missing-data).
+All data types in cuDF are [nullable](missing-data).

<div class="special-table">

@@ -34,10 +34,14 @@ ways to specify the `float32` data type:
```python
>>> import cudf
>>> s = cudf.Series([1, 2, 3], dtype="float32")
->>> print(s)
+>>> s
+0    1.0
+1    2.0
+2    3.0
+dtype: float32
```

-## A note on ``object``
+## A note on `object`

The data type associated with string data in cuDF is `"np.object"`.
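Here `"np.object"` refers to NumPy's generic object dtype, which is where string data falls back because strings have no fixed-width numeric representation. A small NumPy-only sketch of that fallback (cuDF simply reports the same dtype):

```python
import numpy as np

# Strings are stored under NumPy's generic object dtype: each element
# is a Python object reference rather than a fixed-width value.
arr = np.array(["apple", "banana", "cherry"], dtype=object)
print(arr.dtype)       # object
print(arr.dtype.kind)  # O
```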

@@ -60,7 +64,8 @@ We provide special data types for working with decimal data, namely
data types when you need to store values with greater precision than
allowed by floating-point representation.

-A decimal data type is composed of a _precision_ and a _scale_. The
+Decimal data types in cuDF are based on fixed-point representation. A
+decimal data type is composed of a _precision_ and a _scale_. The
precision represents the total number of digits in each value of this
dtype. For example, the precision associated with the decimal value
`1.023` is `4`. The scale is the total number of digits to the right
@@ -72,10 +77,8 @@ Each decimal data type is associated with a maximum precision:
```python
>>> cudf.Decimal32Dtype.MAX_PRECISION
9.0

>>> cudf.Decimal64Dtype.MAX_PRECISION
18.0

>>> cudf.Decimal128Dtype.MAX_PRECISION
38
```
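The counting rule above can be checked with nothing but the standard-library `decimal` module: the precision is the total digit count and the scale is the number of digits right of the decimal point. A minimal stdlib-only sketch (no GPU or cuDF required; the helper name is ours, not a cuDF API):

```python
from decimal import Decimal

def precision_and_scale(value: str) -> tuple:
    """Return (total digit count, digits right of the decimal point)."""
    t = Decimal(value).as_tuple()
    # A negative exponent counts the fractional digits; 1.023 has exponent -3.
    scale = -t.exponent if t.exponent < 0 else 0
    # Leading zeros are not stored in t.digits, so precision is at least scale.
    precision = max(len(t.digits), scale)
    return precision, scale

print(precision_and_scale("1.023"))  # (4, 3)
```

This reproduces the example in the text: `1.023` has precision 4 and scale 3.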
@@ -85,24 +88,20 @@ One way to create a decimal Series is from values of type [decimal.Decimal][pyth
```python
>>> from decimal import Decimal
>>> s = cudf.Series([Decimal("1.01"), Decimal("4.23"), Decimal("0.5")])

>>> s
0 1.01
1 4.23
2 0.50
dtype: decimal128

>>> s.dtype
->>> Decimal128Dtype(precision=3, scale=2)
+Decimal128Dtype(precision=3, scale=2)
```

Notice the data type of the result: `1.01`, `4.23`, `0.50` can all be
-represented with a precision at least equal to 3 and a scale at least
-equal to 2.
+represented with a precision at least 3 and a scale at least 2.

-However, the value `1.234` needs a precision at least equal to 4, and
-a scale at least equal to 3, and cannot be fully represented
-using this data type:
+However, the value `1.234` needs a precision at least 4, and a scale
+at least 3, and cannot be fully represented using this data type:

```python
>>> s[1] = Decimal("1.234") # raises an error
@@ -124,7 +123,6 @@ lists and dictionaries respectively:
0 {'a': 1, 'b': 2}
1 {'a': 3, 'b': 4}
dtype: object

>>> gsr = cudf.from_pandas(psr)
>>> gsr
0 {'a': 1, 'b': 2}
@@ -140,14 +138,12 @@ nested data](io).
```python
>>> pdf = pd.DataFrame({"a": [[1, 2], [3, 4, 5], [6, 7, 8]]})
>>> pdf.to_parquet("lists.pq")

>>> gdf = cudf.read_parquet("lists.pq")
>>> gdf
a
0 [1, 2]
1 [3, 4, 5]
2 [6, 7, 8]

>>> gdf["a"].dtype
ListDtype(int64)
```
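For contrast, pandas can hold the same nested values only as a generic `object` column rather than a typed list dtype. A hedged sketch of that difference, shown with pandas alone so it runs without a GPU (cuDF would report `ListDtype(int64)` for the same column, as above):

```python
import pandas as pd

pdf = pd.DataFrame({"a": [[1, 2], [3, 4, 5], [6, 7, 8]]})

# pandas has no dedicated list dtype: the column is just Python objects.
print(pdf["a"].dtype)               # object
print(pdf["a"].map(len).tolist())   # [2, 3, 3]
```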
20 changes: 12 additions & 8 deletions docs/cudf/source/user_guide/groupby.md
@@ -35,18 +35,24 @@ A GroupBy object is created by grouping the values of a `Series` or
`DataFrame` by one or more columns:

```python
-import cudf
-
+>>> import cudf
>>> df = cudf.DataFrame({'a': [1, 1, 1, 2, 2], 'b': [1, 1, 2, 2, 3], 'c': [1, 2, 3, 4, 5]})
+>>> df
+   a  b  c
+0  1  1  1
+1  1  1  2
+2  1  2  3
+3  2  2  4
+4  2  3  5
>>> gb1 = df.groupby('a') # grouping by a single column
>>> gb2 = df.groupby(['a', 'b']) # grouping by multiple columns
>>> gb3 = df.groupby(cudf.Series(['a', 'a', 'b', 'b', 'b'])) # grouping by an external column
```
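Grouping by an external column, as in `gb3` above, aligns the key Series with the frame's index and groups rows by the key's values. A pandas sketch of what that grouping computes (the cuDF calls above are API-compatible; pandas is used here only so the example runs without a GPU):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2],
                   'b': [1, 1, 2, 2, 3],
                   'c': [1, 2, 3, 4, 5]})
key = pd.Series(['a', 'a', 'b', 'b', 'b'])

# Rows 0-1 fall into group 'a', rows 2-4 into group 'b'.
sums = df.groupby(key).sum()
print(sums)
```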

````{warning}
-cuDF uses `sort=False` by default to achieve better performance, which provides no gaurentee to the group order in outputs. This deviates from Pandas default behavior.
+Unlike Pandas, cuDF uses `sort=False` by default to achieve better
+performance, which does not guarantee any particular group order in
+the result.
For example:
@@ -107,7 +113,7 @@ b

## Aggregation

-Aggregations on groups is supported via the `agg` method:
+Aggregations on groups are supported via the `agg` method:

```python
>>> df
@@ -209,7 +215,7 @@ a
- `apply` works by applying the provided function to each group
sequentially, and concatenating the results together. **This can be
very slow**, especially for a large number of small groups. For a
-  small number of large groups, it can give acceptable performance
+  small number of large groups, it can give acceptable performance.
- The results may not always match Pandas exactly. For example, cuDF
may return a `DataFrame` containing a single column where Pandas
returns a `Series`. Some post-processing may be required to match
@@ -218,8 +224,6 @@ a
supports with `apply`, such as calling [describe] inside the
callable.

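The caveats above are easier to see with a concrete call. A pandas sketch of per-group `apply` (cuDF accepts the same pattern, subject to the performance caveat noted above):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 2, 3, 4]})

# The callable runs once per group; the per-group results are then
# concatenated into a single Series indexed by the group keys.
ranges = df.groupby('a')['b'].apply(lambda s: s.max() - s.min())
print(ranges.to_dict())  # {1: 1, 2: 1}
```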
## Transform

The `.transform()` method aggregates per group, and broadcasts the
21 changes: 10 additions & 11 deletions docs/cudf/source/user_guide/index.md
@@ -3,15 +3,14 @@
```{toctree}
:maxdepth: 2
-10min.md
-pandas-comparison.rst
-data-types.rst
-io.rst
-missing-data.md
-groupby.rst
-guide-to-udfs.md
-cupy-interop.md
-dask-cudf.rst
-internals.rst
-PandasCompat.rst
+10min
+data-types
+io
+missing-data
+groupby
+guide-to-udfs
+cupy-interop
+dask-cudf
+internals
+PandasCompat
```
