DOC: Clarify and add fill_value example in arithmetic ops #19675

Merged on Feb 22, 2018 (9 commits)
54 changes: 48 additions & 6 deletions pandas/core/ops.py
@@ -255,8 +255,10 @@ def _get_frame_op_default_axis(name):
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
Fill missing (NaN) values with this value. If both Series are
missing, the result will be missing
Fill existing missing (NaN) values, and any new element needed for
successful array alignment, with this value before computation.
If data in both corresponding DataFrame locations is missing
Contributor

This is for Series, so change DataFrame to Series.

Contributor

I would maybe rephrase it as

Fill missing values with 'fill_value'. There are two sources of missing values:

    * Missing values present in either Series before the operation
    * Newly created missing values as a result of an alignment

In either case, a missing value is not filled if it is missing in both Series after alignment.
At least one value from either Series must be not NA for 'fill_value' to be used.

Though I'm not sure if this is any clearer.
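
For reference, the two sources of missing values and the both-missing rule described above could be illustrated with a minimal sketch like the following. The index labels `a, b, d, e` for the second Series are an assumption chosen for illustration; they are not the labels used in the diff under review.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'c' exists only in `a`, 'e' exists only in `b`.
a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])

result = a.add(b, fill_value=0)
# 'a': present in both Series          -> 1 + 1 = 2.0
# 'b': NaN in `b` before the operation -> 1 + 0 = 1.0
# 'c': absent from `b` (alignment)     -> 1 + 0 = 1.0
# 'd': NaN in `a` before the operation -> 0 + 1 = 1.0
# 'e': missing on both sides           -> stays NaN
print(result)
```

Only label `e` stays missing, because neither Series has a usable value there after alignment.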

Contributor Author

I have a problem with the first line: "Fill missing values with 'fill_value'". This is basically what bothered me in the first place. It doesn't fill missing values in the naive way you would expect it to.

If you insist, I'll gladly rephrase my current wording. In the meantime I'll fix the rest of your remarks.
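
To make the concern concrete, here is a rough sketch comparing `fill_value` with two "naive" readings of "fill missing values". The data and the variable names (`flex`, `prefilled`, `postfilled`) are assumptions made up for this illustration.

```python
import numpy as np
import pandas as pd

a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])

# What fill_value does: fill after alignment, only where one side has a value.
flex = a.add(b, fill_value=0)

# Naive reading 1: fill each Series first, then add.
# Labels introduced by alignment ('c' and 'e') are still NaN in the result.
prefilled = a.fillna(0) + b.fillna(0)

# Naive reading 2: add, then fill every NaN in the result.
# Every label that was missing on either side collapses to 0.
postfilled = a.add(b).fillna(0)

print(pd.DataFrame({'flex': flex, 'prefilled': prefilled,
                    'postfilled': postfilled}))
```

Neither naive variant matches `flex`: pre-filling leaves `c` missing, and post-filling turns the both-missing label `e` into 0.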

the result will be missing
level : int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level
@@ -265,6 +267,18 @@ def _get_frame_op_default_axis(name):
-------
result : Series

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
Contributor

Show a and b as well.

Contributor

You need to have a line with just >>> a on it before showing the Series. Likewise for b.

>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'c_', 'd'])
Contributor

I would prefer just using index labels like abde or something; the c_ is confusing.

>>> a.add(b, fill_value=0)
a 2.0
b 1.0
c 1.0
c_ 1.0
d NaN
dtype: float64

See also
--------
Series.{reverse}
@@ -280,8 +294,10 @@ def _get_frame_op_default_axis(name):
axis : {0, 1, 'index', 'columns'}
For Series input, axis to match Series index on
fill_value : None or float value, default None
Fill missing (NaN) values with this value. If both DataFrame locations are
missing, the result will be missing
Fill existing missing (NaN) values, and any new element needed for
successful array alignment, with this value before computation.
If data in both corresponding DataFrame locations is missing
the result will be missing
level : int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level
@@ -293,6 +309,18 @@ def _get_frame_op_default_axis(name):
Returns
-------
result : DataFrame

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
Contributor

Can you use a DataFrame for this example? It can just be a single-column DataFrame with these same values and index.
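
A single-column DataFrame version might look roughly like the sketch below. The column name `one` and the second frame's index labels are assumptions for illustration, not taken from the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical single-column frames with partially overlapping indexes.
a = pd.DataFrame({'one': [1, 1, 1, np.nan]}, index=['a', 'b', 'c', 'd'])
b = pd.DataFrame({'one': [1, np.nan, 1, np.nan]}, index=['a', 'b', 'd', 'e'])

result = a.add(b, fill_value=0)
# Rows 'b', 'c' and 'd' have a value on exactly one side, so the other
# side is treated as 0. Row 'e' is missing in both frames and stays NaN.
print(result)
```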

>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'c_', 'd'])
>>> a.add(b, fill_value=0)
a 2.0
b 1.0
c 1.0
c_ 1.0
d NaN
dtype: float64
"""

_flex_doc_FRAME = """
@@ -307,8 +335,10 @@ def _get_frame_op_default_axis(name):
axis : {{0, 1, 'index', 'columns'}}
For Series input, axis to match Series index on
fill_value : None or float value, default None
Fill missing (NaN) values with this value. If both DataFrame
locations are missing, the result will be missing
Fill existing missing (NaN) values, and any new element needed for
successful array alignment, with this value before computation.
If data in both corresponding DataFrame locations is missing
the result will be missing
level : int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level
@@ -321,6 +351,18 @@ def _get_frame_op_default_axis(name):
-------
result : DataFrame

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'c_', 'd'])
>>> a.add(b, fill_value=0)
0
a 2.0
b 1.0
c 1.0
c_ 1.0
d NaN

See also
--------
DataFrame.{reverse}