DOCS: Updated NDFrame.astype docs #17203

Merged: 5 commits, Aug 9, 2017
Changes from 4 commits
49 changes: 47 additions & 2 deletions pandas/core/generic.py
@@ -3610,8 +3610,7 @@ def blocks(self):
mapping={True: 'raise', False: 'ignore'})
def astype(self, dtype, copy=True, errors='raise', **kwargs):
"""
Cast object to input numpy.dtype
Return a copy when copy = True (be really careful with this!)
Cast a pandas object to new dtype ``dtype``.

Parameters
----------
@@ -3620,6 +3619,8 @@ def astype(self, dtype, copy=True, errors='raise', **kwargs):
the same type. Alternatively, use {col: dtype, ...}, where col is a
column label and dtype is a numpy.dtype or Python type to cast one
or more of the DataFrame's columns to column-specific types.
copy : bool, default True.
Return a copy when ``copy=True`` (be really careful with this!).
Member

While we are at it, I would like to see this explanation improved ("be really careful with this!" is not that helpful).
But it is also OK to leave that for another issue/PR if you don't feel comfortable with it.

(I don't know the internals well enough to know exactly how the keyword is used, but for numerical dtypes I suppose it follows the numpy astype behaviour, which means a copy is skipped only if the dtype (and order) is equivalent.)
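
A minimal sketch of the NumPy behaviour referred to above, assuming only that NumPy is installed; the array names are illustrative and not part of the PR. With `copy=False`, `ndarray.astype` returns the input array itself when the requested dtype already matches, and allocates a new array otherwise.

```python
import numpy as np

a = np.array([1, 2], dtype="int64")

# Requested dtype is equivalent to the existing one: no copy is made.
same = a.astype("int64", copy=False)

# Requested dtype differs: a new array is allocated despite copy=False.
other = a.astype("float64", copy=False)

same_is_input = same is a
other_shares_memory = np.shares_memory(a, other)

print(same_is_input, other_shares_memory)
```

Whether pandas follows this rule in every code path is exactly the open question in this comment, so the sketch only illustrates the NumPy side.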

Contributor Author

I've changed this to be a bit more informative.

errors : {'raise', 'ignore'}, default 'raise'.
Control raising of exceptions on invalid data for provided dtype.

@@ -3636,6 +3637,50 @@ def astype(self, dtype, copy=True, errors='raise', **kwargs):
Returns
-------
casted : type of caller

Contributor

Could add a See Also entry for numpy.astype here.

Contributor Author

Ok, done.

Examples
Member

Let's provide an example using the copy argument given that it says the parameter should be handled with care.

Contributor Author

Ok, done, added example with copy=False, where result propagates upwards.

I could only get it to work with categoricals and not numpy dtypes, so the example is a bit contrived.

Contributor

copy is not really that useful here, but OK since you did it.

Contributor Author

Yeah, especially if copy=False has no effect with numpy.dtypes.

Unless someone can find an effect with columns with numpy.dtypes, I wouldn't mind pulling this out again, as my example is maybe a bit silly.

Member

For me it works as well with numpy dtypes:

In [77]: s1 = pd.Series([1,2])

In [78]: s2 = s1.astype('int', copy=False)

In [79]: s2[0] = 10

In [80]: s1
Out[80]: 
0    10
1     2
dtype: int64

It's just that the dtype needs to be equivalent (otherwise it always takes a copy).

So I would change the example

Contributor Author

Ok, I've changed it to your example.
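
One way to check the point about equivalent dtypes without mutating anything is to ask NumPy whether two Series share a buffer. This is a sketch and not part of the PR; the variable names are illustrative. Whether a later write to `s2` shows up in `s1` depends on the pandas version and on whether copy-on-write is enabled, so the sketch deliberately avoids writing to either object.

```python
import warnings

import numpy as np
import pandas as pd

s1 = pd.Series([1, 2], dtype="int64")

with warnings.catch_warnings():
    # Assumption: some pandas releases may warn about the ``copy`` keyword.
    warnings.simplefilter("ignore")
    s2 = s1.astype("int64", copy=False)    # equivalent dtype
    s3 = s1.astype("float64", copy=False)  # different dtype

# Compare the underlying buffers instead of mutating the Series.
same_dtype_shares = np.shares_memory(s1.to_numpy(), s2.to_numpy())
new_dtype_shares = np.shares_memory(s1.to_numpy(), s3.to_numpy())

print(same_dtype_shares, new_dtype_shares)
```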

--------
>>> ser = pd.Series([1, 2], dtype='int32')
>>> ser
0    1
1    2
dtype: int32
>>> ser.astype('int64')
0    1
1    2
dtype: int64

Convert to pd.Categorical:
Member

Can you write something like "convert to categorical type" or "categorical Series" (as it does not return a pd.Categorical object, but a Series with categorical dtype)

Contributor Author

Changed.
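
The distinction raised in this thread can be checked directly. The sketch below is illustrative only and assumes a pandas version that exposes `pd.CategoricalDtype` at the top level; `result` is a hypothetical name.

```python
import pandas as pd

ser = pd.Series([1, 2], dtype="int32")
result = ser.astype("category")

# The container is still a Series; only its dtype is categorical.
is_series = isinstance(result, pd.Series)
is_categorical_object = isinstance(result, pd.Categorical)
has_categorical_dtype = isinstance(result.dtype, pd.CategoricalDtype)

# The pd.Categorical itself is the array held inside the Series.
inner_is_categorical = isinstance(result.values, pd.Categorical)

print(is_series, is_categorical_object, has_categorical_dtype, inner_is_categorical)
```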


>>> ser.astype('category')
0    1
1    2
dtype: category
Categories (2, int64): [1, 2]

Convert to ordered pd.Categorical with custom ordering:
Member

here as well: "convert to ordered categorical with .."
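
For readers trying the ordered conversion on a more recent pandas: the docstring in this PR passes `ordered` and `categories` as keyword arguments to `astype`. As far as can be told, later releases expect that information in a `CategoricalDtype` instead. The sketch below assumes such a release and is not taken from the PR.

```python
import pandas as pd

ser = pd.Series([1, 2], dtype="int32")

# Custom ordering: 2 sorts before 1.
ordered_dtype = pd.CategoricalDtype(categories=[2, 1], ordered=True)
result = ser.astype(ordered_dtype)

is_ordered = result.cat.ordered
category_order = list(result.cat.categories)

print(is_ordered, category_order)
```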


>>> ser.astype('category', ordered=True, categories=[2, 1])
0    1
1    2
dtype: category
Categories (2, int64): [2 < 1]

Note that using ``copy=False`` and changing data on a new
pandas object may propagate changes upwards:

>>> cat1 = pd.Series([1,2], dtype='category')
>>> cat2 = cat1.astype('category', copy=False)
>>> cat2[0] = 2
>>> cat1 # note that cat1[0] is changed too
0    2
1    2
dtype: category
Categories (2, int64): [1, 2]

See Also
--------
numpy.ndarray.astype : Cast a numpy array to a specified type.
"""
if is_dict_like(dtype):
if self.ndim == 1: # i.e. Series