[Depr] raise_on_error kwarg with errors kwarg in astype#14878 #14967
…v#14761 Updating documentation to reflect change
DataFrame.astype now allows changing the dtype of a column by passing a dict mapping column name to dtype.
DataFrame.astype now allows setting the type of columns by passing a dict mapping column to dtype.
…com/m-charlton/pandas into update_docs_astype_with_dict_#14761
Mistakenly added changes carried out in v0.19 to v0.20
Valid arguments for new 'errors' kwarg are 'ignore' or 'raise' see pandas-dev#14878
@@ -3073,7 +3075,9 @@ def astype(self, dtype, copy=True, raise_on_error=True, **kwargs):
the same type. Alternatively, use {col: dtype, ...}, where col is a
column label and dtype is a numpy.dtype or Python type to cast one
or more of the DataFrame's columns to column-specific types.
raise_on_error : raise on invalid input
leave this in here and just say DEPRECATED.
@jorisvandenbossche IIRC that is our convention?
add a versionadded tag 0.20.0 for errors
Will do.
yes, can leave it (but put it at the end)
raise_on_error : raise on invalid input
errors : {'raise', 'ignore'}, default 'raise'
- ``raise`` : allow exceptions to be raised on invalid input
- ``ignore`` : suppress raising exceptions on invalid input
kwargs : keyword arguments to pass on to the constructor
I think you need a blank line before kwargs (to make the sub-list work)
I've just checked, and the sublist renders fine with or without a blank line between ignore and kwargs. I can add an extra line if that is the convention.
def _astype(self, dtype, copy=False, raise_on_error=True, values=None,
def _astype(self, dtype, copy=False, errors='raise', values=None,
klass=None, mgr=None, **kwargs):
can you check the errors in ['raise', 'ignore'] at the beginning of the function and raise a ValueError otherwise (and add a test for this)
Sure, there are two 'public' astype(...) methods:
- NDFrame.astype(...) in pandas/core/generic.py
- Block.astype(...) in pandas/core/internals.py
In addition there is a 'protected' Block._astype(...) method in pandas/core/internals.py. Should I only put the checks in the 'public' methods?
Bearing in mind that the raise_on_error kwarg is going to be deprecated for DataFrame.where() and replaced with the errors kwarg, it would make sense to put the code that checks the validity of the arguments in one place. Do we have any existing code where we put such validity checking functions/methods?
I notice that both NDFrame & Block inherit from PandasObject, but I'm not sure that this is the correct thing to do. Should I put a function in pandas/core/base.py?
Block is completely internal
you can put the check there
we have centralized checks but no need in this case
put it in _astype as that's where it's actually used
parameter validation is best done where it's actually used
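A minimal sketch of the placement suggested above, assuming simplified stand-ins rather than the real pandas classes: the names `LEGAL_ERRORS`, `_astype`, and `astype` here are hypothetical, and the point is only that the public entry point delegates while the internal function validates before doing any work.

```python
# Hypothetical stand-ins for illustration; this is not the pandas implementation.
LEGAL_ERRORS = ('raise', 'ignore')


def _astype(values, dtype, errors='raise'):
    """Internal worker: validates its own arguments, then converts."""
    # Check the keyword first, before any conversion is attempted.
    if errors not in LEGAL_ERRORS:
        raise ValueError("invalid value for 'errors': {!r}".format(errors))
    return [dtype(v) for v in values]


def astype(values, dtype, errors='raise'):
    """Public wrapper: only delegates, so the check lives in one place."""
    return _astype(values, dtype, errors=errors)
```

With this layout, `astype(['1', '2'], int, errors=True)` fails with a ValueError raised from `_astype`, without the wrapper needing its own copy of the check.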
Current coverage is 84.75% (diff: 100%)
@@ master #14967 diff @@
==========================================
Files 144 145 +1
Lines 51030 51144 +114
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43198 43348 +150
+ Misses 7832 7796 -36
Partials 0 0
…Docstrings to astype clarified.
Update after review
instead
errors : {'raise', 'ignore'}, default 'raise'
- ``raise`` : allow exceptions to be raised on invalid input
- ``ignore`` : suppress raising exceptions on invalid input
maybe specify what happens if there is an error? (original is returned)
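The behaviour the reviewer asks to document can be sketched as follows. `cast_all` is a hypothetical helper working on a plain list, not pandas code; it assumes that "original is returned" means the caller gets back the very object it passed in when conversion fails under errors='ignore'.

```python
def cast_all(values, dtype, errors='raise'):
    """Hypothetical helper illustrating the two 'errors' modes."""
    try:
        return [dtype(v) for v in values]
    except (TypeError, ValueError):
        if errors == 'raise':
            # errors='raise': let the conversion error propagate
            raise
        # errors='ignore': hand back the original, unconverted object
        return values


data = ['1', 'x']
unchanged = cast_all(data, int, errors='ignore')
```

Here `unchanged is data` holds, because `'x'` cannot be converted to an int and the error is suppressed.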
if errors not in errors_legal_values:
invalid_arg = "Expected value of kwarg 'errors' to be one of %s. "\
"Supplied value is '%s'" % (', '.join("'%s'" % arg for arg in
style comment: can you do the string continuation by putting ( ) around it instead of the \. And can you use .format(..) instead of % (can be on a separate line)
Will do. I'll change the output message to:
Expected value of kwarg 'errors' to be one of ['raise', 'ignore']. Supplied value 'True'
This will do away with the need for the messy formatting.
@m-charlton can you change this to the {} formatting syntax?
Will do
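The two style requests in this thread (parentheses instead of a backslash, and {} formatting instead of %) could look roughly like this. The function name `invalid_errors_message` and its `legal_values` parameter are hypothetical, and the wording follows the message in the diff rather than any final merged text.

```python
def invalid_errors_message(errors, legal_values=('raise', 'ignore')):
    """Build the error text for an illegal 'errors' value (illustrative only)."""
    # Parentheses give implicit line continuation, so no backslash is needed;
    # adjacent string literals are joined before .format() is applied.
    return ("Expected value of kwarg 'errors' to be one of {}. "
            "Supplied value is '{}'".format(list(legal_values), errors))


message = invalid_errors_message(True)
```

Formatting the list of legal values directly avoids the nested join over quoted items that the original % version needed.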
@@ -566,6 +566,13 @@ def test_astype(self):
else:
self.assertEqual(tmgr.get('d').dtype.type, t)
def test_illegal_arg_for_errors_in_astype(self):
""" ValueError exception raised when illegal value used for errors """
Can you turn this into a normal comment (with #)? Otherwise the test name does not show up in the nosetests output.
You can also add the issue number
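The mechanism behind this request can be seen with the standard library's unittest, which exposes the first docstring line of a test method through `shortDescription()`; per the comment above, the nosetests output shows that description in place of the test name. The class and method names below are made up for the illustration.

```python
import unittest


class Example(unittest.TestCase):
    def test_with_docstring(self):
        """ValueError raised for an illegal value"""

    def test_with_comment(self):
        # ValueError raised for an illegal value
        pass


# A docstring becomes the test's description; a comment leaves it unset.
with_doc = Example('test_with_docstring').shortDescription()
with_comment = Example('test_with_comment').shortDescription()
```

`with_doc` is the docstring text, while `with_comment` is None, so a runner has nothing to substitute for the method name.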
I would also move the test to the high-level astype tests for frame/series
will do
I've looked at the existing test cases in pandas/tests/series & pandas/tests/frame and can't find a natural place to put this test.
I was thinking of adding a new file called, say, test_astype.py in pandas/tests/frame, as there is an existing test_apply.py
no just grep for astype and you will see lots of tests
ok this test is fine here.
@m-charlton astype tests are located in pandas/tests/frame/test_dtypes.py (you actually updated some tests there)
Will do
small error message formatting change. ping when pushed and green.
@m-charlton Can you also add a test confirming that the old keyword still works but gives a deprecation warning? You can do this with:
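A minimal sketch of that kind of check, using only the standard library rather than pandas' own test helpers. `resolve_errors` is a hypothetical function that maps the old keyword onto the new one, and the choice of FutureWarning as the warning category is an assumption.

```python
import warnings


def resolve_errors(errors='raise', raise_on_error=None):
    """Hypothetical shim: accept the old keyword, warn, and translate it."""
    if raise_on_error is not None:
        warnings.warn("the raise_on_error keyword is deprecated, "
                      "use errors instead", FutureWarning, stacklevel=2)
        errors = 'raise' if raise_on_error else 'ignore'
    return errors


def old_keyword_still_works_but_warns():
    # Record every warning raised inside the block so it can be inspected.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        result = resolve_errors(raise_on_error=False)
    warned = any(issubclass(w.category, FutureWarning) for w in caught)
    return result, warned
```

The test asserts two things at once: the old keyword still produces the expected behaviour, and using it emits the deprecation warning.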
xref #14877 we should try to clarify that the
Back from Christmas/New Year break. Will make changes today and push for review.
Tests added for deprecated 'raise_on_error' kwarg & new 'errors' kwarg. Clarified docstring for DataFrame.astype method
@@ -523,6 +523,24 @@ def test_timedeltas(self):
result = df.get_dtype_counts().sort_values()
assert_series_equal(result, expected)
def test_illegal_arg_for_errors_in_astype(self):
you can make this a single test.
add the same test in tests/series/test_dtypes.py as well
minor test change. ping on green.
Unit tests for DataFrame.astype merged. Duplicated those tests for Series.astype. Both test the deprecation of the 'raise_on_error' kwarg.
@m-charlton Thanks a lot!
git diff upstream/master | flake8 --diff
Please check the entry in whatsnew/v0.20.0.txt. I was unsure whether the update belonged in the _whatsnew_0200.deprecations or the _whatsnew_0200.prior_deprecations section, so I put it in the former.