API: Validate keyword arguments to fillna #19684

Merged Feb 22, 2018 (10 commits)

TomAugspurger
Contributor

Closes #19682

@@ -1607,6 +1607,9 @@ def fillna(self, value=None, method=None, limit=None):
-------
filled : Categorical with NA/NaN filled
"""
value, method = validate_fillna_kwargs(
Contributor Author

Note, I added this keyword since it's possible to have a tuple / list for a category.

Member

Do we need to add a test for this? (I mean filling a categorical with such categories, so the purpose of validate_scalar_dict_value=False is exercised)

    if value is None and method is None:
        raise ValueError("Must specify a fill 'value' or 'method'.")
    elif value is None and method is not None:
        clean_fill_method(method)
Member

method = clean_fill_method(method) ?

Contributor Author

Yeah, thought about it. Meant to double check one thing in categorical but forgot. One sec...
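
For anyone following along, a minimal sketch of why the assignment matters. The `clean_fill_method` below is a simplified stand-in written for this example, not the pandas helper; it assumes the helper normalizes aliases such as `'ffill'` to a canonical name and returns the result instead of mutating its argument. The two `validate_*` functions are hypothetical names used only for the comparison.

```python
def clean_fill_method(method):
    """Simplified stand-in (assumption): normalize aliases, reject unknowns."""
    aliases = {"ffill": "pad", "bfill": "backfill"}
    method = aliases.get(method, method)
    if method not in ("pad", "backfill"):
        raise ValueError("Invalid fill method: {0!r}".format(method))
    return method


def validate_discarding(method):
    # Validates the name, but the normalized result is thrown away.
    clean_fill_method(method)
    return method


def validate_assigning(method):
    # Validates the name and keeps the normalized result.
    method = clean_fill_method(method)
    return method
```

Under these assumptions, the first variant hands `'ffill'` back unchanged to the caller, while the second hands back `'pad'`. Both still raise on an unknown method name.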

    Parameters
    ----------
    value, method : object
        The 'value' and 'method' keyword arguments for 'fillna'.
Member

What happened to validate_scalar_dict_value in the docstring?

gfyoung added the Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) and Error Reporting (Incorrect or improved errors from pandas) labels Feb 14, 2018

codecov bot commented Feb 14, 2018

Codecov Report

Merging #19684 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19684      +/-   ##
==========================================
+ Coverage   91.61%   91.61%   +<.01%     
==========================================
  Files         150      150              
  Lines       48892    48900       +8     
==========================================
+ Hits        44792    44800       +8     
  Misses       4100     4100

Flag         Coverage Δ
#multiple    89.99% <100%> (ø) ⬆️
#single      41.79% <62.5%> (ø) ⬆️

Impacted Files                       Coverage Δ
pandas/core/generic.py               95.93% <100%> (-0.01%) ⬇️
pandas/util/_validators.py           96.8% <100%> (+0.46%) ⬆️
pandas/core/arrays/categorical.py    94.91% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update aa59954...dc1f960.

cat = Categorical([1, 2, 3])

xpr = "Cannot specify both 'value' and 'method'."

Contributor

Can you remove the blank line between the message and the condition? Personal preference, but it's slightly distracting.

@TomAugspurger
Contributor Author

Merging later today if no objections.

@jreback
Contributor

jreback commented Feb 16, 2018

let me look

        method = clean_fill_method(method)

    elif value is not None and method is None:
        if validate_scalar_dict_value and isinstance(value, (list, tuple)):
Contributor Author

This check doesn't seem especially useful / correct given the error message.

There are many things that are not scalars or dicts, but also not lists or tuples, so we would fail to hit this TypeError.

But I was trying to keep this as a straight refactor for NDFrame.fillna, with the only behavior change being Categorical.fillna.
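
To make the gap concrete, here is a rough, self-contained sketch of the validator pieced together from the hunks quoted in this thread. It is not the merged code: the method normalization is a one-line stand-in for `clean_fill_method`, and the tail of the `TypeError` message is modeled on the Categorical traceback style, so treat both as assumptions.

```python
def validate_fillna_kwargs(value, method, validate_scalar_dict_value=True):
    """Sketch of the 'value' / 'method' validation discussed in this thread."""
    if value is None and method is None:
        raise ValueError("Must specify a fill 'value' or 'method'.")
    elif value is None and method is not None:
        # Stand-in for clean_fill_method (assumption): alias normalization only.
        method = {"ffill": "pad", "bfill": "backfill"}.get(method, method)
    elif value is not None and method is None:
        # Only list and tuple are rejected; other non-scalar, non-dict
        # containers (for example a set) are not caught by this check.
        if validate_scalar_dict_value and isinstance(value, (list, tuple)):
            raise TypeError(
                "'value' parameter must be a scalar or dict, but you "
                'passed a "{0}"'.format(type(value).__name__)  # suffix assumed
            )
    else:
        raise ValueError("Cannot specify both 'value' and 'method'.")
    return value, method
```

With this sketch, `validate_fillna_kwargs((1, 2), None)` raises `TypeError`, but `validate_fillna_kwargs({1, 2}, None)` returns normally even though a set is neither a scalar nor a dict. Passing `validate_scalar_dict_value=False` lets the tuple through as well.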

Contributor

I don't think it's odd to want to validate that a scalar can only be a certain type. Maybe change this to:

validate_scalar=lambda x: is_scalar(x) or is_dict_like(x)
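
A hedged sketch of what a predicate-based check along these lines could look like. Everything here is illustrative: `validate_value` is a hypothetical helper, and `is_scalar_or_dict_like` is a stdlib-only stand-in for the `is_scalar(x) or is_dict_like(x)` predicate suggested above, so it does not cover every type pandas treats as a scalar.

```python
from collections.abc import Mapping
from numbers import Number


def is_scalar_or_dict_like(x):
    """Stand-in predicate (assumption): numbers, strings, bytes, mappings."""
    return isinstance(x, (Number, str, bytes, Mapping))


def validate_value(value, validate_scalar=is_scalar_or_dict_like):
    """Reject any value the predicate refuses; None disables the check."""
    if validate_scalar is not None and not validate_scalar(value):
        raise TypeError(
            "'value' parameter must be a scalar or dict, but you "
            'passed a "{0}"'.format(type(value).__name__)
        )
    return value
```

Compared with an `isinstance(value, (list, tuple))` check, a predicate like this rejects anything that is not explicitly allowed, so a set or an arbitrary object also raises. A caller that needs to accept tuples or lists could pass `validate_scalar=None` or its own predicate.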

Contributor

this parameter name is terrible, ideally would like to change this

Contributor Author

TomAugspurger, Feb 20, 2018

Two things:

  1. Please don't call a change by a contributor "terrible". It's not helpful.
  2. The error message says "'value' parameter must be a scalar or dict". That seems to match the parameter name pretty well.

Contributor

the parameter name is terrible, obviously NOT the contributor, which I didn't say or imply at all. It needs to be changed.

Member

The parameter name is certainly descriptive of what it does, and it is for an internal method, so it's not much of a problem that it is that long. Why bother for the rest?

    value, method : object
        The 'value' and 'method' keyword arguments for 'fillna'.
    validate_scalar_dict_value : bool, default True
        Whether to validate that 'value' is a scalar or dict; specifically
Member

Nit: replace the semicolon with a comma.

        raise ValueError("Must specify a fill 'value' or 'method'.")
    elif value is None and method is not None:
        method = clean_fill_method(method)

Contributor

extra line here (make consistent across this function)

@TomAugspurger
Contributor Author

I was trying to add a test for cat.fillna(tuple), but apparently we don't allow it anyway on master.

In [1]: import pandas as pd

In [2]: pd.Categorical([(1, 2), None]).fillna((1, 2))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-73c5b4ab00c2> in <module>()
----> 1 pd.Categorical([(1, 2), None]).fillna((1, 2))

~/sandbox/pandas-ip/pandas/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    136                 else:
    137                     kwargs[new_arg_name] = new_arg_value
--> 138             return func(*args, **kwargs)
    139         return wrapper
    140     return _deprecate_kwarg

~/sandbox/pandas-ip/pandas/pandas/core/arrays/categorical.py in fillna(self, value, method, limit)
   1664                 raise TypeError('"value" parameter must be a scalar, dict '
   1665                                 'or Series, but you passed a '
-> 1666                                 '"{0}"'.format(type(value).__name__))
   1667
   1668         return self._constructor(values, categories=self.categories,

TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "tuple"

I've left the validate_scalar_dict_value boolean there, as prep for #19705, but can remove from here if you want.

@@ -590,6 +590,8 @@ Other API Changes
object frequency is ``None`` (:issue:`19147`)
- Set operations (union, difference...) on :class:`IntervalIndex` with incompatible index types will now raise a ``TypeError`` rather than a ``ValueError`` (:issue:`19329`)
- :class:`DateOffset` objects render more simply, e.g. "<DateOffset: days=1>" instead of "<DateOffset: kwds={'days': 1}>" (:issue:`19403`)
- :func:`pandas.merge` provides a more informative error message when trying to merge on timezone-aware and timezone-naive columns (:issue:`15800`)
Contributor

There is a datetimelike section for other API changes now.

jreback added this to the 0.23.0 milestone Feb 22, 2018
jreback merged commit 3b135c3 into pandas-dev:master Feb 22, 2018
@jreback
Contributor

jreback commented Feb 22, 2018

thanks @TomAugspurger

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
4 participants