[ci] [python-package] tell mypy 'auto' has a special meaning for feature_name and categorical_feature #5874
python-package/lightgbm/basic.py
else:  # use cat cols specified by user
    categorical_feature = list(categorical_feature)
Why is this removed? I think this is meant to create a copy and avoid errors derived from this value being changed outside of lightgbm.
> I think this is meant to create a copy and avoid errors derived from this value being changed outside of lightgbm
Looking back through the blame, this line was introduced in #2121. I don't see any PR review discussion about it there or in the comment linked to from there... I assumed this list() was just thrown in to be extra-sure that from this point forward, the value was a list.
If we want protection against things outside of LightGBM having side effects on this value or vice versa (via the fact that a list is mutable and passed by reference), I think it'd be clearer to handle that with copy.deepcopy() explicitly, somewhere further upstream in the code.
Would you like me to try that?
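For readers following along, here is a minimal sketch of the side effect being discussed. The helper names (keep_reference, keep_deep_copy) and the column names are hypothetical and are not part of LightGBM; the sketch only shows how a stored reference differs from an explicit copy.

```python
import copy


def keep_reference(categorical_feature):
    # Hypothetical helper: returns the caller's list object unchanged,
    # so both sides share one mutable object.
    return categorical_feature


def keep_deep_copy(categorical_feature):
    # Hypothetical helper: returns an independent copy of the input.
    return copy.deepcopy(categorical_feature)


user_cols = ["city", "device"]
aliased = keep_reference(user_cols)
copied = keep_deep_copy(user_cols)

# The caller mutates their list after handing it over.
user_cols.append("browser")

print(aliased)  # reflects the caller's later change
print(copied)   # unaffected by the caller's later change
```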
I think using list(collection) is a common way to ensure that:
- From this point forward the collection is a list.
- The variable now holds a copy of the original collection.
So if the input was already a list, calling list() on it only creates a copy, and if it was some other type of iterable it achieves both.
Since the input is expected to be a list of strings, deepcopy might be overkill and won't achieve 2.
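A short sketch of the behavior described above, using made-up column names. It illustrates general Python semantics only and does not use any LightGBM code.

```python
import copy

original = ["age", "city"]

# Input is already a list: list() returns a new list with the same items.
copied = list(original)

# Input is some other iterable: list() converts it and copies it in one step.
from_tuple = list(("age", "city"))

# Later changes to the original do not show up in the copy.
original.append("device")

# copy.deepcopy() keeps the type of its input, so a tuple stays a tuple.
kept_tuple = copy.deepcopy(("age", "city"))

print(copied, from_tuple, type(kept_tuple).__name__)
```

For a flat list of strings or integers, the shallow copy made by list() is already independent of the caller's list, since the elements themselves are immutable.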
Ok I'll put this back for now. I really think it's both unclear that the point is to create a copy and unnecessary for the purpose "ensure that it's a list", given that by this point in the code we've already checked that categorical_feature is not None:
LightGBM/python-package/lightgbm/basic.py
Line 685 in 3d63dda
if categorical_feature is not None:
and not the string "auto"
LightGBM/python-package/lightgbm/basic.py
Lines 688 to 689 in 3d63dda
if categorical_feature == 'auto':  # use cat cols from DataFrame
    categorical_feature = cat_cols_not_ordered
.. so the only other things that attribute permits are List[str] or List[int].
added this back in 51ab03a and ceef1f6
It's necessary to add a # type: ignore comment if we're going to keep that code branch, to avoid this from mypy.
python-package/lightgbm/basic.py:694: error: Incompatible types in assignment (expression has type "List[object]", variable has type "Union[List[str], List[int], Literal['auto'], None]") [assignment]
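As a rough illustration of the branch in question, the sketch below rebuilds the same control flow in a standalone function. The function name resolve_categorical_feature and the alias CategoricalFeature are assumptions made for this example and are not LightGBM's actual code; the ignore comment mirrors the [assignment] error code reported above, and whether a given mypy version raises that error on this exact sketch is not verified here.

```python
from typing import List, Literal, Union

# Hypothetical alias matching the union named in the mypy error message.
CategoricalFeature = Union[List[str], List[int], Literal["auto"], None]


def resolve_categorical_feature(
    categorical_feature: CategoricalFeature,
    cat_cols_not_ordered: List[str],
) -> CategoricalFeature:
    """Hypothetical stand-in for the branch discussed in this thread."""
    if categorical_feature is not None:
        if categorical_feature == "auto":  # use cat cols from DataFrame
            categorical_feature = cat_cols_not_ordered
        else:  # use cat cols specified by user
            # list() copies the caller's list. The thread reports that mypy
            # infers List[object] for this expression, hence the ignore.
            categorical_feature = list(categorical_feature)  # type: ignore[assignment]
    return categorical_feature


user_value = ["city", "device"]
resolved = resolve_categorical_feature(user_value, ["city"])
print(resolved, resolved is user_value)
```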
This pull request has been automatically locked since there has not been any recent activity since it was closed.
Contributes to #3756.
Contributes to #3867.
Fixes the following error from mypy.
For arguments feature_name and categorical_feature, the Python package supports passing 3 possible non-None values:
- "auto" (interpreted as "figure it out automatically" for pandas inputs, and as None otherwise)
Using the typing.Literal annotation, added in Python 3.8, mypy is able to understand that if these arguments are not None or a list, they must be literally "auto", not any arbitrary string (as the current type hint of str implies).
This helps it to understand that, for example...
Notes for Reviewers
For more details on the Literal type, see https://mypy.readthedocs.io/en/stable/literal_types.html#literal-types.
For an explanation of how string-literal type hints can be used to prevent runtime errors, see https://mypy.readthedocs.io/en/stable/runtime_troubles.html.
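To make the difference between a str hint and a Literal hint concrete, here is a minimal sketch. The names AutoLiteral, FeatureNameArg, and describe_feature_name are hypothetical and only illustrate the annotation style described in this PR; they are not taken from the LightGBM code base.

```python
from typing import List, Literal, Union, get_args

# Hypothetical aliases: "auto" is the only string value the hint allows.
AutoLiteral = Literal["auto"]
FeatureNameArg = Union[List[str], AutoLiteral, None]


def describe_feature_name(feature_name: FeatureNameArg) -> str:
    if feature_name is None:
        return "no names given"
    if isinstance(feature_name, list):
        return f"{len(feature_name)} explicit names"
    # With the Literal hint, the only remaining possibility is "auto".
    return "infer names from the data"


# The allowed literal values can also be inspected at runtime.
allowed_strings = get_args(AutoLiteral)

print(describe_feature_name("auto"))
print(describe_feature_name(["f0", "f1"]))
# describe_feature_name("automatic") would run in plain Python, but a type
# checker is expected to reject it because "automatic" is not Literal["auto"].
```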