-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: Implement BaseMaskedArray class for integer/boolean ExtensionArrays #30789
REF: Implement BaseMaskedArray class for integer/boolean ExtensionArrays #30789
Conversation
Thanks for starting this. Just out of curiosity did you ever consider implementing this as a Mixin? I suppose the advantage of that versus something like this would be it theoretically allows EA authors to compose masked extension types without having to worry about class hierarchies |
For now, this is not supposed to be public (actually, I just didn't consider that yet, that can certainly be a discussion). For internal use, it doesn't really matter much? What would be the advantage of a mixin in that case? |
For me (being less exposed to the EA stuff) a mixin just seemed a more natural way to add a mask. Let's see what others think |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far. Do you think you'll be able to share any of array_ufunc, setitem, and ops?
For now, I think just straight inheritance from EA is best. With a mixin, we'd need to include a bunch of the EA interface anyway to get this to work (dtype, isna, etc.) |
def take(self, indexer, allow_fill=False, fill_value=None): | ||
# we always fill with 1 internally | ||
# to avoid upcasting | ||
data_fill_value = self._internal_fill_value if isna(fill_value) else fill_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could do _validate_fill_value like DTA/TDA/PA to avoid allowing e.g. NaT here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice @jorisvandenbossche can you comment on what is not just a straight cut/paste
Everything is still a straight cut and paste, except for:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. small comments
import pandas.core.common as com | ||
from pandas.core.indexers import check_bool_array_indexer | ||
|
||
from .masked import BaseMaskedArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use absolute imports
from pandas.core.ops import invalid_comparison | ||
from pandas.core.ops.common import unpack_zerodim_and_defer | ||
from pandas.core.tools.numeric import to_numeric | ||
|
||
from .masked import BaseMaskedArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
return cls(data, mask) | ||
|
||
def take(self, indexer, allow_fill=False, fill_value=None): | ||
# we always fill with 1 internally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would remove / modify these comments
@jorisvandenbossche were you planning to move any more methods in this PR, or should we merge this and do more as followups? Something like |
this ready? |
thanks @jorisvandenbossche |
Todo item of #29556, consolidating common code for IntegerArray and BooleanArray.
This is only a start, there is more to share.