DEPR: deprecate fastpath keyword in Index constructors #23110
Hello @jorisvandenbossche! Thanks for submitting the PR.
Codecov Report
@@ Coverage Diff @@
## master #23110 +/- ##
==========================================
+ Coverage 92.2% 92.21% +0.01%
==========================================
Files 169 169
Lines 50888 50895 +7
==========================================
+ Hits 46919 46935 +16
+ Misses 3969 3960 -9
@@ -306,7 +306,7 @@ def __contains__(self, key):

     @cache_readonly
     def _int64index(self):
-        return Int64Index(self.asi8, name=self.name, fastpath=True)
+        return Int64Index._simple_new(self.asi8, name=self.name)
What is the performance implication of using _simple_new vs just Int64Index(...)? If it's sufficiently small, we should be using the public constructors.
Relatively it is a big difference:
In [8]: values = np.arange(100000)
In [9]: %timeit pd.Index(values)
31 µs ± 140 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [10]: %timeit pd.Index._simple_new(values)
1.62 µs ± 26.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Whether that will be significant in actual code, I don't know.
But since we are using _simple_new in many places, I will leave changing those calls to the public constructor for another PR.
@jbrockmendel this leads a bit further, but given that the public constructors (__init__) of Index classes are very complex (something we will not be able to change, I think), I think that we need to see _simple_new as a kind of "internal public" method within the pandas code base.
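The trade-off discussed in this thread can be illustrated with a toy class. This is a minimal sketch, not pandas code: `ToyIndex` and its validation logic are hypothetical, and only the names `_simple_new` and `name` mirror the discussion. The public constructor copies and checks its input, while `_simple_new` attaches already-validated data directly.

```python
class ToyIndex:
    """Hypothetical stand-in for an Index class (not the pandas implementation)."""

    def __init__(self, data, name=None):
        # Public constructor: copy and validate the input, which costs time.
        values = list(data)
        if not all(isinstance(v, int) for v in values):
            raise TypeError("ToyIndex only accepts integers")
        self._data = values
        self.name = name

    @classmethod
    def _simple_new(cls, values, name=None):
        # "Internal public" fast path: the caller guarantees that `values`
        # is already a validated list, so no copy and no checks are done.
        obj = object.__new__(cls)
        obj._data = values
        obj.name = name
        return obj


values = [1, 2, 3]
slow = ToyIndex(values, name="a")               # validates and copies
fast = ToyIndex._simple_new(values, name="a")   # reuses `values` as-is
```

Both objects hold the same data, but only the fast path skips the per-element work, which is presumably where a gap like the one in the timings above comes from.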
@@ -174,7 +179,7 @@ def _data(self):

     @cache_readonly
     def _int64index(self):
-        return Int64Index(self._data, name=self.name, fastpath=True)
+        return Int64Index._simple_new(self._data, name=self.name)
shallow_copy? ditto below
Not in this case, since it is not self that is calling it, so you need to pass self.name anyway.
One case below is shallow_copy, but the other can indeed be changed.
And actually, _shallow_copy does not have the ability to specify (override) the name (which does not really follow how _shallow_copy works for other Index classes), so it cannot be used there either.
_shallow_copy might be a bit buggy (it is also not used anywhere in the RangeIndex implementation, although maybe in inherited methods ...).
But I want to keep this PR just about deprecating the fastpath keyword, so let's leave that for another issue.
Why are you changing these constructors at all?
I am not changing anything. fastpath=True means exactly the _simple_new call that I replaced it with.
This PR does not try to clean up usage of _simple_new by replacing it with the default constructor; it only deprecates the fastpath argument.
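A keyword deprecation of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the code in this PR: `ToyIndex` is hypothetical, and the warning text and category are guesses. The point is that the keyword keeps its old behavior (dispatching to `_simple_new`) while emitting a warning.

```python
import warnings


class ToyIndex:
    """Hypothetical stand-in for an Index class (not the pandas implementation)."""

    def __new__(cls, data=None, name=None, fastpath=None):
        if fastpath is not None:
            # Assumed warning text and category; the real ones may differ.
            warnings.warn(
                "The 'fastpath' keyword is deprecated and will be removed "
                "in a future version.",
                FutureWarning,
                stacklevel=2,
            )
            if fastpath:
                # Old behavior preserved: skip validation entirely.
                return cls._simple_new(data, name=name)
        # Regular path: copy and validate the input.
        values = list(data)
        if not all(isinstance(v, int) for v in values):
            raise TypeError("ToyIndex only accepts integers")
        return cls._simple_new(values, name=name)

    @classmethod
    def _simple_new(cls, values, name=None):
        # Attach the given values without copying or checking them.
        obj = object.__new__(cls)
        obj._data = values
        obj.name = name
        return obj


data = [1, 2, 3]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = ToyIndex(data, name="a", fastpath=True)  # warns, same result
new_style = ToyIndex._simple_new(data, name="a")         # no warning
```

Internal callers switch to `_simple_new` so they never trigger the warning, which matches the replacement made in the diff hunks above.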
ok
General +1, couple of comments. Is there a tracker issue for deprecations-to-eventually-remove?
yes
Added it to the list in #6581
@jbrockmendel Thanks for the review!
I would prefer simply to use the actual constructor rather than _simple_new if we can.
@jorisvandenbossche can you update
What do I need to update?
comment above
Answer above: #23110 (comment)
thanks @jorisvandenbossche
thanks!
This came out of all that looking at Index constructors ...
Apparently the fastpath keyword was hardly even used internally for the Index classes (except a little bit in RangeIndex).
Part of #20110