-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove _source_index from MultiIndex #9191
Remove _source_index from MultiIndex #9191
Conversation
rerun tests |
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #9191 +/- ##
===============================================
Coverage ? 10.84%
===============================================
Files ? 115
Lines ? 18761
Branches ? 0
===============================================
Hits ? 2035
Misses ? 16726
Partials ? 0 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job refactoring the dreaded MultiIndex, I know all about it. :D Looks pretty much good, I'm going to double check a few tests I thought of that might have issues, then approve assuming nothing comes up.
rerun tests |
@gpucibot merge |
rerun tests |
This PR is a follow-up to #9191 to further simplify the MultiIndex class. It removes various unused functions, inlines and simplifies a number of other single-use internal functions, and moves more methods to Frame that can be shared. It also makes numerous other miscellaneous improvements to the code in MultiIndex to simplify further rewrites. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Marlene (https://github.com/marlenezw) - H. Thomson Comer (https://github.com/thomcom) - GALI PREM SAGAR (https://github.com/galipremsagar) - Ashwin Srinath (https://github.com/shwina) URL: #9243
This PR is the first in a sequence refactoring the
MultiIndex
class, which is currently in a halfway state of making use of some facilities common to other Frames without using them consistently. This PR removes the internal_source_data
parameter, which creates a thinDataFrame
wrapper around aMultiIndex
to provide access to those methods. SinceMultiIndex
is now clearly defined as aFrame
, this PR moves us towards a much simpler and more uniform design where we use inheritedFrame
methods directly. This approach is significantly easier to maintain than treating aMultiIndex
as being "backed" by aDataFrame
. This approach also comes with various performance benefits. While relatively cheap, the internal conversion of aMultiIndex
to aDataFrame
is not free, and in cases where we need to return a modifiedMultiIndex
the conversion back from aDataFrame
is significantly more complex. Moreover, in a number of cases we were using less efficient methods than those we could use directly on theMultiIndex
.