FEAT-#4725: Make index and columns lazy in Modin DataFrame #4726
89056d3 to 9e0f50b
Codecov Report
Coverage diff, master vs. #4726:
            master    #4726     +/-
Coverage    85.26%    89.80%    +4.54%
Files       259       260       +1
Lines       19215     19521     +306
Hits        16383     17531     +1148
Misses      2832      1990      -842
Signed-off-by: Vasily Litvinov <[email protected]>
9e0f50b to 2c702b0
Related discussion on handling metadata (index and columns) in #3673.
I have some minor style comments. Thanks, @vnlitvinov!
That issue is talking about improving pivot speed if we can omit computing index and labels... well, after this PR we will be able to! 😄
Co-authored-by: Mahesh Vashishtha <[email protected]>
Signed-off-by: Vasily Litvinov <[email protected]>
a0fd1e2 to fda1cdf
There are some thoughts on handling metadata for |
Co-authored-by: Yaroslav Igoshev <[email protected]>
Signed-off-by: Vasily Litvinov <[email protected]>
ad65ad1 to 8538076
@vnlitvinov, is there a case for now when we construct |
Not yet, though I have another PR in my queue waiting for this one to be merged to improve Also I'm guessing that some other queries like |
Got it, thanks! Let's resolve the rest of the comments and get this PR merged.
Co-authored-by: Yaroslav Igoshev <[email protected]>
Signed-off-by: Vasily Litvinov <[email protected]>
9efba8d to a91992a
@YarShev I think I've addressed everything now. @modin-project/modin-core is there anything missing? Can we get this merged, so I can submit a |
Some CI jobs have also failed; please take a look.
LGTM!
@prutskov another use case for this "lazy index" thing is it could help with ingest like |
@vnlitvinov I was thinking the exact same thing 😄
Merging the changes, as the CI failures are unrelated to them. See more in #4745.
…me (modin-project#4726)
Co-authored-by: Mahesh Vashishtha <[email protected]>
Co-authored-by: Yaroslav Igoshev <[email protected]>
Signed-off-by: Vasily Litvinov <[email protected]>
Signed-off-by: Vasily Litvinov [email protected]
What do these changes do?
Allow omitting `index` and `columns` when constructing `PandasDataframe`; when omitted, they are computed on demand the first time `.index` or `.columns` is accessed.

- `flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py`
- `black --check modin/ asv_bench/benchmarks scripts/doc_checker.py`
- `git commit -s`
- `docs/development/architecture.rst` is up-to-date
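
To illustrate the idea in the summary above, here is a minimal sketch of lazily computed axis labels. It is not Modin's implementation: the class `LazyLabelsFrame`, its `compute_calls` counter, and the dict-based partition layout are all hypothetical and exist only to show the pattern of deferring label computation until `.index` or `.columns` is first accessed.

```python
class LazyLabelsFrame:
    """Hypothetical stand-in for a partitioned frame (not Modin's class)."""

    def __init__(self, partitions, index=None, columns=None):
        # partitions: 2D grid (list of block rows); each block is a dict
        # holding its own "index" and "columns" label lists (assumed layout).
        self._partitions = partitions
        # Labels may be passed explicitly; None means "compute when needed".
        self._index_cache = index
        self._columns_cache = columns
        # Counts how many times labels were actually computed from blocks.
        self.compute_calls = 0

    def _compute_axis_labels(self, axis):
        """Gather labels along one axis by visiting the relevant blocks."""
        self.compute_calls += 1
        if axis == 0:
            # Row labels: first block of every block row.
            blocks = [row[0] for row in self._partitions]
            key = "index"
        else:
            # Column labels: every block of the first block row.
            blocks = self._partitions[0]
            key = "columns"
        return [label for block in blocks for label in block[key]]

    @property
    def index(self):
        if self._index_cache is None:
            self._index_cache = self._compute_axis_labels(0)
        return self._index_cache

    @property
    def columns(self):
        if self._columns_cache is None:
            self._columns_cache = self._compute_axis_labels(1)
        return self._columns_cache


grid = [
    [{"index": [0, 1], "columns": ["a"]}, {"index": [0, 1], "columns": ["b", "c"]}],
    [{"index": [2], "columns": ["a"]}, {"index": [2], "columns": ["b", "c"]}],
]

frame = LazyLabelsFrame(grid)      # no labels given, nothing computed yet
print(frame.compute_calls)         # 0
print(frame.index)                 # triggers one computation
print(frame.index)                 # served from the cache
print(frame.compute_calls)         # 1
```

In this sketch, constructing the frame costs nothing for metadata, and operations that never touch the labels never pay for them. Passing `index` or `columns` explicitly skips the computation entirely.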