-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: groupby
then resample
on column gives incorrect results if the index is out of order
#59408
Open
aram-cinnamon
wants to merge
14
commits into
pandas-dev:main
Choose a base branch
from
aram-cinnamon:groupby-then-resample-if-index-out-of-order
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 10 commits
Commits
Show all changes
14 commits
Select commit
Hold shift + click to select a range
c21c5c9
Index should be ignored when using the on keyword argument to resample
aram-cinnamon 626b722
add test for when index is set from column
aram-cinnamon 70d6ee8
minor: change test function name
aram-cinnamon 5d77941
add test case
aram-cinnamon c92ec0b
move reset_index
aram-cinnamon 1ef4d3b
minor: add GH issue number to tests
aram-cinnamon 55fe96b
add test
aram-cinnamon 13f7414
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cinnamon 10836a2
push reset_index to _set_grouper
aram-cinnamon 62bdbdf
simplify
aram-cinnamon dd566f1
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cinnamon 7a0ad97
remove tests per comments
aram-cinnamon 612e275
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cinnamon bd91de4
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cinnamon File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason we need to do a
.take
in theRangeIndex
case, but not otherwise?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, returning to this. I put in that condition to make this test pass:
def test_groupby_resample_on_api_with_getitem(self):
which was apparently added as part of #17813I have not had a chance to look too deeply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding
index=list("xyzwt")
to the DataFrame in that test makes the op fail.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the issue happens on
pandas.core.resample:1559
. We pass only the groupx
but the entire axisself.ax
. I wonder if splitting the axis as we dodata
on L967 there and passing it tof
would resolve.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rhshadrach, coming back to this, sorry for the hiatus. Can you elaborate on the above please? (I think maybe the word "not" is missing somewhere; I'm not entirely sure what you mean.) What file are you referring to re: L967? Can you give me a link to a line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did in my previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in your previous comment you provided a link for this ^
I ask again because the link you provided in your previous comment does not contain
data
. Are you sure you didn't mean a different line here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it need to contain data? We pass the
func
from the highlighted line to groupby, which has a DataSplitter. This splitsdata
. My comment is that I wonder if we should be splittingaxis
as we do thisdata
on L967.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rhshadrach for your patience. I am new to making changes to groupby.
data
to be split downstream? Can you please elaborate on how the data is split? It would help greatly.axis
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure anymore and I doubt it would be helpful.
groupby splits data into groups and operates on each group.
x
in the code above is one of these groups, not the entire input data, whileself.ax
is the entire input's axis.I would suggest determining first if this is the right direction to go before spending time on figuring out how to do it. As I've stated, I'm not sure this is the root cause of the issue (or even an issue at all!).