Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT-#6498: Make Fold operator more flexible #7257

Merged
merged 8 commits into from
May 14, 2024

Conversation

YarShev
Copy link
Collaborator

@YarShev YarShev commented May 13, 2024

What do these changes do?

  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves Does the Fold operator allow a function to change the shape of partitions? #6498
  • tests passing
  • module layout described at docs/development/architecture.rst is up-to-date

@YarShev YarShev force-pushed the dev/yigoshev-issue6498 branch from 259c1e7 to c715231 Compare May 13, 2024 10:08
@YarShev YarShev force-pushed the dev/yigoshev-issue6498 branch from ab262de to e2e8fcb Compare May 14, 2024 13:54
Comment on lines 2734 to 2759
def filter_modin_dataframe(df):
return df.__constructor__(
query_compiler=df._query_compiler.filter_func(
fold_axis=0,
new_index=new_index,
new_columns=new_columns,
)
)

pd.DataFrame.filter_dataframe = filter_modin_dataframe

filtered_df = modin_df.filter_dataframe()

df_equals(filtered_df, expected_df)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that filter_modin_dataframe2 will not work. Could you double check?

Suggested change
def filter_modin_dataframe(df):
return df.__constructor__(
query_compiler=df._query_compiler.filter_func(
fold_axis=0,
new_index=new_index,
new_columns=new_columns,
)
)
pd.DataFrame.filter_dataframe = filter_modin_dataframe
filtered_df = modin_df.filter_dataframe()
df_equals(filtered_df, expected_df)
def filter_modin_dataframe1(df):
return df.__constructor__(
query_compiler=df._query_compiler.filter_func(
fold_axis=0,
new_index=new_index,
new_columns=new_columns,
)
)
pd.DataFrame.filter_dataframe1 = filter_modin_dataframe1
filtered_df = modin_df.filter_dataframe1()
df_equals(filtered_df, expected_df)
def filter_modin_dataframe2(df):
return df.__constructor__(
query_compiler=df._query_compiler.filter_func(fold_axis=0)
)
pd.DataFrame.filter_dataframe2 = filter_modin_dataframe2
filtered_df = modin_df.filter_dataframe2()
df_equals(filtered_df, expected_df)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will not work since I assumed that if new_index/columns are not passed, then fold function must preserve the shape. Do you think we should add a note about that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it is not always possible to know the index when the shape is saved. Let's add an additional flag that will say that the shape does not change. It will default to false so that users are less likely to get an error, on the other hand we can set this parameter to true for our functions that are already implemented through this operator. For example: shape_preserved: bool = False.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Done

anmyachev
anmyachev previously approved these changes May 14, 2024
@anmyachev
Copy link
Collaborator

@YarShev please resolve conflicts

YarShev and others added 8 commits May 14, 2024 20:07
Signed-off-by: Igoshev, Iaroslav <[email protected]>
Signed-off-by: Igoshev, Iaroslav <[email protected]>
Signed-off-by: Igoshev, Iaroslav <[email protected]>
Signed-off-by: Igoshev, Iaroslav <[email protected]>
Signed-off-by: Igoshev, Iaroslav <[email protected]>
@YarShev
Copy link
Collaborator Author

YarShev commented May 14, 2024

@YarShev please resolve conflicts

resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Does the Fold operator allow a function to change the shape of partitions?
2 participants