-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reformatting with yapf #74
base: cloned_master_f7fda
Are you sure you want to change the base?
Reformatting with yapf #74
Conversation
Clone of the PR modin-project/modin#58 |
My review is in progress 📖 - I will have feedback for you in a few minutes! |
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed your code and found 5 potential issues.
return pandas.to_datetime( | ||
arg, | ||
errors=errors, | ||
dayfirst=dayfirst, | ||
yearfirst=yearfirst, | ||
utc=utc, | ||
box=box, | ||
format=format, | ||
exact=exact, | ||
unit=unit, | ||
infer_datetime_format=infer_datetime_format, | ||
origin=origin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For scalar inputs when 'box=True', the function should return a Timestamp object instead of a DatetimeIndex. Consider modifying the return statement for scalar inputs to match pandas behavior: return pandas.Timestamp(arg) if box else pandas.to_datetime(arg, box=False)
.
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.
@pytest.fixture | ||
def test_df_concat(): | ||
df, df2 = generate_dfs() | ||
|
||
assert(ray_df_equals_pandas(pd.concat([df, df2]), | ||
pandas.concat([df, df2]))) | ||
assert (ray_df_equals_pandas( | ||
pd.concat([df, df2]), pandas.concat([df, df2]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_df_concat
function is currently defined as a fixture but appears to be intended as a test function. This can cause confusion and potentially lead to the test not being executed as expected. Consider removing the @pytest.fixture
decorator from this function if it's meant to be a test, or rename it and adjust its usage if it's intended to be a fixture.
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.
# Put all of the DataFrames into Ray format | ||
# TODO just partition the DataFrames instead of building a new Ray DF. | ||
objs = [DataFrame(obj) if isinstance(obj, (pandas.DataFrame, | ||
pandas.Series)) else obj | ||
for obj in objs] | ||
objs = [ | ||
DataFrame(obj) | ||
if isinstance(obj, (pandas.DataFrame, pandas.Series)) else obj | ||
for obj in objs | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To improve memory efficiency, avoid materializing full DataFrames from the input objects before partitioning them into a new Ray DataFrame. Instead, try to partition the input DataFrames directly where possible without creating intermediate full DataFrame copies. This will help reduce memory overhead, especially when dealing with large DataFrames.
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.
df = pandas.get_dummies( | ||
df, | ||
prefix=prefix, | ||
prefix_sep=prefix_sep, | ||
dummy_na=dummy_na, | ||
columns=None, | ||
sparse=sparse, | ||
drop_first=drop_first) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling around the calls to pandas.get_dummies()
in the get_dummies()
and get_dummies_remote()
functions. If an exception occurs within pandas.get_dummies()
, it will propagate up and could cause the program to crash. Wrap the calls in a try-except block to catch and handle any potential exceptions gracefully.
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.
exact=exact, | ||
unit=unit, | ||
infer_datetime_format=infer_datetime_format, | ||
origin=origin) | ||
if errors == 'raise': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation only handles 'raise' for the 'errors' parameter, but pandas.to_datetime() also supports 'coerce'. Consider updating the condition to handle 'coerce' as well, ensuring full compatibility with pandas functionality. This can be done by changing if errors == 'raise':
to if errors in ['raise', 'coerce']:
.
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.
/review |
PR Reviewer Guide 🔍
|
@coderabbitai review |
What do these changes do?
Related issue number
Resolves #54
git diff upstream/master -u -- "*.py" | flake8 --diff
Description by Korbit AI
What change is being made?
Integrate
yapf
for code formatting and reformat existing codebase usingyapf
.Why are these changes being made?
To ensure consistent code style and improve readability across the codebase by using an automated code formatter. This change also adds
yapf
to the CI pipeline to enforce formatting rules.