BUG: try_cast_to_pandas doesn't preserve dtypes for empty frame #4934

Closed
anmyachev opened this issue Sep 6, 2022 · 4 comments
Labels
bug 🦗 Something isn't working P2 Minor bugs or low-priority feature requests

Comments

@anmyachev
Collaborator

Reproducer:

import numpy as np
import pandas
import modin.pandas as pd
import modin.utils
from modin.pandas.test.utils import df_equals  # Modin test helper; module path as of 2022

def test_try_cast_to_pandas_preserve_dtypes():
    modin_df = pd.DataFrame({"col": [1]}, dtype=np.int64)
    pandas_df = pandas.DataFrame({"col": [1]}, dtype=np.int64)
    modin_df.query("col > 2", inplace=True)
    pandas_df.query("col > 2", inplace=True)
    casted_to_pandas_df = modin.utils.try_cast_to_pandas(modin_df)
    df_equals(casted_to_pandas_df, pandas_df)
    df_equals(casted_to_pandas_df.dtypes, pandas_df.dtypes)

Error:

E   AssertionError: Series are different
E   
E   Series values are different (100.0 %)
E   [index]: [col]
E   [left]:  [object]
E   [right]: [int64]
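
For reference, the `[right]` side of the assertion reflects how plain pandas behaves: filtering out every row leaves an empty frame whose column still has its original dtype. A minimal pandas-only sketch of that expectation (no Modin involved; the variable names are only illustrative):

```python
import numpy as np
import pandas

# Start with a single int64 row, then filter it out so the frame is empty.
pandas_df = pandas.DataFrame({"col": [1]}, dtype=np.int64)
empty = pandas_df.query("col > 2")

print(len(empty))            # 0
print(empty.dtypes["col"])   # int64
```

The bug report is that the frame returned by `try_cast_to_pandas` reports `object` for the same column instead.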
@anmyachev anmyachev added the bug 🦗 Something isn't working label Sep 6, 2022
@anmyachev anmyachev self-assigned this Sep 6, 2022
anmyachev added a commit to anmyachev/modin that referenced this issue Sep 6, 2022
@mvashishtha
Collaborator

@anmyachev @Billy2551 is working on #4605. If you don't need this fixed urgently, you could skip this fix and wait for the comprehensive fix for #4605, which should hopefully land within 3 months.

@anmyachev
Collaborator Author

> @anmyachev @Billy2551 is working on #4605. If you don't need this fixed urgently, you could skip this fix and wait for the comprehensive fix for #4605, which should hopefully land within 3 months.

Good! My other PR exposes this problem in CI, so I'll just mark the failing test as xfail there.
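
Marking the test as an expected failure could look roughly like the sketch below. It uses pytest's standard `xfail` marker; the `reason` text is an assumption, and the dtype check uses `pandas.testing.assert_series_equal` in place of Modin's `df_equals` helper to keep the sketch self-contained.

```python
import pytest


@pytest.mark.xfail(reason="try_cast_to_pandas loses dtypes of empty frames, see #4934")
def test_try_cast_to_pandas_preserve_dtypes():
    # Imports live inside the test so the module can be collected on its own.
    import numpy as np
    import pandas
    import modin.pandas as pd
    from modin.utils import try_cast_to_pandas

    modin_df = pd.DataFrame({"col": [1]}, dtype=np.int64)
    pandas_df = pandas.DataFrame({"col": [1]}, dtype=np.int64)
    modin_df.query("col > 2", inplace=True)
    pandas_df.query("col > 2", inplace=True)

    casted = try_cast_to_pandas(modin_df)
    # Expected to fail while the bug is present: dtypes differ (object vs int64).
    pandas.testing.assert_series_equal(casted.dtypes, pandas_df.dtypes)
```

With the default non-strict behavior, pytest reports the test as XPASS rather than failing the run once the underlying bug is fixed, so the marker can be removed later without breaking CI in the meantime.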

@anmyachev
Collaborator Author

From what I've seen so far, the from_pandas function doesn't preserve the dtypes of an empty dataframe either (because the empty dataframe never reaches the put function).
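
One way to picture this failure mode in plain pandas: if an empty frame is rebuilt from its column labels alone, without its data passing through, the columns default to `object`. The sketch below is only an illustration of that effect and of restoring dtypes from saved metadata; it is not Modin's actual code path, and `rebuild_empty` is a hypothetical helper.

```python
import numpy as np
import pandas


def rebuild_empty(columns, dtypes=None):
    """Hypothetical helper: build an empty frame from metadata only."""
    frame = pandas.DataFrame(columns=columns)  # no data, so dtypes default to object
    if dtypes is not None:
        # Reapply the saved per-column dtypes to the empty frame.
        frame = frame.astype(dtypes.to_dict())
    return frame


source = pandas.DataFrame({"col": [1]}, dtype=np.int64).query("col > 2")

lost = rebuild_empty(source.columns)                  # dtype information dropped
kept = rebuild_empty(source.columns, source.dtypes)   # dtype information restored

print(lost.dtypes["col"])  # object
print(kept.dtypes["col"])  # int64
```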

@anmyachev anmyachev removed their assignment Sep 7, 2022
@mvashishtha mvashishtha added the P2 Minor bugs or low-priority feature requests label Sep 7, 2022
@anmyachev
Collaborator Author

The problem is not reproducible on master.
