Commit
Original change: #3188. Why were we casting to "float64" in the old test case? Possibly related to this comment: #3188 (comment)
Showing 2 changed files with 15 additions and 14 deletions.
tl;dr: this change is correct and a good one.
This cast might have been needed in the past because `pd.concat([pdg1, pdg2])` turns the integer columns into float ones and fills the missing values with `NaN`s. Conversely, cudf uses nullable dtypes in (basically all) cases and produces integer columns with nulls.
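The pandas half of that behaviour can be reproduced without a GPU. This is a minimal sketch: the contents of `pdg1` and `pdg2` are hypothetical stand-ins, not the frames from the actual test.

```python
import pandas as pd

# Hypothetical stand-ins for the test's pdg1 and pdg2: two frames with
# integer columns that only partly overlap.
pdg1 = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
pdg2 = pd.DataFrame({"x": [5, 6], "z": [7, 8]})

result = pd.concat([pdg1, pdg2])

# "x" exists in both inputs and stays integer; "y" and "z" each have
# missing rows, so pandas upcasts them to float64 and fills with NaN.
print(result.dtypes)
```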
`assert_eq` calls `.to_pandas()` on the cudf dataframe and then calls the pandas testing `assert_frame_equal` function. If `to_pandas()` were to produce a nullable `Int64` dtype for columns, then the dtype check would fail. By casting to float64 first, this is avoided.

As it happens, the cast is unnecessary, because `to_pandas()` by default produces non-nullable columns in the pandas dataframe, so converting the cudf dataframe with `to_pandas()` will produce float64 columns anyway.
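The dtype check described above can be illustrated with pandas alone. In this sketch the `nullable` frame is built by hand to imitate what a nullable conversion would look like; it is an assumption for illustration, not output from cudf.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Expected frame as pandas concat would produce it: float64 with NaN.
expected = pd.DataFrame({"y": [3.0, 4.0, float("nan"), float("nan")]})

# Hypothetical result holding the same values in a nullable Int64 column.
nullable = pd.DataFrame({"y": pd.array([3, 4, None, None], dtype="Int64")})

# With the default check_dtype=True, Int64 vs float64 is a mismatch.
try:
    assert_frame_equal(nullable, expected)
    dtype_check_failed = False
except AssertionError:
    dtype_check_failed = True

# Casting to float64 first turns the nulls into NaN, and the frames match.
assert_frame_equal(nullable.astype("float64"), expected)
```

When the converted frame is already float64, as with the default `to_pandas()` behaviour described above, the comparison passes without the extra cast.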