[BUG] csv writer returning full sub-second data in case of duration types #6660
Comments
@galipremsagar does Pandas always trim the subsecond data, or only when the element has a value of 0 for the subsecond data?
@kkraus14 Pandas trims subsecond data only when its subsecond data is 0.
I don't know if this is something we can match or even should match then. Is Pandas able to read how we write the data successfully?
This looks possible to do. See cudf/cpp/src/io/csv/durations.cu, line 225 in e0e2cf8.
I don't think this is something we should prioritize unless someone comes along and provides sufficient reasoning for why it's an issue for them.
This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.
This issue has been labeled
Describe the bug
When a column has timedelta64 dtype and a particular value has no sub-second data, pandas omits the sub-second part from that value's CSV string output. cudf, however, currently writes every value in the column with sub-second data.
Steps/Code to reproduce bug
Expected behavior
I think we should match the pandas behavior here, because when we read the CSV file back, cudf or pandas infers the column as a string column, which ultimately results in storing extra values where they are not necessary.
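To make the expected per-value trimming concrete, here is a minimal sketch in plain Python. It is not the cudf writer or the pandas formatter: the helper name `format_duration_ns`, the "D days HH:MM:SS" layout, and the fixed 9-digit fractional part are assumptions chosen for illustration, and negative durations are left out of scope.

```python
NS_PER_SECOND = 1_000_000_000


def format_duration_ns(total_ns: int) -> str:
    """Format a non-negative duration in nanoseconds (hypothetical helper).

    The sub-second part is appended only when it is non-zero, which is the
    per-value behavior this issue asks for.
    """
    if total_ns < 0:
        raise ValueError("negative durations are out of scope for this sketch")

    # Split into whole seconds and the leftover nanoseconds.
    total_seconds, subsecond_ns = divmod(total_ns, NS_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)

    text = f"{days} days {hours:02d}:{minutes:02d}:{seconds:02d}"
    if subsecond_ns:
        # Assumption: always emit 9 fractional digits when any are present.
        text += f".{subsecond_ns:09d}"
    return text


print(format_duration_ns(1_000_000_000))  # 0 days 00:00:01
print(format_duration_ns(1_500_000_000))  # 0 days 00:00:01.500000000
```

With this rule, two values in the same column can be written with different widths, which is the trade-off discussed in the comments.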
Environment overview (please complete the following information)
Environment details
Please run and paste the output of the cudf/print_env.sh script here, to gather any other relevant environment details
Additional context
Surfaced while running fuzz tests: #6001