Defaulting to_csv to infer compression #22004
Comments
I agree conceptually. Probably need to handle cases where this would potentially conflict with the |
I am happy to open a PR. I think the solution will be as simple as changing the compression default to infer in: Lines 1714 to 1716 in 322dbf4
Looks like Line 11 in 322dbf4
Lines 29 to 32 in 322dbf4
I don't think the other |
This issue follows up on #17900 (thanks to @Dobatymo and @gfyoung, with review from @jreback). #17900 added an `'infer'` option to `compression` in `_get_handle`. The main user-facing benefit here is that `df.to_csv` will be able to infer compression just like `pandas.read_csv`. However, unlike `read_csv`, the default value for `compression` is `None` rather than `'infer'`.
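To make the `'infer'` behaviour concrete, here is a minimal standard-library sketch of extension-based inference. The helper name `infer_compression` and the extension table are hypothetical and for illustration only; they are not pandas' actual implementation, whose supported extensions may differ by version.

```python
import os

# Hypothetical extension table for illustration; not taken from pandas.
EXTENSION_TO_COMPRESSION = {
    ".gz": "gzip",
    ".bz2": "bz2",
    ".zip": "zip",
    ".xz": "xz",
}


def infer_compression(path, compression="infer"):
    """Return the compression method to use when writing to `path`."""
    if compression != "infer":
        # An explicit choice, including None, always wins over the extension.
        return compression
    _, ext = os.path.splitext(str(path).lower())
    # Unrecognized extensions fall back to no compression.
    return EXTENSION_TO_COMPRESSION.get(ext)
```

With a default of `None`, the extension is never consulted; with a default of `'infer'`, a path such as `data.csv.gz` would select gzip without any extra argument.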
Unfortunately, much of the convenience of `compression='infer'` is lost if you have to specify it explicitly. In summary, I think it would be a major convenience for a `to_csv` call on a path with a gzip extension to work and automatically perform gzip compression.

Compatibility assessment
Defaulting to infer would only affect users who are currently using paths with compression extensions but not actually compressing. That's pretty bad practice IMO. Hence, I'm in favor of breaking backwards compatibility and changing the default for compression to infer. It looks like this would go into the major release 0.24?
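The difference between the two settings can be sketched as follows. This assumes a pandas version in which `to_csv` accepts `compression='infer'` (the option that followed from #17900); the file names and the `is_gzip` helper are hypothetical. Both values are passed explicitly so the example does not depend on which default the installed version uses.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
tmpdir = tempfile.mkdtemp()

# With 'infer', the .gz extension selects gzip compression.
inferred_path = os.path.join(tmpdir, "inferred.csv.gz")
df.to_csv(inferred_path, index=False, compression="infer")

# With None, the same kind of path receives plain, uncompressed text.
# This is the case that a changed default would affect.
plain_path = os.path.join(tmpdir, "plain.csv.gz")
df.to_csv(plain_path, index=False, compression=None)


def is_gzip(path):
    """Check for the two-byte gzip magic number at the start of the file."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


inferred_is_gzip = is_gzip(inferred_path)
plain_is_gzip = is_gzip(plain_path)

# read_csv infers compression from the extension by default,
# so the compressed file round-trips without extra arguments.
round_trip = pd.read_csv(inferred_path)
```

Only users who currently write uncompressed data to a path like `plain.csv.gz` would see different output after the change.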