Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow pickle protocol to be overridden #4012

Closed
wants to merge 4 commits into from

Conversation

jakirkham
Copy link
Member

This provides the option to override the pickle protocol used. The default behavior is still to use the HIGHEST_PROTOCOL. However this provides users an easy way to customize the pickle protocol through the distributed.yaml config.

cc @mrocklin @quasiben

Provide users the option to override `protocol` in our `dumps` function.
If it is not specified, default to the `HIGHEST_PROTOCOL` just as
before.
Comment on lines +173 to +174
pickle:
protocol: null # specify the pickle protocol to use
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just placed this at the global level as it is not really worker or scheduler specific. Not picky on where it lives though.

@mrocklin
Copy link
Member

mrocklin commented Aug 3, 2020

Interesting. I'm curious, what was the motivation?

@jakirkham
Copy link
Member Author

Issue ( #4011 ) encouraged me to put it in a PR. Albeit that issue is discussing a different solution. Could also be useful for debugging/helping users.

@mrocklin
Copy link
Member

mrocklin commented Aug 4, 2020

Thanks for the speedy response to that issue. If we were to go this approach (which seems sensible regardless) I would then ask the question

what should we make the default? 4 or 5?

I suspect that your answer is likely "5, because it's faster" while I probably tend towards "4, because it's robust". I'm curious, did we ever speed test pickle protocol 5 on a real-world workload? Would something like x + x.T be made considerably faster by it? Given the cost around Python version mismatches it might be worth verifying that the performance benefit is felt in real-world situations

@jakirkham
Copy link
Member Author

Closing since this seems to be addressed by PR ( #4019 ).

@jakirkham jakirkham closed this Aug 7, 2020
@jakirkham jakirkham deleted the add_pickle_protocol_arg branch August 7, 2020 17:25
@mrocklin
Copy link
Member

mrocklin commented Aug 7, 2020

Woot. Thanks for the early effort @jakirkham

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants