-
Notifications
You must be signed in to change notification settings - Fork 339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add murmur2_random
support for message partitioning.
#884
Conversation
lib/kafka/client.rb
Outdated
Object.const_get(partitioner_klass).new | ||
else | ||
Partitioner.new | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should introduce a new concept next to Partitioner
. Instead, Partitioner
should be configured with the hash/digest algorithm, with the default being crc32
but murmur2
being an option. The "random" part would just always be the behavior when no key is present.
So: can we introduce a new parameter to Partitioner#initialize
called e.g. hash_function
, which allows either crc32
(default) or murmur2
. Clients that wish to use murmur2
would need to instantiate Partitioner
with that value and pass the instance to producer
.
In Racecar and other high-level frameworks, we can add configuration keys for controlling this declaratively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good. I was taking inspiration from librdkafka
but your idea will also work well with the opt-in behaviour below.
Just to be clear, you would like the Partitioner
instance to be passed in, ala sasl_oauth_token_provider
, and not the hash_function
as a symbol ala compression_codec
? I'm only asking because the latter seems like a better fit here, but I don't have much experience writing gems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer the former, since we're talking about varying the hash algo used by Partitioner, but not changing its fundamental strategy. Also, I prefer a minimal change in API – the high level libraries can easily make this a config key without any extra APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for breaking it down for me. I've removed the API change and moved it inside the Partitioner
ruby-kafka.gemspec
Outdated
@@ -28,6 +28,7 @@ Gem::Specification.new do |spec| | |||
spec.require_paths = ["lib"] | |||
|
|||
spec.add_dependency 'digest-crc' | |||
spec.add_dependency "digest-murmurhash" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the dependencies of this gem? The dependency policy for ruby-kafka is pretty strict, so for opt-in behavior the default would be to give an error message when the library cannot be detected and ask the user to install it themselves (would love optional dependencies in gems, but 🤷)
If the gem is really trivial and doesn't include C code then we can add it as a hard dep, but only then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see how the various compression codecs are used. Would a similar setup to that be ok here?
The gem is fairly trivial but it's basically just C code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's C code then I think it should be opt-in, so that would be a good change 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
And all the plumbing to load it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Can you also 1) update the README to document this, and perhaps add docs to Partitioner
describing the setup, and 2) update the CHANGELOG?
@dasch Thank you, and yes of course. I'm still thinking about the naming of the |
@dasch I've updated the documentation, let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks!
Add
murmur2_random
support for message partitioning compatibility with Java producers.I'm not 100% certain on the new configuration parameter. The caller shouldn't need to know how to instantiate the partitioner, so this is what I came up with. Open to suggestions.
The test failures seem to be unrelated to the changes. Not sure what to do there.
@dasch