-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pkg/ottl] Add MurmurHash3 converter #34155
Conversation
string of the murmurHash3 hash/digest Fix: open-telemetry#34077
Hello @kaisecheng, thanks for your PR! Just to let you know, I'm going to wait to review until the code owners of OTTL get a chance to triage the original issue to make sure it makes sense to add this functionality. |
…emetry-collector-contrib into ottl_murmurhash3_func
pkg/ottl/go.mod
Outdated
@@ -10,6 +10,7 @@ require ( | |||
github.com/json-iterator/go v1.1.12 | |||
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal v0.105.0 | |||
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatatest v0.105.0 | |||
github.com/spaolacci/murmur3 v1.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned in general about using spaolacci/murmur3
. It has some open issues and PRs, but it appears to be dead at this point. Do you have thoughts of potentially using another package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
twmb/murmur3 could be an alternative. Initially, I was concerned about the license since it is not listed as BSD-3-Clause. However, after a closer look, it seems acceptable. I did a quick test, and it appears that the hash result from Sum128()
requires the same byte flip as the current solution.
I don't see a significant benefit in switching the library, but I am open to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern I have with `spaolacci/murmur3 - there is an outstanding issue, which I believe is the likely cause of some flaky test failures on previous CI runs in this PR.
AFAICT, the twmb/murmur3
is a fork initially created to specifically fix this issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robbavey you are right. will change to twmb/murmur3
|
||
version := v128 | ||
if !args.Version.IsEmpty() { | ||
if (args.Version.Get() != v32) && (args.Version.Get() != v128) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It might make sense to make this a switch statement with the error in the default case. This would allow the exhaustive linter to ensure we don't miss adding a new valid value in case of future changes.
v32 = "32" | ||
v128 = "128" // default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make these enum values? (To go along with my other nit about a switch statement)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an enum means the syntax to call the 32-bit version will be MurmurHash3("something", version=0)
. I think this is more confusing than MurmurHash3("something", version="32")
.
|
||
version := v128 | ||
if !args.Version.IsEmpty() { | ||
if (args.Version.Get() != v32) && (args.Version.Get() != v128) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It might make sense to just put args.Version.Get()
into a new variable here, and check that value, and set version
to it if it's valid, rather than calling the method so many times for the same value.
|
||
// MurmurHash3HexString returns the hexadecimal representation of the hash in little-endian format. | ||
// MurmurHash3, developed by Austin Appleby, is sensitive to endianness. Unlike some other languages like Python, | ||
// which use little-endian for all architectures, the Go library `spaolacci/murmur3` has some open issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be helpful to directly reference the open issues being referenced. That way in the future we could remove this if they're ever fixed, and just to be able to quickly see the underlying issue.
Co-authored-by: Curtis Robert <[email protected]>
Co-authored-by: Curtis Robert <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to continue discussing, via the issue, whether or not we need this function.
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
If `version` is | ||
|
||
- `v32_hash`: Uses 32-bit version and returns a signed integer hash. | ||
- `v128_hash`: Use 128-bit version and returns an array of two signed integer hash. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `v128_hash`: Use 128-bit version and returns an array of two signed integer hash. | |
- `v128_hash`: Uses 128-bit version and returns an array of two signed integer hash. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description:
Add
MurmurHash3
function to convert thetarget
to a hexadecimal string of the murmurHash3 hash/digestThe hash result has cross checked with python and ruby
Link to tracking Issue:
#34077
Testing:
Documentation: