-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[h7Q1epCi] Make hashing functions stable on all neo4j types #573
Conversation
0e69d71
to
e9c53e2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, just some minor questions and remarks.
try (final var tx = db.beginTx()) { | ||
final var node = tx.createNode(Label.label("HashFunctionsAreStable")); | ||
final var rel = node.createRelationshipTo(node, RelationshipType.withName("R")); | ||
for (int i = randStorables.size() - 1; i >= 0; --i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any specific reason why this loop is going backwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! {a: 1, b: 2}
should have the same hash as {b: 2, a: 1}
, before we did toString, and that will simply use the iteration order (which depends on the map implementation). I wanted to try to make the properties come in another order (not sure it succeeded, depends on the map implementation and what kernel decides to do).
final var first = a.get(0).entrySet().iterator().next(); | ||
for (final var e : a.get(0).entrySet()) { | ||
assertEquals(message, first.getValue(), e.getValue()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the point of this assertion? To me it looks like you are comparing something to itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can, it's a bit messy. Let's look at an example:
We might run this query with $value=23, $list=List(23)
and n.p1=23
:
match (n)
return
apoc.util.md5([n.p1]) as hash1,
apoc.util.md5([$value]) as hash2
apoc.util.md5($list) as hash3
This assertion is intended to make sure hash1
, hash2
and hash3
have the same value. They should all equal each other, but in Cypher there can be subtle differences depending on where a value comes from. For example create (n {p: [1,2,3]}) return n.p
will most probably not have exactly the same types internally as return [1,2,3]
even though the two lists are equal and represents the same thing. Just wanted to make sure that such internal types do not affect the hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for explaining!
* This is not the most efficient way to produce a hash, but it is backwards compatible. | ||
* This function is only intended to be used on strings (as documented on the hash functions above) | ||
* But it turns out that is not how everyone is using it, so as a safety we have stable implementations | ||
* for all neo4j types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice that you managed to keep backwards compatibility. Do you think we should mention something about the other types than strings in docs or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, perhaps we should point out that it is only meant for strings and other types can have bad performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now
Make hashing functions stable on all neo4j types. These functions are only intended for strings, but unfortunately their signature allows all types. This is a safety for users that use them for other values.
The hash functions are only intended for strings, but unfortunately their signature allows all types. This is a safety for users that use hash functions on non string values despite the documentation.