-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: correct floating point ChunkHasher #5778
fix: correct floating point ChunkHasher #5778
Conversation
This adds documentation and a new hashCode method to the `io.deephaven.util.compare` package that is consistent with Deephaven equality. Specifically, this ensures that the floating points values -0.0 and 0.0 hash to the same value. Testing was added around aggregation keys, join keys, aggregations, min/max formulas, and sort results. Of note is that unique and distinct aggregations currently rely on `io.deephaven.chunk.WritableChunk#sort()`, which treats -0.0 < 0.0. This effects the encounter order returned by these aggregations, and could be seen as inconsistent with our stated goals of treating -0.0 == 0.0. This does not affect the consistency of the sort operation. It is hard to be confident that the testing coverage around this issue is fully complete; that said, this does make us much more consistent in the tested operations wrt Deephaven equality. This is a prerequisite for deephaven#5605 Fixes deephaven#3768
// | ||
package io.deephaven.util.compare; | ||
|
||
public class BooleanComparisons { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? We don't ever allow a column to contain primitive booleans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows us to not have to special case some boolean code generation cases; namely, BooleanChunkEquals
and BooleanChunkHasher
will now call this class (just like the other ChunkEquals / ChunkHasher impls). Maybe we can make the argument that BooleanComparisons
should only have eq
and hashCode
, and not compare
, gt
, etc...?
@@ -250,7 +251,7 @@ public void andEqualPairs(IntChunk<ChunkPositions> chunkPositionsToCheckForEqual | |||
|
|||
// region eq | |||
static private boolean eq(Object lhs, Object rhs) { | |||
return lhs == rhs; | |||
return ObjectComparisons.eq(lhs, rhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Fixed. Surprised we didn't have any failing test.
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableNegativeZeroTest.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
@Test | ||
public void testDhc3768_double_join_key() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rangeJoin and multiJoin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added multijoin. Spent a little time on rangeJoin, didn't quite grok the correct incantations.
} | ||
|
||
@Test | ||
public void testDhc3768_min_max_result() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List of aggs to test:
- min (looking to preserve encounter order)
- max (looking to preserve encounter order)
- percentile
- median
- count distinct
- distinct
- unique
- sorted first
- sorted last
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've got these bases covered now.
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableFloatingPointZeroTest.java
Outdated
Show resolved
Hide resolved
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableFloatingPointZeroTest.java
Outdated
Show resolved
Hide resolved
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableFloatingPointZeroTest.java
Show resolved
Hide resolved
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableFloatingPointZeroTest.java
Outdated
Show resolved
Hide resolved
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableFloatingPointZeroTest.java
Outdated
Show resolved
Hide resolved
engine/table/src/test/java/io/deephaven/engine/table/impl/QueryTableFloatingPointZeroTest.java
Show resolved
Hide resolved
|
||
@Test | ||
@Ignore("Unstable sort?") | ||
public void testFloatMedianAggResult() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably less well-defined than we'd hope. I bet that the SSM bucket gets created for -0.0 or 0.0, whichever is encountered first, and we'll keep using that "forever".
This adds documentation and a new hashCode method to the
io.deephaven.util.compare
package that is consistent with Deephaven equality. Specifically, this ensures that the floating points values -0.0 and 0.0 hash to the same value.Testing was added around aggregation keys, join keys, aggregations, min/max formulas, and sort results. Of note is that unique and distinct aggregations currently rely on
io.deephaven.chunk.WritableChunk#sort()
, which treats -0.0 < 0.0. This effects the encounter order returned by these aggregations, and could be seen as inconsistent with our stated goals of treating -0.0 == 0.0. This does not affect the consistency of the sort operation. It is hard to be confident that the testing coverage around this issue is fully complete; that said, this does make us much more consistent in the tested operations wrt Deephaven equality.This is a prerequisite for #5605
Fixes #3768