Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly handle array key cols in query operations #5605

Open
rbasralian opened this issue Jun 11, 2024 · 3 comments
Open

Properly handle array key cols in query operations #5605

rbasralian opened this issue Jun 11, 2024 · 3 comments
Assignees
Labels
core Core development tasks feature request New feature or request June2024 query engine
Milestone

Comments

@rbasralian
Copy link
Contributor

In something like the below, the equivalent arrays should be treated as equal and aggregated/joined accordingly. (It already seems to work for vectors.)

arr1 = new String[] { 'abc', 'def'}
arr2 = new String[] { 'abc', 'def'}
myTable_arr = emptyTable(2).update("KeyCol = i==0 ? arr1 : arr2", "X = ii + 1")
mySum_arr = myTable_arr.sumBy("KeyCol")
myJoin_arr = myTable_arr.where("i==0").naturalJoin(myTable_arr.where("i==1"), "KeyCol", "X_joined=X")


vec1 = new io.deephaven.vector.ObjectVectorDirect(arr1)
vec2 = new io.deephaven.vector.ObjectVectorDirect(arr2)
myTable_vec = emptyTable(2).update("KeyCol = i==0 ? vec1 : vec2", "X = ii + 1")
mySum_vec = myTable_vec.sumBy("KeyCol")
myJoin_vec = myTable_vec.where("i==0").naturalJoin(myTable_vec.where("i==1"), "KeyCol", "X_joined=X")

image

(https://deephaven.atlassian.net/browse/DH-11038 in Legacy engine)

@rbasralian rbasralian added feature request New feature or request triage labels Jun 11, 2024
@rcaudy rcaudy added query engine core Core development tasks and removed triage labels Jun 12, 2024
@rcaudy rcaudy added this to the June 2024 milestone Jun 12, 2024
@rcaudy rcaudy self-assigned this Jun 12, 2024
@devinrsmith
Copy link
Member

I think this is due to io.deephaven.chunk.util.hashing.ObjectChunkEquals not using deepEquals. As part of a PR that will merge soon, #5225, I've added a new version that uses deepEquals. We'll still need to plumb it properly to get it to work here.

@devinrsmith
Copy link
Member

It looks like DHE has moved to a per array-type solution to this (as opposed to the more generic deepEquals). This seems like a good solution for DHC as well, as deepEquals is potentially much less efficient.

devinrsmith added a commit to devinrsmith/deephaven-core that referenced this issue Jun 12, 2024
devinrsmith added a commit to devinrsmith/deephaven-core that referenced this issue Jul 16, 2024
This adds documentation and a new hashCode method to the `io.deephaven.util.compare` package that is consistent with Deephaven equality. Specifically, this ensures that the floating points values -0.0 and 0.0 hash to the same value.

Testing was added around aggregation keys, join keys, aggregations, min/max formulas, and sort results. Of note is that unique and distinct aggregations currently rely on `io.deephaven.chunk.WritableChunk#sort()`, which treats -0.0 < 0.0. This effects the encounter order returned by these aggregations, and could be seen as inconsistent with our stated goals of treating -0.0 == 0.0. This does not affect the consistency of the sort operation. It is hard to be confident that the testing coverage around this issue is fully complete; that said, this does make us much more consistent in the tested operations wrt Deephaven equality.

This is a prerequisite for deephaven#5605

Fixes deephaven#3768
devinrsmith added a commit that referenced this issue Jul 18, 2024
This adds documentation and a new hashCode method to the
`io.deephaven.util.compare` package that is consistent with Deephaven
equality. Specifically, this ensures that the floating points values
-0.0 and 0.0 hash to the same value.

Testing was added around aggregation keys, join keys, aggregations,
min/max formulas, and sort results. Of note is that unique and distinct
aggregations currently rely on
`io.deephaven.chunk.WritableChunk#sort()`, which treats -0.0 < 0.0. This
effects the encounter order returned by these aggregations, and could be
seen as inconsistent with our stated goals of treating -0.0 == 0.0. This
does not affect the consistency of the sort operation. It is hard to be
confident that the testing coverage around this issue is fully complete;
that said, this does make us much more consistent in the tested
operations wrt Deephaven equality.

This is a prerequisite for #5605

Fixes #3768
@pete-petey pete-petey modified the milestones: June 2024, Backlog Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core development tasks feature request New feature or request June2024 query engine
Projects
None yet
Development

No branches or pull requests

4 participants