[FEA] Provide shallow hash function for column_view #9140

Closed
jrhemstad opened this issue Aug 27, 2021 · 2 comments · Fixed by #9185 or #9312

Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@jrhemstad
Contributor

Is your feature request related to a problem? Please describe.

I want to hash all of the "shallow" state of a column_view into a single value.

Describe the solution you'd like

I would like a constant-time hash function that hashes all of the shallow state of a column_view such that, for any two column_views a and b:

is_shallow_equal(a,b) => shallow_hash(a) == shallow_hash(b)
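
To make the requested property concrete, here is a minimal sketch on a stand-in type. Everything in it is an assumption for illustration: `flat_view`, its fields, the `combine` mixer, and `contract_holds` are hypothetical names and are not libcudf's `column_view` or its API. It only shows that hashing a fixed set of fields takes constant time and that shallow equality implies equal hashes, while the converse does not have to hold.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the shallow state of a view (not libcudf's column_view).
struct flat_view {
  int type_id;            // element type tag
  std::size_t size;       // number of elements
  const void* data;       // non-owning pointer to the data buffer
  const void* null_mask;  // non-owning pointer to the validity bitmask
  std::size_t offset;     // index of the first element within the buffer
};

// Boost-style hash mixer; an illustrative choice, not taken from the issue.
inline std::size_t combine(std::size_t seed, std::size_t value) {
  return seed ^ (value + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Hashes a fixed number of fields, so the cost does not depend on the row count.
inline std::size_t shallow_hash(const flat_view& v) {
  std::size_t h = std::hash<int>{}(v.type_id);
  h = combine(h, std::hash<std::size_t>{}(v.size));
  h = combine(h, std::hash<const void*>{}(v.data));
  h = combine(h, std::hash<const void*>{}(v.null_mask));
  h = combine(h, std::hash<std::size_t>{}(v.offset));
  return h;
}

// Compares the same fields that shallow_hash consumes, and nothing else.
inline bool is_shallow_equal(const flat_view& a, const flat_view& b) {
  return a.type_id == b.type_id && a.size == b.size && a.data == b.data &&
         a.null_mask == b.null_mask && a.offset == b.offset;
}

// The implication from the issue: equal views must hash equally.
// Unequal views are allowed to collide, so nothing is required of them.
inline bool contract_holds(const flat_view& a, const flat_view& b) {
  return !is_shallow_equal(a, b) || shallow_hash(a) == shallow_hash(b);
}
```

Because the hash and the comparison read exactly the same fields, the implication holds by construction in this sketch.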

See also #9139

@jrhemstad jrhemstad added feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. labels Aug 27, 2021
@jrhemstad jrhemstad added this to the Time Series Analysis milestone Aug 30, 2021
@karthikeyann
Contributor

karthikeyann commented Sep 6, 2021

Should shallow_hash be a public API, a detail API, or live only in src/?

@jrhemstad
Contributor Author

detail:: API.

rapids-bot bot pushed a commit that referenced this issue Sep 22, 2021
…view (#9185)

Fixes #9140 
Added `shallow_hash(column_view)`
Added unit tests

It computes a hash value from the shallow state of a `column_view`:
type, size, data pointer, null_mask pointer, offset, and the hash values of the children.
`null_count` is not used because it is a cached value: it depends on the contents of `null_mask` and may or may not have been computed yet.

Fixes #9139
Added `is_shallow_equivalent(column_view, column_view)` ~shallow_equal~
Added unit tests

It compares two `column_view`s based on their shallow state:
type, size, data pointer, null_mask pointer, offset, and the `column_view`s of the children.
`null_count` is not used because it is a cached value: it depends on the contents of `null_mask` and may or may not have been computed yet.
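
The field list above can be sketched on a toy nested type. This is an illustration under stated assumptions, not the code merged in the PR: `view_model`, `mix`, `shallow_hash_model`, and `is_shallow_equivalent_model` are hypothetical names, and the `-1` sentinel for an uncomputed `null_count` is assumed for the example. The point is that the cached `null_count` is skipped by both functions and that children are folded in by visiting each nested view.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model of a nested view; names and layout are illustrative only.
struct view_model {
  int type_id;
  std::size_t size;
  const void* data;
  const void* null_mask;
  std::size_t offset;
  long null_count;                   // cached; -1 means "not computed" (assumed sentinel)
  std::vector<view_model> children;  // nested views
};

// Boost-style mixer, chosen here only for illustration.
inline std::size_t mix(std::size_t seed, std::size_t value) {
  return seed ^ (value + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Hashes type, size, data pointer, null_mask pointer, offset, and each child's hash.
// null_count is deliberately left out.
inline std::size_t shallow_hash_model(const view_model& v) {
  std::size_t h = std::hash<int>{}(v.type_id);
  h = mix(h, std::hash<std::size_t>{}(v.size));
  h = mix(h, std::hash<const void*>{}(v.data));
  h = mix(h, std::hash<const void*>{}(v.null_mask));
  h = mix(h, std::hash<std::size_t>{}(v.offset));
  for (const auto& child : v.children) { h = mix(h, shallow_hash_model(child)); }
  return h;
}

// Compares the same fields the hash reads, child by child; null_count is ignored.
inline bool is_shallow_equivalent_model(const view_model& a, const view_model& b) {
  if (a.type_id != b.type_id || a.size != b.size || a.data != b.data ||
      a.null_mask != b.null_mask || a.offset != b.offset ||
      a.children.size() != b.children.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.children.size(); ++i) {
    if (!is_shallow_equivalent_model(a.children[i], b.children[i])) { return false; }
  }
  return true;
}
```

In this sketch the cost grows with the number of nested views, not with the number of rows, and two views that differ only in whether `null_count` has been computed hash and compare as the same.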

Authors:
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Jake Hemstad (https://github.com/jrhemstad)
  - David Wendt (https://github.com/davidwendt)

URL: #9185
rapids-bot bot pushed a commit that referenced this issue Sep 27, 2021
…view (#9312)

Fixes #9140 
Added `shallow_hash(column_view)`
Added unit tests
SWIPAT approval complete

It computes a hash value from the shallow state of a `column_view`:
type, size, data pointer, null_mask pointer, offset, and the hash values of the children.
`null_count` is not used because it is a cached value: it depends on the contents of `null_mask` and may or may not have been computed yet.

Fixes #9139
Added `is_shallow_equivalent(column_view, column_view)` ~shallow_equal~
Added unit tests

It compares two `column_view`s based on their shallow state:
type, size, data pointer, null_mask pointer, offset, and the `column_view`s of the children.
`null_count` is not used because it is a cached value: it depends on the contents of `null_mask` and may or may not have been computed yet.

Authors:
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #9312