[FEA] row-wise hashing using common hash functions like MD5 & SHA-2 #4989
Labels
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
Spark
Functionality that helps Spark RAPIDS
Is your feature request related to a problem? Please describe.
We would like to use cudf to hash each row of a column, extending the existing hash functionality to support other common hash functions such as MD5 and SHA-2.
Describe the solution you'd like
The existing hash functionality is here:
https://github.com/rapidsai/cudf/blob/branch-0.14/cpp/include/cudf/hashing.hpp#L34
Ideally this would be enhanced to support an additional optional argument that specifies which hash function to use.
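To make the request concrete, here is a rough CPU-side sketch in Python of the semantics I have in mind. It is not cudf code: `hash_rows`, its `hash_function` parameter, the `sha256` default, and the way values are serialized are all assumptions for illustration. The actual argument name, default, and byte layout would be up to the libcudf implementation.

```python
import hashlib


def hash_rows(columns, hash_function="sha256"):
    """Return one hex digest per row of a column-oriented table.

    Hypothetical helper for illustration only; not a cudf API.
    columns: list of equal-length lists, one list per column.
    hash_function: any algorithm name accepted by hashlib.new,
    for example "md5" or "sha256".
    """
    if not columns:
        return []
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all columns must have the same length")

    digests = []
    for row in zip(*columns):
        h = hashlib.new(hash_function)
        for value in row:
            # Assumption: each value is serialized as the UTF-8 text of
            # str(value), followed by a separator byte so that the rows
            # ("ab", "c") and ("a", "bc") produce different digests.
            h.update(str(value).encode("utf-8"))
            h.update(b"\x1f")
        digests.append(h.hexdigest())
    return digests


# Usage: two columns, three rows, one digest per row.
ids = [1, 2, 3]
names = ["x", "y", "z"]
md5_per_row = hash_rows([ids, names], hash_function="md5")
sha_per_row = hash_rows([ids, names])  # falls back to the default
```

The point of the optional argument is that callers who do not pass it keep getting whatever the default is, so existing code would not need to change.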
Additional context
This feature request is somewhat similar to #4913 but hashes each row rather than hashing an entire column to a single value.
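The difference between the two requests can be sketched on the CPU with Python's `hashlib`. This is only an illustration of the output shapes, under the assumption that values are hashed as UTF-8 strings; it says nothing about how either feature would be implemented in libcudf.

```python
import hashlib

values = ["a", "b", "a"]

# Row-wise (this request): one digest per element, so the output has the
# same number of rows as the input and equal inputs give equal digests.
row_digests = [hashlib.sha256(v.encode("utf-8")).hexdigest() for v in values]

# Column-wise (the other request): the whole column is fed into a single
# hash state, so the output is one digest regardless of the row count.
column_hash = hashlib.sha256()
for v in values:
    column_hash.update(v.encode("utf-8"))
column_digest = column_hash.hexdigest()
```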