Fix SparkMurmurHash3_32 hash inconsistencies with Apache Spark (#7672)
#7024 added a Spark variant of Murmur3 hashing, but it is inconsistent with Apache Spark's hash calculations in a few areas:

- `-0.0` and `0.0` are not treated the same by Apache Spark for floats and doubles
- byte and short integral values are upcast to a 32-bit unsigned int (i.e.: zero-filled) before calculating the hash

In addition, libcudf allows hashing of timestamp columns, but the JNI bindings asserted if timestamp columns were passed in, disabling the ability to hash on timestamps directly.

Authors:
- Jason Lowe (@jlowe)

Approvers:
- Nghia Truong (@ttnghia)
- Jake Hemstad (@jrhemstad)
- Alessandro Bellina (@abellina)
- MithunR (@mythrocks)
- Robert (Bobby) Evans (@revans2)

URL: #7672
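The two value-level differences named in the commit message can be illustrated with a minimal Python sketch. It is not the libcudf or Spark code. It implements the standard MurmurHash3 x86_32 steps for a single 32-bit word. The helper names (`murmur3_32_int`, `float_bits`, `sign_extend_byte`, `zero_fill_byte`) are hypothetical, and the seed of 42 is an assumption chosen to mirror the default of Spark's `hash` function. The sketch shows only that the hash input differs depending on whether `-0.0` is normalized and on how a narrow integer is widened. It does not assert which widening a given Spark version applies.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32_int(value, seed=42):
    """Hash one 32-bit word with the standard MurmurHash3 x86_32 steps.

    The seed default is an assumption for illustration.
    """
    # Mix the single 4-byte block.
    k1 = value & MASK32
    k1 = (k1 * 0xCC9E2D51) & MASK32
    k1 = _rotl32(k1, 15)
    k1 = (k1 * 0x1B873593) & MASK32

    # Fold the block into the running hash state.
    h1 = seed & MASK32
    h1 ^= k1
    h1 = _rotl32(h1, 13)
    h1 = (h1 * 5 + 0xE6546B64) & MASK32

    # Finalize: xor in the input length (4 bytes), then avalanche.
    h1 ^= 4
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK32
    h1 ^= h1 >> 16
    return h1


def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def sign_extend_byte(b):
    """Widen a signed byte (-128..127) to 32 bits, replicating the sign bit."""
    return b & MASK32


def zero_fill_byte(b):
    """Widen a signed byte (-128..127) to 32 bits, filling the upper bits with zeros."""
    return b & 0xFF


if __name__ == "__main__":
    # 0.0 and -0.0 compare equal but have different bit patterns.
    print(hex(float_bits(0.0)), hex(float_bits(-0.0)))
    print(murmur3_32_int(float_bits(0.0)) == murmur3_32_int(float_bits(-0.0)))
    # The two widenings disagree only for negative bytes.
    print(hex(sign_extend_byte(-1)), hex(zero_fill_byte(-1)))
```

Because each step of the single-word hash is invertible, two different 32-bit inputs always produce different hashes here, so an implementation that normalizes `-0.0` or picks a different widening rule for narrow integers cannot match one that does not.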
Showing 4 changed files with 226 additions and 38 deletions.