Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[crypto] add blake2b and blake2s functions #1081

Merged
merged 1 commit into from
Oct 10, 2021
Merged

Conversation

jimexist
Copy link
Member

@jimexist jimexist commented Oct 7, 2021

Which issue does this PR close?

add blake2b and blake2s functions, as provided in https://docs.rs/blake2/0.9.2/blake2/

must be merged after #1090 is merged.

Closes #1093
Closes #1094

Rationale for this change

blake2s is a better alternative for Md5 and blake2b is more secure, modern, etc.

What changes are included in this PR?

see https://en.wikipedia.org/wiki/BLAKE_(hash_function)

Are there any user-facing changes?

❯ cargo run -p datafusion-cli
    Finished dev [unoptimized + debuginfo] target(s) in 0.18s
     Running `target/debug/datafusion-cli`
DataFusion CLI v5.1.0-SNAPSHOT

> select blake2b(''), blake2s('');
+----------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------+
| blake2b(Utf8(""))                                                                                                                | blake2s(Utf8(""))                                                |
+----------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------+
| 786a02f742015903c6c6fd852552d272912f4740e15847618a86e217f71f5419d25e1031afee585313896444934eb04b903a685b1448b755d56f701afe9be2ce | 69217a3079908094e11121d042354a7c1f55b6482ca1a51e1b250dfd1ed0eef9 |
+----------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------+
1 row in set. Query took 0.007 seconds.
>

@github-actions github-actions bot added ballista datafusion Changes in the datafusion crate labels Oct 7, 2021
@jimexist
Copy link
Member Author

jimexist commented Oct 7, 2021

so strange that i cannot reproduce the errors in CI

@xudong963
Copy link
Member

so strange that i cannot reproduce the errors in CI

Maybe CI doesn't run with the newest code

@@ -39,7 +39,7 @@ path = "src/lib.rs"
[features]
default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]
simd = ["arrow/simd"]
crypto_expressions = ["md-5", "sha2"]
crypto_expressions = ["md-5", "sha2", "blake2"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 for making these functions optional

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 there seems to be a build failures but otherwise this looks good to me

https://github.com/apache/arrow-datafusion/pull/1081/checks?check_run_id=3840276390

error[E0004]: non-exhaustive patterns: `&Blake2b` and `&Blake2s` not covered
   --> /home/runner/work/arrow-datafusion/arrow-datafusion/datafusion/src/physical_plan/functions.rs:319:15
    |
174 | / pub enum BuiltinScalarFunction {
175 | |     // math functions
176 | |     /// abs
177 | |     Abs,
...   |
275 | |     Blake2b,
    | |     ------- not covered
276 | |     /// Blake2s
277 | |     Blake2s,
    | |     ------- not covered
...   |
305 | |     RegexpMatch,
306 | | }
    | |_- `BuiltinScalarFunction` defined here
...
319 |           match self {
    |                 ^^^^ patterns `&Blake2b` and `&Blake2s` not covered
    |
    = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms
    = note: the matched value is of type `&BuiltinScalarFunction`

FWIW I looked at postgres and what it does is rather than having individual functions for different hashing algorithms, it appears to have a single digest function that takes the requested algorithm as a string parameter

F.26.1.1. Digest()
digest(data text, type text) returns bytea
digest(data bytea, type text) returns bytea

https://www.postgresql.org/docs/current/pgcrypto.html

@houqp
Copy link
Member

houqp commented Oct 9, 2021

so strange that i cannot reproduce the errors in CI

@jimexist are you using a rust toolchain with the same version that's being used in our CI?

@jimexist jimexist force-pushed the add-blake2 branch 2 times, most recently from 1518354 to def058f Compare October 9, 2021 09:14
@jimexist
Copy link
Member Author

jimexist commented Oct 9, 2021

🤔 there seems to be a build failures but otherwise this looks good to me

https://github.com/apache/arrow-datafusion/pull/1081/checks?check_run_id=3840276390

error[E0004]: non-exhaustive patterns: `&Blake2b` and `&Blake2s` not covered
   --> /home/runner/work/arrow-datafusion/arrow-datafusion/datafusion/src/physical_plan/functions.rs:319:15
    |
174 | / pub enum BuiltinScalarFunction {
175 | |     // math functions
176 | |     /// abs
177 | |     Abs,
...   |
275 | |     Blake2b,
    | |     ------- not covered
276 | |     /// Blake2s
277 | |     Blake2s,
    | |     ------- not covered
...   |
305 | |     RegexpMatch,
306 | | }
    | |_- `BuiltinScalarFunction` defined here
...
319 |           match self {
    |                 ^^^^ patterns `&Blake2b` and `&Blake2s` not covered
    |
    = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms
    = note: the matched value is of type `&BuiltinScalarFunction`

FWIW I looked at postgres and what it does is rather than having individual functions for different hashing algorithms, it appears to have a single digest function that takes the requested algorithm as a string parameter

F.26.1.1. Digest()
digest(data text, type text) returns bytea
digest(data bytea, type text) returns bytea

https://www.postgresql.org/docs/current/pgcrypto.html

let's address this in #1090 so merge that first

@jimexist jimexist force-pushed the add-blake2 branch 2 times, most recently from cf9fb3a to ccf9b63 Compare October 10, 2021 08:13
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@alamb alamb merged commit 18769ce into apache:master Oct 10, 2021
@alamb
Copy link
Contributor

alamb commented Oct 10, 2021

Thanks @jimexist

@jimexist jimexist deleted the add-blake2 branch October 10, 2021 13:18
@houqp houqp added the enhancement New feature or request label Nov 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datafusion Changes in the datafusion crate enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement blake2 functions Implement digest function to unify crypto functions
4 participants