Builtin SHA256 hashing #6977

Draft
wants to merge 57 commits into
base: main

MatthewJohnHeath
Contributor

@MatthewJohnHeath MatthewJohnHeath commented Aug 9, 2024

Expose the functionality of Zig's std.crypto.hash.sha2.Sha256 through a pure-functional interface to Roc.
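For readers less familiar with the idea, a pure-functional interface over an incremental hasher can be sketched roughly as below. This is an illustration in Python using the standard-library `hashlib`, not the Roc or Zig code from this PR. The names `Sha256`, `empty_sha256`, `add_bytes`, and `digest` are hypothetical and only mirror the function names discussed later in this thread. The point is that every operation returns a new value and leaves its input untouched.

```python
import hashlib


class Sha256:
    """Handle on an in-progress SHA-256 computation (hypothetical name).

    Treated as immutable: the functions below never mutate an existing handle.
    """

    def __init__(self, state):
        self._state = state


def empty_sha256() -> Sha256:
    # A hasher to which no data has been added yet.
    return Sha256(hashlib.sha256())


def add_bytes(hasher: Sha256, data: bytes) -> Sha256:
    # Copy the underlying state first, so the caller's value is unchanged.
    state = hasher._state.copy()
    state.update(data)
    return Sha256(state)


def digest(hasher: Sha256) -> bytes:
    # hashlib's digest() does not modify the hash object it is called on.
    return hasher._state.digest()


if __name__ == "__main__":
    first = add_bytes(empty_sha256(), b"hello ")
    second = add_bytes(first, b"world")
    # `first` still represents only b"hello "; `second` represents b"hello world".
    print(digest(first).hex())
    print(digest(second).hex())
```

Copying the state on every `add_bytes` call is the simplest way to get value semantics in this sketch; whether the actual builtin copies or shares state is not something this thread specifies.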

@lukewilliamboswell
Collaborator

@MatthewJohnHeath I tried to push to your branch but I got permission denied. Made a PR in case that helps.

@MatthewJohnHeath
Contributor Author

MatthewJohnHeath commented Aug 20, 2024 via email

@MatthewJohnHeath force-pushed the main branch 3 times, most recently from 344ac11 to 4da18e6, August 30, 2024 14:49
@smores56
Collaborator

The shape of this PR is a good start, but I'm seeing a potential issue if we add other cryptographic code to this module, which would be quite unsurprising. You have emptySha256, addBytes, and digest. The first is named after the hash, while the second and third are nominally hash-agnostic yet only work for SHA-256. Could you rename them to addBytesSha256 or something similar?

@smores56
Collaborator

Otherwise, I'm attempting to get things running

@smores56
Collaborator

Okay, I'm almost done getting this working, just need to coordinate the Zig and Roc builtin types.

@smores56
Collaborator

The REPL worked for me! We'll still need some tests, but it should be good to go, above comments notwithstanding.

@smores56
Collaborator

Also, there's no code currently to extract state from the new Digest256 opaque type. Could you add a function or something to do that? Maybe Crypt.digest256ToBytes : Digest256 -> List U8?

@MatthewJohnHeath
Contributor Author

> Also, there's no code currently to extract state from the new Digest256 opaque type. Could you add a function or something to do that? Maybe Crypt.digest256ToBytes : Digest256 -> List U8?

I was thinking of (only) having the type implement Eq. Can you think of use cases where the individual bytes would be needed?

@smores56
Collaborator

smores56 commented Sep 2, 2024

> I was thinking of (only) having the type implement Eq. Can you think of use cases where the individual bytes would be needed?

One case is storing the hash for future comparison. Not that using SHA-256 would be recommended over Argon2, but the idea holds. There are also a few algorithms out there that use SHA-256 digest bytes for computation (see Wikipedia); at the very least, UUID v5 uses SHA-1 digest bytes in that way.
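Both use cases can be sketched with Python's standard library (again as an illustration, not the proposed Roc API). The helper names `sha256_hex`, `matches_stored`, and `uuid5_from_digest` are hypothetical. The first two store a digest as hex and compare a fresh hash against it later; the third rebuilds a version 5 UUID from the first 16 bytes of a SHA-1 digest, which is only possible when the raw digest bytes are accessible.

```python
import hashlib
import hmac
import uuid


def sha256_hex(data: bytes) -> str:
    # Serialise the digest so it can be stored (file, database, ...).
    return hashlib.sha256(data).hexdigest()


def matches_stored(data: bytes, stored_hex: str) -> bool:
    # Re-hash the data and compare against the stored digest bytes.
    fresh = hashlib.sha256(data).digest()
    return hmac.compare_digest(fresh, bytes.fromhex(stored_hex))


def uuid5_from_digest(namespace: uuid.UUID, name: str) -> uuid.UUID:
    # UUID v5: SHA-1 over namespace bytes + name, truncated to 16 bytes.
    raw = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    # version=5 overwrites the version and variant bits of the 16 bytes.
    return uuid.UUID(bytes=raw[:16], version=5)


if __name__ == "__main__":
    stored = sha256_hex(b"some payload")
    print(matches_stored(b"some payload", stored))
    print(uuid5_from_digest(uuid.NAMESPACE_DNS, "example.com"))
```

Neither helper could be written against a type that only supports equality, which is the argument for exposing something like a to-bytes function.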

@MatthewJohnHeath force-pushed the main branch 8 times, most recently from f7fc408 to c9ab4e5, September 4, 2024 20:55
@smores56
Collaborator

smores56 commented Sep 9, 2024

Tests are currently failing. @MatthewJohnHeath do you have time to implement the above fixes any time soon? If not, I can help, but it's your PR, so I'll defer to you.

@MatthewJohnHeath
Contributor Author

> Tests are currently failing. @MatthewJohnHeath do you have time to implement the above fixes any time soon? If not, I can help, but it's your PR, so I'll defer to you.

@smores56 It's not really clear to me why the build is hanging (which seems to be what is happening). I will try a few things, but I might need to ask for help.

@MatthewJohnHeath
Contributor Author

@smores56 It's building now. Hopefully I can do those last 2 renamings of functions you asked for tomorrow.

@MatthewJohnHeath force-pushed the main branch 2 times, most recently from b2f8ca2 to fcee861, September 11, 2024 17:25
@MatthewJohnHeath
Contributor Author

If this gets through the tests, would it be okay to merge and then do docs and tests? Or do those need to be there first?

@smores56
Collaborator

It's always preferable to do at least testing, but ideally also docs, in the same PR. We'd usually avoid them if they're going to make the PR too big, or if they're blocking us merging an important change/bug fix. Since it's neither of those, I'd prefer if we can get them in this change. I'm happy to help with either the tests or docs!

Sha256 := { location : U64 }

## Represents the digest of some data produced by the SHA256 cryptographic hashing function as an opaque type.
## `Digest256`implements the `Eq` ability.
Collaborator

nitpick: Digest256(space)implements ...

also, consider adding an empty doc comment between these two doc comments, that'll follow the "one-line summary on top, context beneath" structure for docs.

Digest256 := { firstHalf : U128, secondHalf : U128 } implements [Eq]
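A 32-byte SHA-256 digest fits exactly into two 128-bit integers, which is presumably why the type above has two `U128` fields. The sketch below (Python, not the PR's Roc or Zig code) shows what packing into that shape and converting back to bytes could look like. The byte order is an assumption: big-endian is used here, and the PR does not state which order the real implementation uses. The names `digest_from_bytes` and `digest_to_bytes` are hypothetical.

```python
import hashlib
from typing import NamedTuple


class Digest256(NamedTuple):
    first_half: int   # stands in for a U128
    second_half: int  # stands in for a U128


def digest_from_bytes(raw: bytes) -> Digest256:
    if len(raw) != 32:
        raise ValueError("a SHA-256 digest is exactly 32 bytes")
    # Assumption: big-endian packing of each 16-byte half.
    return Digest256(
        int.from_bytes(raw[:16], "big"),
        int.from_bytes(raw[16:], "big"),
    )


def digest_to_bytes(d: Digest256) -> bytes:
    # Inverse of digest_from_bytes, under the same byte-order assumption.
    return d.first_half.to_bytes(16, "big") + d.second_half.to_bytes(16, "big")


if __name__ == "__main__":
    raw = hashlib.sha256(b"abc").digest()
    packed = digest_from_bytes(raw)
    print(packed == digest_from_bytes(raw))  # equality works field by field
    print(digest_to_bytes(packed) == raw)    # round-trips to the same bytes
```

Equality on the two-integer form matches equality on the raw bytes, so an `Eq`-only type and a to-bytes function are consistent with each other.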

## Returns a `Sha256` to which no data have been added.
emptySha256 : {} -> Sha256
Collaborator

I'd recommend "An empty SHA-256 digest."

And it's "to which no data has been added." https://www.thesaurus.com/e/grammar/data-is-or-data-are/

Contributor Author

@MatthewJohnHeath MatthewJohnHeath Sep 13, 2024

The word shouldn't be "digest", I think. We should save that for the Digest256 returned after "finalize" has been called on the Zig object.
I am struggling to find a good term to describe what the Sha256 object does. My mental model of it is as the state of the algorithm, but that doesn't seem like useful wording here. Something like "An empty SHA-256 hasher", maybe?

I agree that data (EDIT) is in this context; I've had too much time in settings where I get corrected the other way.

import Result
import Str

## Represents, as an opaque type, the state of a SHA256 cryptographic hashing function, after some (or no) data have been added to the hash.
Collaborator

nitpick: it being an opaque type is implied, I don't think we need to explicitly mention that here.

smores56 previously approved these changes Sep 13, 2024
@MatthewJohnHeath
Contributor Author

I'm having some issues with memory when running tests locally. There seems to be a failure in crates/repl_test/src/tests.rs and I can't see how it would be related.

@MatthewJohnHeath
Contributor Author

I am having some trouble running tests locally (insufficient disk space). There seems to be a failure in interpolation_with_nested_strings. I can't see how that could be related.

@MatthewJohnHeath
Contributor Author

> I am having some trouble running tests locally (insufficient disk space). There seems to be a failure in interpolation_with_nested_strings. I can't see how that could be related.

Oh, I think it must be related. I didn't realise how early in the process that test was. I will try to unpick it sometime this week.

@MatthewJohnHeath MatthewJohnHeath marked this pull request as draft December 9, 2024 20:50
4 participants