Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CHIP-0034: keccak and base64 operators #116

Closed
wants to merge 11 commits into from
183 changes: 183 additions & 0 deletions CHIPs/chip-0034.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
CHIP Number | 0034
:-------------|:----
Title | keccak256 and base64url CLVM operators
Description | Add CLVM operators to support Ethereum addresses (keccak256) and passkeys (base64url).
Author | [Cameron Cooper](https://github.com/cameroncooper), [Arvid Norberg](https://github.com/arvidn), [Dan Perry](https://github.com/danieljperry)
Editor | [Freddie Coleman](https://github.com/freddiecoleman)
Comments-URI | [CHIPs repo, PR #116](https://github.com/Chia-Network/chips/pull/116)
Status | Draft
Category | Standards Track
Sub-Category | Chialisp
Created | 2024-04-26
Requires | None
Replaces | None
Superseded-By | None

## Abstract

This CHIP will add three new operators to the CLVM:
* `keccak256` -- for supporting Ethereum addresses
* `base64url_encode` -- for Base64 encoding with a URL- and filename-safe alphabet, to support passkeys
* `base64url_decode` -- for Base64 decoding; assumes the input uses Base64 encoding with a URL- and filename-safe alphabet

The operators will be accessible from behind the `softfork` guard. Hypothetically, in the event of a planned hard fork, the operators could be added to the core CLVM.

## Definitions

Throughout this document, we'll use the following terms:
* **Chialisp** - The high-level [programming language](https://chialisp.com/) from which Chia coins are constructed
* **CLVM** - [Chialisp Virtual Machine](https://chialisp.com/clvm), where the bytecode from compiled Chialisp is executed. Also commonly refers to the compiled bytecode itself
* **Keccak-256** - The same hashing standard Ethereum uses. A few notes on this standard:
* The Keccak-256 standard referred to in this CHIP is specifically version 3 of the [winning submission](https://keccak.team/files/Keccak-submission-3.pdf) to the NIST SHA-3 contest by Bertoni, Daemen, Peeters, and Van Assche in 2011
* This standard is _not_ the same as the final [SHA-3 standard](https://en.wikipedia.org/wiki/SHA-3), which NIST released in 2015
* The EVM uses an opcode called `SHA3`, and Solidity has an instruction called `sha3`. However, these _do not_ refer to the final SHA-3 standard. They _do_ refer to the the same `Keccak-256` standard used in this CHIP
* In order to avoid confusion, we will not use the terms `SHA3`, `sha3`, or `SHA-3` further in this CHIP. Instead, the terms `Keccak-256` and `keccak256` will be used to denote the cryptographic standard used in Ethereum
* **Base64** -- A standard for transforming binary data into a sequence of printable characters, drawn from a set of 64 unique characters. Note that there are multiple Base64 alphabets; this CHIP will use the one commonly referred to as `base64url`, as described in the [Rationale](#rationale) section

## Motivation

CLVM currently includes an atomic operator called [`sha256`](https://chialisp.com/operators/#atoms). This operator calculates and returns the SHA-256 hash of the input atom(s).

This CHIP will add an atomic operator to the CLVM called `keccak256`, which will calculate and return the Keccak-256 hash of the input atom(s). The primary reason to add this operator is to support Ethereum addresses, which also rely on Keccak-256.

This CHIP will also add operators to the CLVM called `base64url_encode` and `base64url_decode`, which will calculate and return the base64url encoding/decoding of the input atoms(s). The primary reason to add the `base64url_encode` operator is to support passkeys that rely on base64url encoding. `base64url_decode` is being added for completeness.

For more information about the passkeys this CHIP intends to support, see the following specifications:
* [W3C Web Authentication](https://www.w3.org/TR/webauthn-2/)
* [Client to Authenticator Protocol (CTAP)](https://passkeys.dev/docs/reference/specs/)

Note that each of this CHIP's operators could have other use cases not described here.

## Backwards Compatibility

If this CHIP is accepted, the new operators described within will be added to the CLVM. If the operators could be called directly, they would break compatibility, and therefore would require a hard fork of Chia's blockchain. However, CLVM includes a [softfork operator](https://chialisp.com/operators/#softfork) specifically to define new CLVM operators without requiring a hard fork. This CHIP will use the `softfork` operator, which itself requires a soft fork of Chia's blockchain. After the the fork's activation, the operators will need to be called from inside the `softfork` guard. They will use extension `1`, as detailed in the [softfork usage](#softfork-usage) section.

As with all forks, there will be a risk of a chain split. The soft fork could also fail to be adopted. This might happen if an insufficient number of nodes have upgraded to include the changes introduced by this CHIP prior to the fork's block height.

The operators will be introduced in multiple phases:
* **Pre-CHIP**: Prior to block `[todo]`, the new operators will be undefined. Any attempt to call them will return `NIL`.
* **Soft fork**: A soft fork will activate at block `[todo]`. From that block forward, the new operators will exhibit the functionality laid out in this CHIP. They will need to be called from inside the `softfork` guard.
* **Hard fork** (hypothetical): In the event that a hard fork is enacted after the code from this CHIP has been added to the codebase, this hypothetical hard fork could include adding the operators from this CHIP to the core CLVM operator set. If this were to happen, the operators could be also be called from outside the `softfork` guard. (Note that the operators would still be callable from inside the `softfork` guard if desired.)

## Rationale

This CHIP's design was primarily chosen to support Ethereum addresses and passkeys. It was implemented in a manner consistent with the Keccak-256 and Base64 standards.

While the keccak256 and base64url operators are not related, they each require soft forks in order to be made available from inside the `softfork` guard. Therefore, they were grouped together in this CHIP, to be activated with the same soft fork.

Each of the new operators will incur a CLVM cost, as detailed below. If this CHIP is adopted, the new operators will be optional when designing Chia coins.

The Base64 alphabet used in this CHIP is specified in [Section 5 of the RFC 4648 Standard](https://www.rfc-editor.org/rfc/rfc4648.html#section-5), and is referred to as `base64url`. This is also the alphabet used by the W3C in their [Web Authentication standard](https://www.w3.org/TR/webauthn-2/#base64url-encoding).

Each Base64 character is encoded with six bits, and decoded to eight bits. Therefore, the ending of some Base64 output strings can use [padding](https://en.wikipedia.org/wiki/Base64#Output_padding). Some Base64 alphabets require padding, and others make padding optional. In the specific case of `base64url`, padding is optional. The W3C chose to omit the padding, so we have chosen to do the same. The `base64url` operators in this CHIP _do not_ add or assume any padding.

## Specification

### `keccak256`

Opcode: [todo]

Functionality: Calculate and return the Keccak-256 hash of the input atom(s)
arvidn marked this conversation as resolved.
Show resolved Hide resolved

Arguments:
* If zero arguments, the Keccak-256 hash of an empty string will be returned
* If one or more arguments:
1. Each argument (if more than one) will be concatenated
2. The Keccak-256 hash of the resulting string will be returned

Usage: `(keccak256 A B …)`

CLVM Cost: `[todo]` base, `[todo]` per argument, `[todo]` per byte

### `base64url_encode`

Opcode: [todo]

Functionality: Calculate and return the base64url encoding of the input atom(s)

Arguments:
* If zero arguments, the result will be the base64url encoding of an empty string (`NIL`), in other words, the result will be `NIL`
* If one or more arguments:
1. Each argument (if more than one) will be concatenated
2. The base64url encoding (following [Section 5 of the RFC 4648 Standard](https://www.rfc-editor.org/rfc/rfc4648.html#section-5) with no padding) of the resulting string will be returned

Usage: `(base64url_encode A B …)`

CLVM Cost: `[todo]` base, `[todo]` per argument, `[todo]` per byte

### `base64url_decode`

Opcode: [todo]

Functionality: Return the decoded form of the `base64url` input string

Arguments:

Exactly one argument is required. If there is not exactly one argument, an exception will be raised.

The argument must be in `base64url` format, encoded according to [Section 5 of the RFC 4648 Standard](https://www.rfc-editor.org/rfc/rfc4648.html#section-5). It must not be padded with `=` or any other characters. If the argument is not encoded in `base64url` format, or if it includes any padding, or if it includes any additional characters that are not part of the `base64url` alphabet, an exception will be raised.

Usage: `(base64url_decode A)`

CLVM Cost: `[todo]` base, `[todo]` per argument, `[todo]` per byte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for base64 encoding & decoding? I understand it's something to do with ETH. Where can I find more details?

Including base64 data instead of the corresponding binary data on chain seems like a mistake, since reading base64 data would come from a computer before being dumped on chain, and there are plenty of opportunities for the reading client to decode before dumping on chain. Maybe ETH does this, due to some bad design somewhere. But I worry we're proposing changes to clvm to make simpler interoperability with bad design on ETH or elsewhere, which just encourages people to repeat those design errors on chia.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base64 is in support of passkey support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a signature must be validated against a base64 message, but you need to validate the message matches something within the puzzle, you have two options:

  • Give the puzzle the encoded base64 message and decode it, then check it matches
  • Give the puzzle the original message, check it matches, then base64 encode it before validating the signature

Both of these require base64 opcodes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keccak is for Ethereum address compatibility - it allows an application to construct a Chia puzzle that is only spendable by whoever owns the private key corresponding to the Ethereum address. Which means you can send XCH to an Ethereum address with a little bit of wrapping behind the scenes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And base64 allows you to have a puzzle that can only be spent by including a signature created with a passkey (meaning you don't need to have access to the original secret key, just the signer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn’t have anything to do with ETH.

When passkeys are used via the Web Authentication API, they always encode the challenge (message) with base64 before signing it.

That’s why we need to encode the message in the puzzle too, so we can validate the passkey signature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greimela Appreciate the response. Do you have a reference where I can learn more about this? It seems surprising to me that they couldn't sign binary data directly and I'm wondering what possible motivation there could be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted some links here: #116 (comment) and here: #116 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks, those are the ones!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at this more today and I think I understand this better now. WebAuthn apis sign JSON messages of a certain format, ie. certain keys must be present and must have specific formats, so that you don't sign arbitrary messages with arbitrary keys. For details, see https://www.w3.org/TR/webauthn-2/#client-data

So the thing that's signed is a JSON blob which includes a challenge: key, which is server-determined. In the prototype, we hack it to be a hash of the target conditions for the coin. Validating requires checking the signature, which is against the entire JSON blob, and then extracting the portion of the JSON blob (via an offset hint) that points to the base64 data, and comparing it against the delegated puzzle.


### `softfork` usage

As explained in the [Backwards Compatibility](#backwards-compatibility) section, starting with block `[todo]`, the operators introduced in this CHIP will be available from inside the `softfork` guard.

The following rules apply when calling the operators from this CHIP using the `softfork` operator:
* The operator works like `apply` (`a`) but takes two additional parameters, `cost` and `extension`
* The syntax is therefore `softfork (<cost> <extension> <quoted-program> <arguments>)`
* The `cost` parameter is the CLVM cost of executing the `quoted-program` with the specified `arguments`. If this cost mismatches the actual cost of executing the program, the `softfork` operator will raise an exception
* The `extension` parameter is an integer indicating the set of extensions available in the `softfork` guard.
* `extension 0` referred to the `Coin ID`, `BLS`, etc operators described in [CHIP-11](https://github.com/Chia-Network/chips/blob/main/CHIPs/chip-0011.md).
* `extension 1` will refer to the operators from this CHIP, _as well as_ those from `extension 0`.
* Just like the `a` operator, the `quoted-program` parameter is quoted and executed from inside the `softfork` guard.
* A client that does not recognize the `extension` specifier must:
* In consensus mode, ignore the whole `softfork`, return `null` and charge the specified `cost`
* In mempool mode, fail the program immediately

Since `softfork` guards always return null, the only useful outcome of executing one is terminating (i.e. failing) the program or not.

The cost of executing the `softfork` operator is 140. This counts against the cost specified in its arguments.

An example of using the `softfork` operator to call the new `base64url_encode` operator is as follows:

```
(softfork
(q . [`todo`]) ; expected cost (including cost of softfork itself)
(q . 1) ; extension 1
(q a ; defer execution of if-branches
(i
(=
(base64url_encode
(q . example)
(q . string)
)
(q . ZXhhbXBsZXN0cmluZw)
)
(q . 0) ; if encoding matches, return 0
(q x) ; if encoding mismatches, raise
)
(q . ())) ; environment to apply
(q . ())) ; environment to softfork
```

## Test Cases

[todo]

## Reference Implementation

See [PR #403](https://github.com/Chia-Network/clvm_rs/pull/403) of the `clvm_rs` GitHub repository for the implementation of this CHIP.

## Security

[todo]

## Additional Assets

None

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
Loading