Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth_getProof implementation does not conform to EIP-1186 #27306

Closed
prestwich opened this issue May 19, 2023 · 10 comments · Fixed by #27309
Closed

eth_getProof implementation does not conform to EIP-1186 #27306

prestwich opened this issue May 19, 2023 · 10 comments · Fixed by #27309
Labels

Comments

@prestwich
Copy link
Contributor

prestwich commented May 19, 2023

Apologies for excessive links, have tried to cite all sources

Description

1186 is inconsistent wrt specification of input to eth_getProof. It specifies the 2nd element of the param array a an array of 32-byte keys, and refers to the eth_getStorageAt call, which is specced in 1474 and on the ethereum website. 1474 specifies the input as a Quantity.

Currently Geth accepts as input all of:

  • 32-byte hex strings (1186 eth_getProof spec)
  • Quantity encoded integers (1474 eth_getStorageAt spec)
  • Non-quantity hex strings of length 2 - 31 bytes (neither spec)

The issue arises because response.storageProof.key is specced as a Quantity, however, geth simply mirrors the input into the response object. When the input is a non-Quantity hex string (e.g. 0x000000005 , the returned response.storageProof[].key is also not minimal. This is outside the spec for Quantity, and is off-spec for 1186

Both 1186 and 1474 are currently status Stagnant, and may not be reliable. Permissive deserialization seems desirable. Providing non-canonical returns to the user seems undesirable

This was previously mentioned in a besu issue as a result of failing hive tests with geth producing non-canonical Quantities in responses. I am unsure if it was ever brought up here. The Besu team opted to change their RPC behavior to always represent this value as an even-length hex string, which is also off-spec.

Requested change is to always Quantity-serialize the key element in the response, rather than simply mirroring the input.

Expected behaviour

eth_getProof response object should Quantity-encode `storageProof.keys, regardless of whether the input was quantity-encoded

Expected output :

{
  "jsonrpc": "2.0",
  "id": 44,
  "result": {
    "address": "0x4976fb03c32e5b8cfe2b6ccb31c09ba78ebaba41",
    "accountProof": [
         ...
    ],
    "balance": "0x0",
    "codeHash": "0x0370d116c18db61a2f90643faa12e3ae22a0b7d518ce868fe3f85e2f0d299fe5",
    "nonce": "0x1",
    "storageHash": "0xd14ec04ecff0cf0846dcb185103fbb656f2612f74cee91a547e03ed037cfff0b",
    "storageProof": [
      {
        "key": "0x5",  // <----- specced encoding
        "value": "0x0",
        "proof": [
            ...
        ]
      }
    ]
  }
}

Steps to reproduce the behaviour

$ curl localhost:8545 \     
    -X POST \                
    -H "Content-Type: application/json" \
    -d '{"method":"eth_getProof", "params":["0x4976fb03c32e5b8cfe2b6ccb31c09ba78ebaba41",["0x000000005"], "latest"],"id":44,"jsonrpc":"2.0"}' \
    | jq

Actual behaviour

see the 0x000000005 storageProof.key`. This is not a Quantity, and is outside of spec for the endpoint

{
  "jsonrpc": "2.0",
  "id": 44,
  "result": {
    "address": "0x4976fb03c32e5b8cfe2b6ccb31c09ba78ebaba41",
    "accountProof": [
         ...
    ],
    "balance": "0x0",
    "codeHash": "0x0370d116c18db61a2f90643faa12e3ae22a0b7d518ce868fe3f85e2f0d299fe5",
    "nonce": "0x1",
    "storageHash": "0xd14ec04ecff0cf0846dcb185103fbb656f2612f74cee91a547e03ed037cfff0b",
    "storageProof": [
      {
        "key": "0x000000005", // <----- Off-spec encoding
        "value": "0x0",
        "proof": [
            ...
        ]
      }
    ]
  }
}
@fjl
Copy link
Contributor

fjl commented May 30, 2023

It is what it is. The API spec at https://github.com/ethereum/execution-apis defines the storage keys as bytes (not QUANTITY). The EIP might be more correct but it's old, slightly inconsistent and can't be changed anymore.

The confusion about this has unfortunately leaked into the spec a little bit, but let's just leave it how it is.

@fjl fjl closed this as completed May 30, 2023
@fjl
Copy link
Contributor

fjl commented May 30, 2023

If there is a specific hive test that should be changed to better reflect the spec's intended meaning, we could do that.

@prestwich
Copy link
Contributor Author

Changing output to bytesMax32 defined as in the execution spec as "32 hex encoded bytes" would also be a good resolution to this issue. The problem arises because the node returns keys that are neither valid Quantity nor valid bytesMax32. Which is to say, whichever spec version we use, the node does not conform to it

@fjl
Copy link
Contributor

fjl commented May 30, 2023

Geth is a bit more lenient than the spec, mostly for historical reasons. So you can, for example, submit a request for storage key 0f1 and it will be treated as 0x00f1. I agree we do mirror the request value into the output, but an invalid output only occurs for slightly invalid input (missing prefix or odd length).

However Geth's behavior should not be used as the example here. Again, if there is a specific RPC test containing the Geth behavior of mirroring an invalid value, we could always change the test.

I don't think it's a good idea to change Geth's handling of input for this request, because that would be an incompatible API change. In our internal discussion about this issue, I was in favor of changing the behavior to mirror the 'sanitized' value into the output, but other people in the team were against it.

@prestwich
Copy link
Contributor Author

To add some context, I've already encountered this with stored JSON proofs in the wild. I'm not sure which RPC client is sending invalid input, but it results in either Geth's incorrect return types propagating through client libraries, or a set of real-world proof JSON response which ethers-rs cannot parse. In addition, I've already pointed out one case above where the poor specification and incorrect behavior caused incorrect changes in another execution client. So the idea that Geth's behavior should not be used as an example is a bit ivory-tower-y. RPC client implementers must handle the behavior, and EL implementers must encounter it during compatibility testing

Okay to recap, (and please correct any of these points if i've misread)

  • Geth produces invalid output for certain inputs
    • This issue arises because the geth node does not strictly validate inputs, and mirrors them to outputs
  • This invald output may or may not be supported by any given RPC client
  • When Geth produces valid outputs they follow this spec
    • Geth does not follow the 1186 spec
  • You would not recommend that other EL client implementors follow Geth's behavior,
    • Instead you would recommend they return validly-encoded bytes
  • Team acknowledges that the output is incorrect for either spec
    • However, no change will be made to Geth's behavior.

Please tag wontfix for future readers 🙏

@fjl fjl reopened this May 30, 2023
@fjl
Copy link
Contributor

fjl commented May 30, 2023

I closed this because we got stuck on the QUANTITY vs bytesMax32 thing in our internal discussion. It just felt like needless bike-shedding honestly.

However, there is a legitimate issue here where geth can create outputs that do not match the execution-apis spec (which is canon!). I think we could fix that in one of two ways:

  • Return the canonicalized value of key in responses. We'd still be accepting slightly invalid inputs, but at least the output would be spec compliant.
  • Make input validation stricter, i.e. make the 0x prefix mandatory.

@prestwich
Copy link
Contributor Author

Yes, I am not married to either quantity or bytes, but I would really appreciate it if the output were canonical for one of those formats

@fjl
Copy link
Contributor

fjl commented May 30, 2023

You would not recommend that other EL client implementors follow Geth's behavior,
Instead you would recommend they return validly-encoded bytes

A spec-compliant server implementation should validate the parameter as bytesMax32, which would then rule out invalid outputs even when the input value is simply copied.

fjl added a commit that referenced this issue Jun 21, 2023
#27309)

This changes the eth_getProof method implementation to re-encode the requested
storage keys, canonicalizing them in the response. For backwards-compatibility reasons,
go-ethereum accepts non-canonical hex keys. Accepting them is fine, but we should
not mirror invalid inputs into the output.

Closes #27306

---------

Co-authored-by: Martin Holst Swende <[email protected]>
Co-authored-by: Felix Lange <[email protected]>
@Rjected
Copy link
Contributor

Rjected commented Jul 24, 2023

The hive test looks like it's fixed (aka does not require the mirrored key string):
ethereum/execution-apis@a4bd432#diff-fc9173279353e40d76a771bb3fb398876d028e15f10824f1fed4272cdfff3e11R2

devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this issue Nov 10, 2023
ethereum#27309)

This changes the eth_getProof method implementation to re-encode the requested
storage keys, canonicalizing them in the response. For backwards-compatibility reasons,
go-ethereum accepts non-canonical hex keys. Accepting them is fine, but we should
not mirror invalid inputs into the output.

Closes ethereum#27306

---------

Co-authored-by: Martin Holst Swende <[email protected]>
Co-authored-by: Felix Lange <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants