Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate invalid tx validation panics #124

Closed
PopcornPaws opened this issue Mar 27, 2023 · 4 comments
Closed

Investigate invalid tx validation panics #124

PopcornPaws opened this issue Mar 27, 2023 · 4 comments
Assignees
Labels
bug something isn't working as intended

Comments

@PopcornPaws
Copy link
Contributor

Description

Upon submitting an invalid transaction to the pool, the tx pool api panics upon validation with

Bad input data provided to validate_transaction: Invalid transaction version

It seems that this is a completely normal phenomenon
image

Nevertheless, since all nodes were compiled with

[profile.release]
panic = "abort"

any node that receives an invalid transaction will crash.

Solution

Should check whether it's possible to solve this via a runtime upgrade. However, this seems to be so deeply ingrained (behind substrate macros) in the code that it probably cannot be changed without forking and modifying some substrate dependencies. The other option is to recompile all nodes with panic = "unwind" and restart the chain.

@PopcornPaws PopcornPaws added the bug something isn't working as intended label Mar 27, 2023
@PopcornPaws PopcornPaws self-assigned this Mar 27, 2023
@PopcornPaws
Copy link
Contributor Author

The culprit is line 91 in primitives/api/proc-macro/src/impl_runtime_apis.rs where there is an explicit panic for cases where some input cannot be properly decoded.

This is within a macro and I guess there's no other way for the code to emit a helpful error message when this fails. After expanding the macro, it is seen that these panics are in a function called dispatch which returns an Option<Vec<u8>>, not a Result. We could add a println macro or log::error!(...) but that doesn't provide a stack trace for debugging. But the biggest problem is that I don't see a way this would be changed in parity's version, thus we would be stuck with our substrate fork.

Thus, I think every node operator needs to recompile their node with panic = "unwind". But at that point, we might consider restarting the whole chain from the current main branch with all runtime upgrades merged. Let me know what do you think @OnyxSkyscape

@PopcornPaws
Copy link
Contributor Author

Some more proof that the explicit panic is here to stay:

  • directly from Polkadot Cargo.toml
[profile.release]
# Substrate runtime requires unwinding.
panic = "unwind"
  • substrate-parachain-template/Cargo.toml also has panic = "unwind"
  • substrate-node-template/Cargo.toml also has panic = "unwind"

@PopcornPaws
Copy link
Contributor Author

The panic flag was changed from unwind to abort in #24 because we were looking for ways to decrease the compilation time. I think it was based on this substrate issue.

According to the book using panic = "abort" may speed up compilation because less code is generated (for handling thread cleanup during unwinding). Unfortunately, the significance of this flag was not clear to us back then and it was left in the code.

@PopcornPaws
Copy link
Contributor Author

In #123 , runtime version 103 was built with panic = "unwind", and the node no longer crashes upon receiving an invalid transaction. This was tested on elpis successfully as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something isn't working as intended
Projects
None yet
Development

No branches or pull requests

1 participant