Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p2p: failure to read complete message #1392

Open
tony-iqlusion opened this issue Feb 12, 2024 · 3 comments · Fixed by #1393
Open

p2p: failure to read complete message #1392

tony-iqlusion opened this issue Feb 12, 2024 · 3 comments · Fixed by #1393
Assignees
Labels
bug Something isn't working p2p Peer-to-peer networking

Comments

@tony-iqlusion
Copy link
Collaborator

What went wrong?

There are ongoing reports of failure to read a complete message in the tendermint-p2p crate which seem to be manifesting as an incomplete Protobuf message in TMKMS:

iqlusioninc/tmkms#729

Steps to reproduce

Unfortunately this seems very chain-specific and I have not been given proper repro steps myself in the aforementioned issue.

Definition of "done"

It's unclear what actually needs to change here.

A somewhat related issue is the API that tendermint-p2p exposes: it's a message-oriented protocol, but the interface uses the std::io::{Read, Write} traits, which are stream-oriented. It's possible TMKMS is misusing the API, but it's the wrong abstraction to begin with.

@tony-iqlusion
Copy link
Collaborator Author

This might be related to #1356, although curiously people claim things work with Unix domain sockets and the errors only occur over TCP.

I did confirm read_exact is used: https://github.com/informalsystems/tendermint-rs/blob/64e9b33/p2p/src/secret_connection.rs#L622

@mzabaluev mzabaluev self-assigned this Feb 16, 2024
@mzabaluev mzabaluev added the p2p Peer-to-peer networking label Feb 16, 2024
@tony-iqlusion
Copy link
Collaborator Author

We're continuing to receive reports of this problem even with tendermint-p2p v0.34.1: iqlusioninc/tmkms#729 (comment)

@tony-iqlusion
Copy link
Collaborator Author

There's now a PR to tmkms to have it aggregate chunks of data itself and keep trying to decode them as a proto: iqlusioninc/tmkms#903

...and it really seems like tendermint-p2p should probably be doing that. Really I think the API should be refactored to be message-oriented, rather than stream-oriented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working p2p Peer-to-peer networking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants