Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

One send packet stuck on Osmosis channel channel-0 to Cosmos-Hub #2993

Closed
6 tasks
Tracked by #3005
adizere opened this issue Jan 9, 2023 · 5 comments
Closed
6 tasks
Tracked by #3005

One send packet stuck on Osmosis channel channel-0 to Cosmos-Hub #2993

adizere opened this issue Jan 9, 2023 · 5 comments
Assignees
Labels
A: bug Admin: something isn't working E: osmosis External: related to Osmosis I: rpc Internal: related to (g)RPC O: reliability Objective: cause to improve trustworthiness and consistent performing

Comments

@adizere
Copy link
Member

adizere commented Jan 9, 2023

Summary of Bug

@faddat discovered this SendPacket stuck last week.

Version

v1.2

Steps to Reproduce

  • Using clear packets, Hermes is not able to pull the packet data for this packet.
hermes clear packets --chain osmosis-1 --port transfer --channel channel-0
2023-01-09T11:24:50.964255Z  INFO ThreadId(01) using default configuration from '/Users/faddat/.hermes/config.toml'
2023-01-09T11:24:50.964627Z  INFO ThreadId(01) running Hermes v1.2.0+00fe26ed
2023-01-09T11:25:04.664258Z  INFO ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}: 2 unreceived packets found: [ 1275507, 1642799 ]
2023-01-09T11:25:04.664347Z  WARN ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:query_send_packet_events{chain=osmosis-1 height=<=1-7722248 sequences=[Sequence(1275507), Sequence(1642799)]}: will do QueryPacketEventDataRequest from src_chain
2023-01-09T11:25:04.664421Z  WARN ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:query_send_packet_events{chain=osmosis-1 height=<=1-7722248 sequences=[Sequence(1275507), Sequence(1642799)]}: query_packets_from_txs
2023-01-09T11:25:04.664458Z  WARN ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:query_send_packet_events{chain=osmosis-1 height=<=1-7722248 sequences=[Sequence(1275507), Sequence(1642799)]}: /tx_search for request QueryPacketEventDataRequest { event_id: SendPacket, source_channel_id: ChannelId("channel-0"), source_port_id: PortId("transfer"), destination_channel_id: ChannelId("channel-141"), destination_port_id: PortId("transfer"), sequences: [Sequence(1275507), Sequence(1642799)], height: SmallerEqual(Specific(Height { revision: 1, height: 7722248 })) }
2023-01-09T11:25:18.475936Z  WARN ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}: encountered query failure while pulling packet data: RPC error to endpoint https://rpc-osmosis-archive-ia.cosmosia.notional.ventures/: serde parse error: EOF while parsing a string at line 188 column 524468
SUCCESS []

The main part is serde parse error: EOF while parsing a string at line 188 column 524468.

  • Jacob also discovered, interestingly, that using tx packet-recv (instead of clear packets) gets Hermes more progress
2023-01-09T14:30:00.924712Z TRACE ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:query_send_packet_events{chain=osmosis-1 height=<=1-7724204 sequences=[Sequence(1275507)]}: end_block_events []
2023-01-09T14:30:00.926956Z  INFO ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}: pulled packet data for 1 events out of 1 sequences: 1275507..=1275507; events.total=1 events.left=0
2023-01-09T14:30:01.395287Z TRACE ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:generate_operational_data{tracking_id=packet-recv}: processing event event=SendPacket(SendPacket { packet: seq:1275507, path:channel-0/transfer->channel-141/transfer, toh:no timeout, tos:Timestamp(2022-09-08T22:35:35Z)) }) at height 1-7724204
2023-01-09T14:30:03.331731Z TRACE ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:generate_operational_data{tracking_id=packet-recv}: build timeout for channel packet=seq:1275507, path:channel-0/transfer->channel-141/transfer, toh:no timeout, tos:Timestamp(2022-09-08T22:35:35Z)) height=4-13599968
2023-01-09T14:30:03.879642Z TRACE ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:generate_operational_data{tracking_id=packet-recv}: built timeout msg packet=seq:1275507, path:channel-0/transfer->channel-141/transfer, toh:no timeout, tos:Timestamp(2022-09-08T22:35:35Z)) height=4-13599969
2023-01-09T14:30:03.880406Z TRACE ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:generate_operational_data{tracking_id=packet-recv}: collected event msg.type_url=/ibc.core.channel.v1.MsgTimeout event=SendPacket(SendPacket { packet: seq:1275507, path:channel-0/transfer->channel-141/transfer, toh:no timeout, tos:Timestamp(2022-09-08T22:35:35Z)) }) at height 1-7724204
2023-01-09T14:30:03.880676Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:schedule{odata=packet-recv ->Source @4-13599968; len=1}: connection delay need not be taken into account: client update message will be prepended later
2023-01-09T14:30:03.880738Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}: retrying retry.current=1 retry.max=5
2023-01-09T14:30:03.880996Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}: prepending Source client update at height 4-13599969
2023-01-09T14:30:06.736708Z TRACE ThreadId(21) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:foreign_client.wait_and_build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}:foreign_client.build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}: light client verification trusted=4-13599963 target=4-13599969
2023-01-09T14:30:21.291374Z TRACE ThreadId(21) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:foreign_client.wait_and_build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}:foreign_client.build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}: adjusting headers with 0 supporting headers trusted=4-13599963 target=13599969
2023-01-09T14:30:21.291431Z TRACE ThreadId(21) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:foreign_client.wait_and_build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}:foreign_client.build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}: fetching header height=4-13599964
2023-01-09T14:30:28.609044Z DEBUG ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:foreign_client.wait_and_build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}:foreign_client.build_update_client_with_trusted{client=cosmoshub-4->osmosis-1:07-tendermint-1 target_height=4-13599969}: building a MsgUpdateAnyClient from trusted height 4-13599963 to target height 4-13599969
2023-01-09T14:30:28.611898Z  INFO ThreadId(01) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}: assembled batch of 2 message(s)
2023-01-09T14:30:29.582360Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}: sending 2 messages as 2 batches to chain osmosis-1 in parallel
2023-01-09T14:30:29.582596Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=8}: max fee, for use in tx simulation: Fee { amount: "7501uosmo", gas_limit: 3000000 }
2023-01-09T14:30:31.904180Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=8}:estimate_gas: tx simulation successful, gas amount used: 520049
2023-01-09T14:30:31.904324Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=8}: send_tx: using 520049 gas, fee Fee { amount: "1431uosmo", gas_limit: 572053 } id=osmosis-1
2023-01-09T14:30:34.297676Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=8}: gas estimation succeeded
2023-01-09T14:30:34.297749Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=8}: tx was successfully broadcasted, increasing account sequence number response=Response { code: Ok, data: b"", log: "[]", hash: Hash::Sha256(6D81E1214BA130B3B232AAD8C6F6ED1B551B46E6F5A42FF5ABF4E6A6BF91AC75) } account.sequence.old=8 account.sequence.new=9
2023-01-09T14:30:34.299232Z DEBUG ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=9}: max fee, for use in tx simulation: Fee { amount: "7501uosmo", gas_limit: 3000000 }
2023-01-09T14:30:39.801049Z ERROR ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=9}:estimate_gas: failed to simulate tx. propagating error to caller: gRPC call failed with status: status: Internal, message: "protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 413 Payload Too Large", details: [], metadata: MetadataMap { headers: {"server": "nginx/1.22.0", "date": "Mon, 09 Jan 2023 14:30:39 GMT", "content-type": "text/html", "content-length": "183"} }
2023-01-09T14:30:39.801182Z ERROR ThreadId(11) relay_recv_packet_and_timeout_messages{src_chain=osmosis-1 src_port=transfer src_channel=channel-0 dst_chain=cosmoshub-4}:relay{odata=packet-recv ->Source @4-13599968; len=1}:send_messages_and_wait_commit{chain=osmosis-1 tracking_id=packet-recv}:send_tx_with_account_sequence_retry{chain=osmosis-1 account.sequence=9}: gas estimation failed or encountered another unrecoverable error error=gRPC call failed with status: status: Internal, message: "protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 413 Payload Too Large", details: [], metadata: MetadataMap { headers: {"server": "nginx/1.22.0", "date": "Mon, 09 Jan 2023 14:30:39 GMT", "content-type": "text/html", "content-length": "183"} }
ERROR link error: link failed with underlying error: gRPC call failed with status: status: Internal, message: "protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 413 Payload Too Large", details: [], metadata: MetadataMap { headers: {"server": "nginx/1.22.0", "date": "Mon, 09 Jan 2023 14:30:39 GMT", "content-type": "text/html", "content-length": "183"} }

The main part here is HTTP error 413, likely because of nginx frontend to Notional RPC nodes.

Acceptance Criteria

  • packet sequence number 1,275,507 relayed

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@adizere adizere added A: bug Admin: something isn't working A: blocked Admin: blocked by another (internal/external) issue or PR I: rpc Internal: related to (g)RPC E: osmosis External: related to Osmosis labels Jan 9, 2023
@adizere adizere self-assigned this Jan 9, 2023
@github-project-automation github-project-automation bot moved this to 🩹 Triage in Hermes Jan 9, 2023
@seanchen1991 seanchen1991 moved this from 🩹 Triage to 🏗 In progress in Hermes Jan 9, 2023
@adizere
Copy link
Member Author

adizere commented Jan 9, 2023

Working on this is blocked on an available node that has the packet data for the stuck packet. Right now, we're using a Notional archive node and working in tandem with Jacob to debug.

@adizere adizere added the O: reliability Objective: cause to improve trustworthiness and consistent performing label Jan 9, 2023
@doggystylez
Copy link

cleared with hermes 1.1.0 and a whole lot of gas

@romac
Copy link
Member

romac commented Jan 10, 2023

@doggystylez Sweet! Did it just go through with clear packets or were there additional steps to be taken?

@adizere adizere moved this from 🏗 In progress to ✅ Done in Hermes Jan 10, 2023
@adizere adizere moved this from ✅ Done to 🏗 In progress in Hermes Jan 10, 2023
@adizere adizere closed this as completed Jan 10, 2023
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Hermes Jan 10, 2023
@adizere
Copy link
Member Author

adizere commented Jan 10, 2023

Way to go @doggystylez!

We'll coordinate with the IBC-go team & SDK to see where's the best place for implementing preventive measures. A proof-of-concept solution for preventing these kind of corner-cases in the future exists here: cosmos/ibc-go#2983.

@adizere adizere removed the A: blocked Admin: blocked by another (internal/external) issue or PR label Jan 10, 2023
@doggystylez
Copy link

@romac clear packets worked

what I did:

  • used a remote "naked" rpc endpoint (not being proxied)
  • used a local grpc endpoint

I noted that the osmosis app.toml has a setting for grpc max-recv-msg-size (which conveniently defaults to 10MB in the latest version), going to check this out further

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: bug Admin: something isn't working E: osmosis External: related to Osmosis I: rpc Internal: related to (g)RPC O: reliability Objective: cause to improve trustworthiness and consistent performing
Projects
Status: ✅ Done
Development

No branches or pull requests

3 participants