Account sequence errors from connecting to partially synced chains #1462

Closed
5 tasks
Tracked by #1397
faddat opened this issue Oct 17, 2021 · 6 comments
Labels
A: blocked Admin: blocked by another (internal/external) issue or PR A: bug Admin: something isn't working I: dependencies Internal: related to dependencies I: logic Internal: related to the relaying logic O: usability Objective: cause to improve the user experience (UX) and ease using the product

@faddat
Contributor

faddat commented Oct 17, 2021

Crate

relayer-cli

Summary of Bug

If Hermes is started while a chain is syncing, it picks up old account sequence values and then repeatedly attempts to use them.

The temporary solution is "don't start Hermes against a partially synced chain".

The permanent solution is probably to actively check the account sequence.

Version

0.7.3 + the compatibility PR #1461

Steps to Reproduce

Heheh, little bit of a hard one here, but I guess it's approximately:

Using a wallet that has already been used for relaying, start Hermes against a partially synced Juno or Sifchain. What is likely to occur here is that Hermes will keep trying to make a tx with an invalid sequence number.

Another solution to this may be to disable hermes from interacting with partially-synced chains.

Acceptance Criteria

It should no longer be possible for Account sequence issues to crop up because it was connected to partially-synced chains from a previously used wallet.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@faddat faddat changed the title Account sequence errors Account sequence errors from connecting to partially synced chains Oct 17, 2021
@adizere
Member

adizere commented Oct 18, 2021

Hey Jacob,

This is a tough one! In general, Hermes cannot trust account sequence numbers from a full node unless that node (a) is synced with the network and (b) has a clean mempool (ref). For (b) we don't have a solution yet, but we're working with the tm-go team to find one. The problem you're highlighting here falls under case (a).

With respect to (a), it's not clear how Hermes can reliably infer whether the node is synced with the network. For SDK-based chains there is the option to query the node's status endpoint and read the catching_up field; if it is true, then the node is clearly not synced; if it is false, then it's actually not clear what the status is, because the node may be out of sync and be unaware (the false value is unreliable).
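To make the flag's limits concrete, here is a minimal sketch of reading it from a status response. It assumes the usual Tendermint RPC shape, where `result.sync_info` carries `latest_block_height` (as a string) and `catching_up`. The helper names `read_sync_info` and `sync_verdict` and the sample body are hypothetical, made up for illustration; this is not how Hermes does it.

```python
import json
from typing import Tuple


def read_sync_info(status_body: str) -> Tuple[int, bool]:
    """Extract (latest_block_height, catching_up) from a /status response body."""
    sync_info = json.loads(status_body)["result"]["sync_info"]
    # Tendermint encodes heights as JSON strings, so convert explicitly.
    return int(sync_info["latest_block_height"]), bool(sync_info["catching_up"])


def sync_verdict(catching_up: bool) -> str:
    """Map the flag to what it actually tells us."""
    # True is trustworthy: the node itself reports it is still syncing.
    # False only means the node believes it is synced; it may still be behind.
    return "not synced" if catching_up else "unknown"


# Trimmed, made-up response body for illustration only.
SAMPLE = '{"result": {"sync_info": {"latest_block_height": "1200", "catching_up": false}}}'

height, catching_up = read_sync_info(SAMPLE)
print(height, sync_verdict(catching_up))
```

The point of returning "unknown" instead of "synced" is that a false flag alone should not be treated as proof that the account sequence can be trusted.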

Besides the catching_up flag, is there any API that Hermes can use to infer whether the full node is synced? Any insights appreciated!

@adizere adizere added A: bug Admin: something isn't working I: logic Internal: related to the relaying logic O: usability Objective: cause to improve the user experience (UX) and ease using the product labels Oct 18, 2021
@adizere adizere added this to the 11.2021 milestone Oct 18, 2021
@MasterPi-2124

Hi @adizere, thank you for your comment! We checked our chains, in this case, Juno, and it's unsynced with catching_up=false, but other chains are unsynced too, and they are still healthy.

@mircea-c

catching_up is not really reliable for chain sync. You can have a chain that shows catching_up=false but it's really still behind the head of the chain. I've experienced this with multiple chains.

The only way, currently, to accurately gauge this is to compare block heights from multiple nodes.
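A rough sketch of that comparison, assuming the heights have already been fetched from each node by some other means. The function name `is_lagging` and the `max_lag` tolerance of 5 blocks are hypothetical choices for illustration, not values taken from Hermes or from any chain's documentation.

```python
from typing import Iterable


def is_lagging(local_height: int, peer_heights: Iterable[int], max_lag: int = 5) -> bool:
    """Return True if the local node trails the highest peer by more than max_lag blocks."""
    peers = list(peer_heights)
    if not peers:
        # Without at least one other node there is nothing to compare against.
        raise ValueError("need at least one peer height to compare")
    # Use the highest height seen as the best available estimate of the chain head.
    reference = max(peers)
    return reference - local_height > max_lag


# Made-up heights: the local node is roughly 800 blocks behind two public nodes.
print(is_lagging(1200, [2001, 1999]))
# A node within a few blocks of its peers is treated as synced.
print(is_lagging(1998, [2001, 1999]))
```

A small tolerance is needed because nodes that are fully synced still differ by a block or two at any given instant.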

@adizere
Member

adizere commented Oct 20, 2021

The only way, currently, to accurately gauge this is to compare block heights from multiple nodes.

Good to know!!

@adizere adizere added A: blocked Admin: blocked by another (internal/external) issue or PR I: dependencies Internal: related to dependencies labels Oct 20, 2021
@adizere
Member

adizere commented Oct 20, 2021

Marking as blocked until we get more insights. We'll perhaps arrive at a solution together with the tendermint-go team!

@adizere
Member

adizere commented Dec 21, 2021

This problem has been partly solved in PR #1349, which closed a related issue #1264.

Another solution to this may be to disable hermes from interacting with partially-synced chains.

It should no longer be possible for Account sequence issues to crop up because it was connected to partially-synced chains from a previously used wallet.

Currently, Hermes either:

  • detects that a chain is out of sync via the health checkup (namely, the parameter catching_up), and in that case it emits a clear warning that some functionality is impacted.
  • does not detect the out of sync problem, but encounters an account sequence mismatch error and has a built-in mechanism to recover (by retrying) from that, thanks to the work in Retry send_tx on account sequence mismatch #1349.
    • operators no longer need to restart Hermes, but they may need to restart their full node if Hermes is unable to make progress due to the account sequence mismatch error.
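
The retry behaviour described above can be pictured roughly as follows. This is only an illustrative sketch, not the implementation from #1349: `SequenceMismatch`, `send_with_retry`, the two callables, and the retry limit are all hypothetical names and values.

```python
from typing import Callable


class SequenceMismatch(Exception):
    """Hypothetical error raised when the chain rejects a tx for a wrong sequence."""


def send_with_retry(
    send_tx: Callable[[int], str],
    query_sequence: Callable[[], int],
    max_retries: int = 3,
) -> str:
    """Send a tx, re-querying the account sequence after each mismatch."""
    sequence = query_sequence()
    for attempt in range(max_retries + 1):
        try:
            return send_tx(sequence)
        except SequenceMismatch:
            if attempt == max_retries:
                # Out of retries: surface the error instead of looping forever.
                raise
            # Refresh from the node instead of reusing the cached value.
            sequence = query_sequence()
    raise AssertionError("unreachable")
```

If the full node keeps returning a stale sequence, every refresh yields the same wrong value and the retries run out, which matches the note that operators may still need to restart their full node.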

I will therefore close this issue, and we may consider reopening it if we still see problems in the future.

@adizere adizere closed this as completed Dec 21, 2021