501 fix occassional horizonresponse decodeerror #502

Merged: 16 commits, Apr 8, 2024

Conversation

Contributor

@b-yap b-yap commented Mar 29, 2024

closes #501

The error found was a 403 response and, as @ebma explained, it should be treated as recoverable.
I've also updated the HorizonResponseError structure so that it is flexible enough to hold either a ReqwestError or an error in string format.

The string format is important for identifying the actual error.
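
To make the idea concrete, here is a minimal sketch of an error type that can carry either a client-side failure or the response as text. It is not the code from this PR: `TransportError` is a hypothetical stand-in for `reqwest::Error` so the example compiles without external crates, and the variant names, the `status` field and `is_recoverable` are assumptions made for illustration.

```rust
use std::fmt;

/// Hypothetical stand-in for `reqwest::Error` (keeps the sketch dependency-free).
#[derive(Debug)]
pub struct TransportError(pub String);

/// Illustrative error type: either the HTTP client failed, or the server
/// answered and we keep its status and raw text for diagnosis.
#[derive(Debug)]
pub enum HorizonResponseError {
    /// The request itself failed (timeout, connection reset, ...).
    Transport(TransportError),
    /// A response arrived but could not be used as-is.
    Response { status: u16, body: String },
}

impl fmt::Display for HorizonResponseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HorizonResponseError::Transport(e) => write!(f, "transport error: {}", e.0),
            HorizonResponseError::Response { status, body } => {
                write!(f, "horizon responded with status {}: {}", status, body)
            }
        }
    }
}

impl HorizonResponseError {
    /// Assumption for this sketch: a 403 and client-side failures are
    /// worth retrying; every other status is treated as final.
    pub fn is_recoverable(&self) -> bool {
        match self {
            HorizonResponseError::Transport(_) => true,
            HorizonResponseError::Response { status, .. } => *status == 403,
        }
    }
}
```

Keeping the text alongside the status means a log line shows what the server actually sent, rather than only a generic decode failure.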

@b-yap b-yap marked this pull request as ready for review April 4, 2024 09:38
@b-yap b-yap requested a review from a team April 4, 2024 09:38
Member

@ebma ebma left a comment

Looks good to me 👍 I'm curious whether this fixes the problem; it's hard to tell. @b-yap did you happen to test this locally? These timeouts are probably hard to reproduce locally, as you would need to simulate some redeem requests.

Resolved review threads (outdated):
clients/wallet/src/horizon/responses.rs
clients/wallet/src/error.rs
Contributor Author

b-yap commented Apr 4, 2024

@ebma flagging this error as "recoverable" means the same transaction will be resent.
See https://github.com/pendulum-chain/spacewalk/blob/501-fix-occassional-horizonresponse-decodeerror/clients/wallet/src/horizon/horizon.rs#L167

Member

ebma commented Apr 4, 2024

Yes, the logic makes sense to me. I was asking whether you tested it because it would be interesting to see how this behaves in practice. If this error is indeed related to rate limiting, our logic here is not enough to make the transaction eventually go through: even if we retry the same transaction over and over, the rate limiting would continue to block it. If that is the case, which I don't know, then we should add some kind of delay here and wait for the rate-limiting window to expire.

Contributor Author

b-yap commented Apr 5, 2024

Testing yesterday was difficult because the accounts did not exist.

There is already a delay in the loop, driven by backoff_delay_counter:

    let sleep_duration = backoff_delay_counter * BASE_BACKOFF_DELAY_IN_SECS;
    sleep(Duration::from_secs(sleep_duration)).await;
    // retry/resubmit again
    if sleep_duration < u64::from(max_backoff_delay_in_secs) {
        backoff_delay_counter += 1;
    }

Will this slow down the CI? Possibly.
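
To see how the delay quoted above grows, the following sketch replays the same counter logic without sleeping and returns the sequence of delays. The function name, the starting counter value of 1, the `u32` type for the maximum, and the concrete numbers used below are assumptions for illustration; the real constants are not shown in this thread.

```rust
/// Replays the counter logic from the snippet above and returns the sleep
/// durations (in seconds) for the first `retries` attempts.
/// Assumption: the counter starts at 1; the real starting value is not shown.
pub fn backoff_schedule(base_delay_secs: u64, max_delay_secs: u32, retries: usize) -> Vec<u64> {
    let mut backoff_delay_counter: u64 = 1;
    let mut schedule = Vec::with_capacity(retries);
    for _ in 0..retries {
        let sleep_duration = backoff_delay_counter * base_delay_secs;
        schedule.push(sleep_duration);
        // Same condition as above: stop growing once the delay reaches the maximum.
        if sleep_duration < u64::from(max_delay_secs) {
            backoff_delay_counter += 1;
        }
    }
    schedule
}
```

With a hypothetical base of 2 seconds and a maximum of 10 seconds, seven retries wait 2, 4, 6, 8, 10, 10, 10 seconds. The growth is linear rather than exponential, and when the maximum is not a multiple of the base the last step can overshoot it (base 3, maximum 10 settles at 12).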

Member

ebma commented Apr 5, 2024

Ahhh true, we already have that delay in place 👍 That's great.

I found out that the Horizon servers apparently have IP-based rate limiting of 3600 requests per hour, i.e. one request per second on average.
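
The arithmetic behind that figure can be written down as a small helper. This is only a sketch: `average_interval` is a hypothetical name, and it says nothing about how Horizon actually enforces the limit within the hour.

```rust
use std::time::Duration;

/// Average spacing between requests that stays within an hourly limit.
/// Returns `None` when the limit is zero, i.e. no requests are allowed.
pub fn average_interval(limit_per_hour: u32) -> Option<Duration> {
    Duration::from_secs(3600).checked_div(limit_per_hour)
}
```

For the quoted limit, `average_interval(3600)` yields one second, so any retry delay of a second or more stays below the average rate for a single client on that IP.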

Member

@ebma ebma left a comment

Let's give these changes a try in production. In any case, they won't do any harm and will at least improve the decoding of some Horizon responses, even if we are not 100% sure whether submissions that returned a 403 error will eventually be accepted.

Member

ebma commented Apr 8, 2024

Only one integration test failed, so we can consider it a flaky test. I checked cargo clippy and rustfmt locally and there are no complaints. Merging.

@ebma ebma merged commit bed4453 into main Apr 8, 2024
1 check failed
@ebma ebma deleted the 501-fix-occassional-horizonresponse-decodeerror branch April 8, 2024 09:50
Linked issue: Fix occasional 'HorizonResponse: DecodeError'
2 participants