[BUG] Not recovering on L1 misbehaving #295

Open
xavier-romero opened this issue Feb 6, 2025 · 0 comments
Labels
bug Something isn't working

Bug Report

Description

cdk-node is unable to sequence after an L1 failure.

To Reproduce

https://github.com/0xPolygon/kurtosis-cdk/blob/l1_chaos/docs/mitm.md#l1-missbehaving
https://github.com/0xPolygon/kurtosis-cdk/blob/l1_chaos/scripts/mitm/test_l1_failures.sh

Expected behavior

cdk-node recovers and resumes sequencing once L1 resumes normal operation.

Environment (please complete the following information):

cdk-node RC4

Additional context

I am testing how reliably the stack handles L1 failures and issues. I tried many scenarios, and most of them end in the same situation: cdk-node (RC4 tested) is unable to sequence after an L1 failure until it is restarted, while showing no errors, and in some cases it is unable to recover at all. Since I'm using Kurtosis/Docker, it could be that what is really required is not a restart but removing the cache file (which is gone after the Docker container is restarted).
Scenarios tested with this behavior (each tested for 1 minute with a 25% failure ratio; normal L1 operation resumed after the test):

  • L1 returning HTTP error codes (401, 403, 404, 405, 429, 500, 502, 503, 504)
  • L1 returning no content at all (empty response)
  • L1 returning an empty JSON and/or empty "result" field
  • L1 returning arbitrary HTML or JSON (with the right content-type, which seems to be ignored)
  • L1 returning JSON content with a corrupted byte (one random byte changed)
  • L1 HTTP connection being closed before the answer is sent
  • Wrong L1 endpoint set, for instance, setting the L2 URL instead of the L1 URL
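
To make the failure modes above concrete, here is a minimal Python sketch of how a client or test harness might classify a single L1 JSON-RPC response as usable or not. This is not taken from cdk-node or from the linked mitm script: the function name `classify_l1_response` and its labels are hypothetical. It also assumes that a missing or empty "result" always counts as a failure, which a real client may not want, since some JSON-RPC methods legitimately return null. A connection closed before the answer would surface as an exception in the HTTP client and is not modeled here.

```python
import json


def classify_l1_response(status: int, body: bytes) -> str:
    """Return "ok" or a short label naming why the response is unusable.

    Hypothetical helper for illustration only; labels are assumptions.
    """
    # HTTP error codes such as 401, 403, 404, 405, 429, 500, 502, 503, 504.
    if status != 200:
        return "http_error"
    # No content at all (empty response).
    if not body.strip():
        return "empty_body"
    # Arbitrary HTML or a corrupted byte both fail to parse as JSON.
    # UnicodeDecodeError and json.JSONDecodeError are ValueError subclasses.
    try:
        payload = json.loads(body)
    except ValueError:
        return "invalid_json"
    # Valid JSON that is not a JSON-RPC response object.
    if not isinstance(payload, dict):
        return "unexpected_json"
    # Empty JSON object, arbitrary JSON, or an empty "result" field.
    # Assumption: null/empty results are treated as failures here.
    if payload.get("result") in (None, "", [], {}):
        return "missing_or_empty_result"
    # Additional unexpected fields are tolerated.
    return "ok"
```

A caller could retry with backoff on any label other than "ok" instead of caching or acting on the bad response; whether that matches the root cause of this issue is not established here.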

To add something positive, everything works fine when the L1 answer includes additional (unexpected) JSON fields. 💯

Fully automated testing can be done locally with Kurtosis by executing a single script.
