Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(engine api): [3/3] Engine API retryWithTimeout #2490

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

shotes
Copy link
Contributor

@shotes shotes commented Feb 7, 2025

This PR introduces retries to engine_newPayload and engine_forkchoiceUpdated calls.

Additionally, remove one of the return values from NotifyForkchoiceUpdate that was never used.

This PR introduces a third party go port of google-http-client ExponentialBackOff.java. I have read through all of the code that the third party package implements. Please verify that package as well.

@shotes shotes requested a review from a team as a code owner February 7, 2025 00:08
@shotes shotes changed the base branch from main to deposit-sync-force-startup-fcu February 7, 2025 00:08
Copy link

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 0% with 151 lines in your changes missing coverage. Please review.

Project coverage is 32.65%. Comparing base (f09c436) to head (62b5838).

Files with missing lines Patch % Lines
execution/engine/engine.go 0.00% 147 Missing ⚠️
payload/builder/payload.go 0.00% 2 Missing ⚠️
beacon/blockchain/execution_engine.go 0.00% 1 Missing ⚠️
beacon/blockchain/payload.go 0.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2490      +/-   ##
==========================================
- Coverage   32.77%   32.65%   -0.12%     
==========================================
  Files         351      351              
  Lines       15857    15915      +58     
  Branches       20       20              
==========================================
  Hits         5197     5197              
- Misses      10276    10334      +58     
  Partials      384      384              
Files with missing lines Coverage Δ
beacon/blockchain/execution_engine.go 0.00% <0.00%> (ø)
beacon/blockchain/payload.go 0.00% <0.00%> (ø)
payload/builder/payload.go 20.80% <0.00%> (ø)
execution/engine/engine.go 0.00% <0.00%> (ø)

Copy link
Contributor

@fridrik01 fridrik01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shotes I checked out this branch (to check whether it resolved some syncing issue I had locally) and I observed that now beacond hangs when it receives SIGTERM if its currently inside a retry. It requires to either wait up to 5 minutes or send it a SIGKILL to exit. This behavior was not in deposit-sync-force-startup-fcu branch.

EDIT: I think the main reason is that the context passed to NotifyForkchoiceUpdate is not properly cancelled. Let me know if you want me to take a look at that, I can push a separate PR on top of this one, should have time this weekend.

@shotes
Copy link
Contributor Author

shotes commented Feb 7, 2025

@shotes I checked out this branch (to check whether it resolved some syncing issue I had locally) and I observed that now beacond hangs when it receives SIGTERM if its currently inside a retry. It requires to either wait up to 5 minutes or send it a SIGKILL to exit. This behavior was not in deposit-sync-force-startup-fcu branch.

EDIT: I think the main reason is that the context passed to NotifyForkchoiceUpdate is not properly cancelled. Let me know if you want me to take a look at that, I can push a separate PR on top of this one, should have time this weekend.

@fridrik01 I am seeing this issue as well, having to specifically kill the process each time I restart. Just pushed up changes that uses a different lib, still seeing the same issue. Investigating now - feel free to test and push any fixes.

@fridrik01
Copy link
Contributor

@shotes I checked out this branch (to check whether it resolved some syncing issue I had locally) and I observed that now beacond hangs when it receives SIGTERM if its currently inside a retry. It requires to either wait up to 5 minutes or send it a SIGKILL to exit. This behavior was not in deposit-sync-force-startup-fcu branch.
EDIT: I think the main reason is that the context passed to NotifyForkchoiceUpdate is not properly cancelled. Let me know if you want me to take a look at that, I can push a separate PR on top of this one, should have time this weekend.

@fridrik01 I am seeing this issue as well, having to specifically kill the process each time I restart. Just pushed up changes that uses a different lib, still seeing the same issue. Investigating now - feel free to test and push any fixes.

@shotes I pushed #2496 which addresses hanging on SIGTERM, however it unwraps an issue it seems in cometbft where it panics on a cancelled context. Need to figure out what is going on there, maybe cometbft does not support ctx cancellation

Base automatically changed from deposit-sync-force-startup-fcu to main February 19, 2025 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants