Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Limit concurrent gethash in getatxs #5442

Closed
wants to merge 8 commits into from

Conversation

poszu
Copy link
Contributor

@poszu poszu commented Jan 15, 2024

Motivation

Fetcher::GetAtxs() might spawn tens (hundreds) of concurrent get hash requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

Changes

  • use a semaphore as a request limiter to limit the number of concurrent Fetcher::getHash() for ATX sync,
  • added a pending hash requests gauge metric,
  • cleaned up unused stuff in fetcher/handler.go

Copy link

codecov bot commented Jan 15, 2024

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (27d8fab) 77.4% compared to head (1aadc91) 77.6%.
Report is 7 commits behind head on develop.

Files Patch % Lines
fetch/mesh_data.go 66.6% 9 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##           develop   #5442     +/-   ##
=========================================
+ Coverage     77.4%   77.6%   +0.1%     
=========================================
  Files          265     266      +1     
  Lines        30889   30955     +66     
=========================================
+ Hits         23936   24025     +89     
+ Misses        5432    5406     -26     
- Partials      1521    1524      +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@poszu poszu marked this pull request as ready for review January 15, 2024 15:07
Copy link
Contributor

@dshulyak dshulyak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it is more common to use channels in golang for this. perhaps it will be slightly more efficient with semaphore library. one approach would be to create a channel with N items, fill it with that number of tokens, and then control concurrency by blocking on token read.

the small advantage would be that this pattern is selectable, so can be easily interrupted

var eg errgroup.Group
var errs error
var mu sync.Mutex
for _, hash := range hashes {
for i, hash := range hashes {
if err := options.limiter.Acquire(ctx, 1); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is interruptible as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes Acquire blocks until a slot is available. ctx allows an early cancellation. I would also expect Acquire to return ctx.Err() as its error when the context is done.

@fasmat
Copy link
Member

fasmat commented Jan 15, 2024

i think it is more common to use channels in golang for this. perhaps it will be slightly more efficient with semaphore library. one approach would be to create a channel with N items, fill it with that number of tokens, and then control concurrency by blocking on token read.

I think the sempahore package already existed before context became part of the standard library. I don't mind having a dependency on golang.org/x/, they are essentially an official extension to the standard library and well tested, documented and regularly updated.

It has a nice API and itself only depends on the standard library. It would also allow to weight certain queries more strongly than others (i.e. when certain requests would be more costly than others).

@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,

## Test Plan
TODO
@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

bors cancel

@spacemesh-bors
Copy link

Canceled.

@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go
@spacemesh-bors
Copy link

Build failed:

@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

Bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go
@spacemesh-bors
Copy link

Build failed:

@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go
@spacemesh-bors
Copy link

Build failed:

@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

network errors in grpc streams - retrying

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go
@spacemesh-bors
Copy link

Build failed:

@poszu
Copy link
Contributor Author

poszu commented Jan 16, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go
@spacemesh-bors
Copy link

Pull request successfully merged into develop.

Build succeeded:

@spacemesh-bors spacemesh-bors bot changed the title Limit concurrent gethash in getatxs [Merged by Bors] - Limit concurrent gethash in getatxs Jan 16, 2024
@spacemesh-bors spacemesh-bors bot closed this Jan 16, 2024
@spacemesh-bors spacemesh-bors bot deleted the limit-concurrent-gethash-in-getatxs branch January 16, 2024 15:24
poszu added a commit that referenced this pull request Jan 16, 2024
## Motivation
`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes
- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants