[Merged by Bors] - Limit concurrent gethash in getatxs #5442

poszu · 2024-01-15T11:46:11Z

Motivation

`Fetcher::GetAtxs()` might spawn tens or even hundreds of concurrent _get hash_ requests, and all responses are queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

Changes

- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` calls for ATX sync,
- added a _pending hash requests_ gauge metric,
- cleaned up unused stuff in fetcher/handler.go

codecov · 2024-01-15T12:00:15Z

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (27d8fab) 77.4% compared to head (1aadc91) 77.6%.
Report is 7 commits behind head on develop.

| Files | Patch % | Lines |
|---|---|---|
| fetch/mesh_data.go | 66.6% | 9 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff            @@
##           develop   #5442     +/-   ##
=========================================
+ Coverage     77.4%   77.6%   +0.1%     
=========================================
  Files          265     266      +1     
  Lines        30889   30955     +66     
=========================================
+ Hits         23936   24025     +89     
+ Misses        5432    5406     -26     
- Partials      1521    1524      +3

dshulyak

i think it is more common to use channels in golang for this. perhaps it will be slightly more efficient with semaphore library. one approach would be to create a channel with N items, fill it with that number of tokens, and then control concurrency by blocking on token read.

the small advantage would be that this pattern is selectable, so can be easily interrupted

dshulyak · 2024-01-15T15:15:24Z

fetch/mesh_data.go

 	var eg errgroup.Group
 	var errs error
 	var mu sync.Mutex
-	for _, hash := range hashes {
+	for i, hash := range hashes {
+		if err := options.limiter.Acquire(ctx, 1); err != nil {


looks like this is interruptible as well

Yes, `Acquire` blocks until a slot is available, and `ctx` allows early cancellation. I would also expect `Acquire` to return `ctx.Err()` as its error when the context is done.

fasmat · 2024-01-15T15:47:42Z

i think it is more common to use channels in golang for this. perhaps it will be slightly more efficient with semaphore library. one approach would be to create a channel with N items, fill it with that number of tokens, and then control concurrency by blocking on token read.

I think the semaphore package already existed before context became part of the standard library. I don't mind having a dependency on golang.org/x/; these packages are essentially an official extension to the standard library, and they are well tested, documented and regularly updated.

It has a nice API and itself only depends on the standard library. It would also allow us to weight certain queries more heavily than others (i.e. when certain requests would be more costly than others).

poszu · 2024-01-16T09:56:48Z

bors merge

## Motivation

`Fetcher::GetAtxs()` might spawn tens (hundreds) of concurrent _get hash_ requests and all responses will be queued up in the ATX validator callback. There should be some backpressure to avoid querying more ATXs at one time than we can reasonably handle.

## Changes

- use a semaphore as a request limiter to limit the number of concurrent `Fetcher::getHash()` for ATX sync,
- added a _pending hash requests_ gauge metric,

## Test Plan

TODO

poszu · 2024-01-16T09:58:41Z

bors cancel

spacemesh-bors · 2024-01-16T09:58:43Z

Canceled.

poszu · 2024-01-16T10:01:31Z

bors merge

spacemesh-bors · 2024-01-16T10:21:01Z

Build failed:

ci-status

poszu · 2024-01-16T10:43:23Z

Bors merge

spacemesh-bors · 2024-01-16T11:34:59Z

Build failed:

systest-status

poszu · 2024-01-16T11:42:49Z

bors merge

spacemesh-bors · 2024-01-16T12:31:41Z

Build failed:

systest-status

poszu · 2024-01-16T12:48:55Z

network errors in grpc streams - retrying

bors merge

spacemesh-bors · 2024-01-16T13:36:52Z

Build failed:

ci-status

poszu · 2024-01-16T13:44:56Z

bors merge

spacemesh-bors · 2024-01-16T15:24:09Z

Pull request successfully merged into develop.

Build succeeded:

poszu added 3 commits January 15, 2024 11:46

Remove unused stuff from Fetcher

ee08a69

Limit concurrent calls to getHash() in GetAtxs()

1576d1b

Satisfy linter

8c9f131

Limit requests for ATXs globally

859c25d

poszu marked this pull request as ready for review January 15, 2024 15:07

poszu requested review from dshulyak, fasmat and ivan4th as code owners January 15, 2024 15:07

dshulyak approved these changes Jan 15, 2024

dshulyak reviewed Jan 15, 2024

Fix decreasing pending reqs gauge

932ac82

fasmat approved these changes Jan 15, 2024

poszu added 3 commits January 16, 2024 10:26

UT for limiting concurrent requests for ATXs

1403931

Fix UTs

805309f

Satisfy linter

1aadc91

spacemesh-bors bot changed the title ~~Limit concurrent gethash in getatxs~~ [Merged by Bors] - Limit concurrent gethash in getatxs Jan 16, 2024

spacemesh-bors bot closed this Jan 16, 2024

spacemesh-bors bot deleted the limit-concurrent-gethash-in-getatxs branch January 16, 2024 15:24

poszu mentioned this pull request Jan 16, 2024

[backport] Limit concurrent requests for ATXs #5446

Merged
