
fix: bitswap lock contention under high load #817

Merged · 22 commits · Jan 31, 2025
Conversation

@gammazero (Contributor) commented on Jan 27, 2025

Summary

Fix runaway goroutine creation under high load. Under high load conditions, goroutines are created faster than they can complete, and the more goroutines that are created, the slower they complete. This creates a positive feedback cycle that ends in OOM. The fix dynamically adjusts message send scheduling to avoid the runaway condition.

Description of Lock Contention under High Load

The peermanager acquires the peermanager mutex, does its own bookkeeping, and then acquires the messagequeue mutex for each peer to put wants/cancels on that peer's message queue. Nothing is blocked indefinitely, but all session goroutines wait on the peermanager mutex.

The messagequeue event loop for each peer is always running in a separate goroutine, waking up every time new data is added to the message queue. The messagequeue acquires the messagequeue mutex to check the amount of pending work and send a message if there is enough work.

The frequent lock/unlock of each messagequeue mutex delays each session goroutine from adding items to message queues, as they wait to acquire each peer's messagequeue mutex to enqueue a message. These delays cause the peermanager mutex to be held longer by each goroutine. When there are enough peers and want requests, goroutines end up waiting on the peermanager mutex for longer, on average, than it takes for an additional request to arrive and start another goroutine. This leads to a positive feedback loop in which the number of goroutines grows until their number alone is enough to cause OOM.
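
To make the contention pattern concrete, here is a minimal, hedged Go sketch of the pre-fix locking shape described above. All package, type, field, and method names are illustrative assumptions, not the actual boxo identifiers.

```go
// Identifiers below are illustrative, not the real boxo types.
package bitswapsketch

import "sync"

type messageQueue struct {
	mu      sync.Mutex
	pending []string // simplified stand-in for queued wants/cancels
}

type peerManager struct {
	mu     sync.Mutex
	queues map[string]*messageQueue // keyed by peer ID
}

// broadcastWants shows why the contention compounds: the peermanager mutex
// stays held while each peer's messagequeue mutex is acquired in turn, so any
// delay in taking a queue mutex extends how long every other session
// goroutine waits to enter this function.
func (pm *peerManager) broadcastWants(wants []string) {
	pm.mu.Lock()
	defer pm.mu.Unlock()
	for _, mq := range pm.queues {
		mq.mu.Lock() // contends with that peer's always-running event loop
		mq.pending = append(mq.pending, wants...)
		mq.mu.Unlock()
	}
}
```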

How This PR Fixes It

This PR avoids waking up the messagequeue event loop on every item added to the message queue, thus avoiding the high-frequency messagequeue mutex lock/unlock. Instead, the event loop wakes up after a delay, sends the accumulated work, then goes back to sleep for another delay. During the delay, wants and cancels accumulate. This lets the session goroutines add items to message queues without contending with the messagequeue event loop for the messagequeue mutex.
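
A minimal sketch of that batching idea, assuming illustrative names (messageQueue, enqueue, runEventLoop) rather than the real boxo identifiers: session goroutines only append under the mutex, and the event loop drains the queue once per delay interval instead of on every item.

```go
// Hedged sketch of the batching idea; identifiers are assumptions, not the
// actual boxo code.
package bitswapsketch

import (
	"sync"
	"time"
)

type messageQueue struct {
	mu      sync.Mutex
	pending []string // accumulated wants/cancels (simplified)
}

// enqueue is called by session goroutines. It does not signal the event loop;
// it only appends under the mutex, so the lock is held very briefly.
func (mq *messageQueue) enqueue(item string) {
	mq.mu.Lock()
	mq.pending = append(mq.pending, item)
	mq.mu.Unlock()
}

// runEventLoop wakes once per delay, drains whatever has accumulated, sends
// it, and goes back to sleep, instead of locking the mutex on every new item.
func (mq *messageQueue) runEventLoop(done <-chan struct{}, delay time.Duration, send func([]string)) {
	timer := time.NewTimer(delay)
	defer timer.Stop()
	for {
		select {
		case <-timer.C:
			mq.mu.Lock()
			work := mq.pending
			mq.pending = nil
			mq.mu.Unlock()
			if len(work) > 0 {
				send(work)
			}
			timer.Reset(delay)
		case <-done:
			return
		}
	}
}
```

The key point is that the messagequeue mutex is now taken once per delay interval by the event loop rather than once per enqueued item, so session goroutines rarely block on it.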

The delay adjusts dynamically, between 20ms and 1 second, based on the number of peers. The delay per peer is configurable, with a default of 1/8 millisecond (125µs).
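
As a rough illustration of that scaling (the constant and function names below are assumptions; only the 20ms/1s bounds and the 125µs default come from the description above), the delay can be computed as peers × per-peer delay, clamped to the bounds:

```go
package main

import (
	"fmt"
	"time"
)

const (
	minSendDelay        = 20 * time.Millisecond  // lower bound from the PR description
	maxSendDelay        = time.Second            // upper bound from the PR description
	defaultDelayPerPeer = 125 * time.Microsecond // 1/8 ms per peer (default)
)

// sendDelay scales the per-peer delay by the number of peers and clamps the
// result to [minSendDelay, maxSendDelay].
func sendDelay(numPeers int, perPeer time.Duration) time.Duration {
	d := time.Duration(numPeers) * perPeer
	if d < minSendDelay {
		return minSendDelay
	}
	if d > maxSendDelay {
		return maxSendDelay
	}
	return d
}

func main() {
	for _, n := range []int{10, 500, 10000} {
		fmt.Printf("%5d peers -> %v\n", n, sendDelay(n, defaultDelayPerPeer))
	}
}
```

With the default per-peer delay, 10 peers clamp to the 20ms floor, 500 peers yield 62.5ms, and 10,000 peers hit the 1-second ceiling.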


codecov bot commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 85.18519% with 12 lines in your changes missing coverage. Please review.

Project coverage is 60.48%. Comparing base (62512fb) to head (87ad4a5).
Report is 1 commit behind head on main.

Files with missing lines                               Patch %   Lines
...tswap/client/internal/messagequeue/messagequeue.go  87.50%    7 Missing and 1 partial ⚠️
bitswap/client/client.go                               42.85%    4 Missing ⚠️

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #817      +/-   ##
==========================================
- Coverage   60.49%   60.48%   -0.01%     
==========================================
  Files         244      244              
  Lines       31079    31100      +21     
==========================================
+ Hits        18800    18810      +10     
- Misses      10603    10615      +12     
+ Partials     1676     1675       -1     
Files with missing lines                               Coverage Δ
bitswap/message/message.go                             84.42% <100.00%> (+0.55%) ⬆️
bitswap/client/client.go                               87.58% <42.85%> (+0.96%) ⬆️
...tswap/client/internal/messagequeue/messagequeue.go  85.42% <87.50%> (-0.06%) ⬇️

... and 14 files with indirect coverage changes

@lidel (Member) left a comment

For posterity, the staging box looks really promising: during the window when this was deployed to box 02, it was in significantly better shape than kubo 0.32.1 (box 01):

[screenshots: box 01 vs box 02 staging metrics]

HTTP success rate is higher too:

[screenshot: HTTP success rate comparison]

EOD for me, but I'll do more tests tomorrow morning and see if any questions arise. Some quick ones inline.

@lidel changed the title from "Bs experiment" to "[WIP] fix: bitswap lock contention under high load" on Jan 29, 2025
@gammazero changed the title from "[WIP] fix: bitswap lock contention under high load" to "fix: bitswap lock contention under high load" on Jan 30, 2025
@gammazero marked this pull request as ready for review on January 30, 2025 at 19:40
@gammazero requested a review from a team as a code owner on January 30, 2025 at 19:40
@lidel (Member) left a comment

LGTM, this is such an improvement that we should ship it as a patch release next week.

For posterity: based on our (Shipyard) staging tests the impact on high load providers is significant.

Below is a sample from an HTTP gateway processing ~80 requests per second (mirrored organic cache-miss traffic from ipfs.io). "01" is the latest Kubo (0.33.0) without this fix, and "02" is with this fix (0.33.1):

[screenshots: kubo-with-boxo-goroutine-fix, kubo-rsp-rate-bs-fix]

Review comment on bitswap/client/client.go (resolved)