GCP PubSub Input Performance #35029
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
Closed #37657, which was aimed at creating multiple pubsub clients rather than beat pipeline clients. Having multiple beat pipeline clients helps in reducing lock contention, but as seen in the attached mutex profiles in the PR, multiple pubsub clients don't really reduce it. The solution we need is similar to AWS S3's SQS event processor: beats/x-pack/filebeat/input/awss3/sqs_s3_event.go Lines 310 to 311 in 5b24b7d
Here, the S3 input creates 1 pipeline client for each SQS message to process all S3 events within that SQS message.
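To illustrate the pattern being referenced, here is a minimal sketch of a per-message pipeline client in a pubsub handler, modelled on the awss3 input's per-SQS-message client. The handleMessages helper, the batch parameter, and the event fields are assumptions made for the example, not code from the input.

```go
package main

import (
	"time"

	"cloud.google.com/go/pubsub"
	"github.com/elastic/beats/v7/libbeat/beat"
	"github.com/elastic/elastic-agent-libs/mapstr"
)

// handleMessages connects one pipeline client for a batch of pubsub messages,
// publishes the events derived from that batch, and closes the client when
// done, mirroring the per-SQS-message client in the awss3 input.
func handleMessages(pipeline beat.Pipeline, msgs []*pubsub.Message) error {
	client, err := pipeline.Connect()
	if err != nil {
		return err
	}
	defer client.Close()

	for _, msg := range msgs {
		client.Publish(beat.Event{
			Timestamp: time.Now(),
			Fields:    mapstr.M{"message": string(msg.Data)}, // illustrative mapping only
		})
	}
	return nil
}
```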
Hey @andrewkroh, I tried the 2 variations which we discussed. As per the results below, there is no performance improvement observed in either variation. Both variations are taken from this base commit, which is close to v8.14.0.

Variation 1: adding pipeline clients inside an array (v8.14.0...kcreddy:beats:variation1-array, 2 files changed)
Results:

Variation 2: using sync.Pool (v8.14.0...kcreddy:beats:variation2-syncpool, 1 file changed); a rough sketch of this shape is shown after this comment.
Results:
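For context, the following is a minimal sketch of the general shape of the sync.Pool variation. It is not the actual diff in the kcreddy:variation2-syncpool branch; clientPool and its methods are invented names for illustration.

```go
package main

import (
	"sync"

	"github.com/elastic/beats/v7/libbeat/beat"
)

// clientPool hands out beat.Clients via sync.Pool so that concurrent message
// handlers don't all serialize on a single client's publish lock.
type clientPool struct {
	pipeline beat.Pipeline
	pool     sync.Pool
}

// getClient returns a pooled client, connecting a new one when the pool is
// empty. Note that sync.Pool puts no upper bound on how many clients get
// created and offers no hook to Close() idle clients, which is one reason
// this approach is a poor fit (see the follow-up comment).
func (p *clientPool) getClient() (beat.Client, error) {
	if c, ok := p.pool.Get().(beat.Client); ok && c != nil {
		return c, nil
	}
	return p.pipeline.Connect()
}

// putClient returns a client to the pool for reuse.
func (p *clientPool) putClient(c beat.Client) {
	p.pool.Put(c)
}
```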
The contention for the lock within the beat.Client depends on how much processing happens while publishing. If there is little edge processing, then optimizing to avoid contention on the publish lock probably isn't worthwhile; the sync.Pool experiment is showing that. And if we did pursue this, we would need to use something other than sync.Pool to provide a pool of clients, because we would want some upper bound on the number of clients and we need to be able to close them.

Let's verify that we can achieve 20k EPS with no code changes, and record the settings we used. We can refer to those in the future if we are doing tuning.
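As a concrete illustration of "something other than sync.Pool", a bounded pool can be built on a buffered channel, which gives both an upper bound on the number of clients and a way to close them all on shutdown. This is only a sketch of that idea under assumed names (boundedClientPool and its methods), not code from the input.

```go
package main

import (
	"errors"

	"github.com/elastic/beats/v7/libbeat/beat"
)

// boundedClientPool holds at most `size` beat.Clients. Unlike sync.Pool it
// caps how many clients exist and can Close() all of them on shutdown.
type boundedClientPool struct {
	clients chan beat.Client
}

func newBoundedClientPool(pipeline beat.Pipeline, size int) (*boundedClientPool, error) {
	p := &boundedClientPool{clients: make(chan beat.Client, size)}
	for i := 0; i < size; i++ {
		c, err := pipeline.Connect()
		if err != nil {
			p.Close()
			return nil, err
		}
		p.clients <- c
	}
	return p, nil
}

// Get blocks until a client is available, so at most `size` publishers run.
func (p *boundedClientPool) Get() beat.Client { return <-p.clients }

// Put returns a client so another goroutine can use it.
func (p *boundedClientPool) Put(c beat.Client) { p.clients <- c }

// Close closes every client in the pool. It must only be called once all
// publishers have returned their clients and stopped.
func (p *boundedClientPool) Close() error {
	var errs []error
	close(p.clients)
	for c := range p.clients {
		errs = append(errs, c.Close())
	}
	return errors.Join(errs...)
}
```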
I have run a few more tests on the existing filebeat (no code changes) to check if we can reach much higher throughput just by tuning the existing settings. The following tests are run with:
So, just by tuning the existing settings we can reach higher throughput without code changes. Due to the lack of throughput improvement from the multiple-client variations, optimising only for contention isn't worth it, and this investigation into adding multiple outlet (pipeline) clients won't be pursued further.
GCP Pubsub input has certain bottlenecks which need to be addressed:
- EventNormalization
- beat.Client becomes a bottleneck because the Publish() call acquires a lock. Keeping multiple beat.Clients in a pool for each pubsub input instance (like one client per configured num_goroutines) would be a similar change to the AWS-S3 input, which massively increased input performance. Another option is multiple receivers (Receive()), each with their own beat.Client (see the sketch after this list).
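To make the "one client per configured num_goroutines, each with their own beat.Client" idea concrete, here is a rough sketch that assumes received messages are fanned out to worker goroutines over a channel. The runWorkers function, the channel-based fan-out, and the early Ack are illustrative assumptions, not how the input is actually structured (real code would only ack after the pipeline acknowledges the event).

```go
package main

import (
	"context"
	"sync"
	"time"

	"cloud.google.com/go/pubsub"
	"github.com/elastic/beats/v7/libbeat/beat"
	"github.com/elastic/elastic-agent-libs/mapstr"
)

// runWorkers starts one goroutine per configured num_goroutines, each owning
// its own beat.Client, so concurrent Publish() calls no longer contend on a
// single client's lock. Messages arrive on msgs, fanned out from Receive().
func runWorkers(ctx context.Context, pipeline beat.Pipeline, msgs <-chan *pubsub.Message, numGoroutines int) error {
	var wg sync.WaitGroup
	for i := 0; i < numGoroutines; i++ {
		client, err := pipeline.Connect()
		if err != nil {
			return err
		}
		wg.Add(1)
		go func(client beat.Client) {
			defer wg.Done()
			defer client.Close()
			for {
				select {
				case <-ctx.Done():
					return
				case msg, ok := <-msgs:
					if !ok {
						return
					}
					client.Publish(beat.Event{
						Timestamp: time.Now(),
						Fields:    mapstr.M{"message": string(msg.Data)},
					})
					msg.Ack() // simplification: ack immediately for the sketch
				}
			}
		}(client)
	}
	wg.Wait()
	return nil
}
```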