An attempt to optimize AsyncConsumerWorkService.WorkPool dispatch loop #352
Conversation
Adding a BlockingCollection to a code path that is supposed to be async defeats the purpose of having an async code path. This PR means the thread is blocked waiting for the collection.
What problem does this PR solve? I don't think the linked comment answers this question.
@YulerB has mentioned an improvement to the given class. A BlockingCollection would be a good alternative to the Task.Delay and while loop, which holds the current thread inefficiently. So basically this PR will speed up consumers.
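For orientation, here is a minimal sketch of the shape being proposed; WorkItem and BlockingWorkPool are hypothetical stand-ins for the client's actual types, not its real code:

    using System;
    using System.Collections.Concurrent;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical stand-in for the client's Work type.
    class WorkItem { public Action Execute { get; set; } }

    class BlockingWorkPool
    {
        readonly BlockingCollection<WorkItem> _queue = new BlockingCollection<WorkItem>();
        readonly CancellationTokenSource _cts = new CancellationTokenSource();

        public void Start()
        {
            // The dispatch loop: GetConsumingEnumerable blocks until an item
            // arrives or the token is cancelled; no polling, no Task.Delay.
            Task.Factory.StartNew(() =>
            {
                try
                {
                    foreach (var item in _queue.GetConsumingEnumerable(_cts.Token))
                        item.Execute();
                }
                catch (OperationCanceledException) { /* stop requested */ }
            }, TaskCreationOptions.LongRunning);
        }

        public void Enqueue(WorkItem item) => _queue.Add(item);

        public void Stop() => _cts.Cancel();
    }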
Is there hard evidence of this? Benchmarks that were used to demonstrate the improvement (and the workload tested)?
The BlockingCollection isn't blocking the asynchrony; it pauses execution when the queue is empty. The code is greatly simplified by this PR. It uses fewer op-codes in user code. It's more readable; it's now something people can understand. The original code took a long time to mentally check, and it may have unforeseen bugs relating to the try methods failing that I, for one, cannot follow. This PR reduces the cyclomatic complexity of the method. BlockingCollection was designed for this purpose. As for performance, I'm sure it's going to win out. A lot of the code is smart, but apply the KISS principle where possible. And it's cheaper for all of us to leverage the framework rather than implement our own.
@YulerB @vendre21 so are there any benchmarks or profiling data that your team used to come up with this and can share? I can see the argument that this simplifies the code.
Please see the bench code attached. I had to pull the code out of the client to test. Batch script commands run to collect results:

    C:\>perfbenchworkpool new single
    C:\>perfbenchworkpool new multiple
    C:\>perfbenchworkpool old single
    C:\>perfbenchworkpool old multiple

Each is run separately to get accurate results, since there seems to be a memory leak in the old version due to excessive locking.
What we see is that when Stop is called, the current code doesn't stop until the queue is empty. The updated code will stop as soon as the cancellation token is cancelled.
If it’s a bug, put this in the while loop:
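(The snippet itself was not captured in this transcript; presumably it is a cancellation check of roughly this shape, shown here only as a guess, not the author's actual code:)

    // Hypothetical reconstruction: bail out of the dispatch loop as soon
    // as cancellation is requested instead of draining the queue first.
    if (cancellationToken.IsCancellationRequested) break;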
Updated bench and results. I realize the BlockingCollection is slightly slower now after fixing the bug in the old while loop, but check out how many items got processed.
@YulerB we intentionally process outstanding consumer operations before stopping. Not doing so will be considered a bug by some. With automatic acknowledgements this can effectively lose/ignore some deliveries, which is not necessarily a major issue given the "fire and forget" nature of that mode, but it would still be a surprising breaking change. With manual acknowledgements it can result in a certain number of delivered messages being requeued, which is fine. So I'd really like to preserve the current behavior.
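For what it's worth, a BlockingCollection-based loop could preserve this drain-on-stop behavior too: completing the collection instead of cancelling the token lets the consuming loop finish the items already queued and then end. A sketch, reusing the hypothetical BlockingWorkPool from above:

    public void StopAfterDraining()
    {
        // No new items can be added after this call; GetConsumingEnumerable
        // keeps yielding the items already queued, then completes, which
        // matches the current "process outstanding operations first" semantics.
        _queue.CompleteAdding();
    }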
@YulerB averages and min/max values are not ideal in benchmarks; we switched PerfTest to use median/95th/99th percentiles earlier this year, for example. I'd also consider increasing the number of iterations to, say, 10M or 100M to let the benchmark run longer. Thank you for producing it, by the way!
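For reference, reporting percentiles instead of averages only takes a few lines; a minimal nearest-rank version (illustrative, not what PerfTest does internally):

    using System;
    using System.Collections.Generic;

    static class Percentiles
    {
        // Nearest-rank percentile over an already-sorted sample list.
        public static double At(List<double> sorted, double p)
        {
            int rank = (int)Math.Ceiling(p / 100.0 * sorted.Count);
            return sorted[Math.Max(0, rank - 1)];
        }
    }

    // samples.Sort();
    // var median = Percentiles.At(samples, 50);
    // var p95    = Percentiles.At(samples, 95);
    // var p99    = Percentiles.At(samples, 99);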
@bording @danielmarbach so we seem to "win some, lose some" here. WDYT of this proposal? Blocking collection use aside, I'd agree that it is a certain simplification. Are there any NServiceBus benchmarks you can run to compare the two approaches?
@michaelklishin I don't understand how simplification of the code is an argument here, and unfortunately the blocking collection cannot just be left aside in the argument we are having. When you look at concurrency and parallelism, how you are using the thread pool is essential. Every thread we can free up is worth freeing up, because during that time it can work on other things, and we reduce the chances of hill-climbing in the thread pool (i.e., ramping up threads).

With the while loop in combination with the delay, we might pay for a context switch, but we free up the thread when the queue is empty. The blocking collection, on the other hand, internally uses wait handles that block the thread entirely, plus some non-trivial cancellation token source linking that has other memory implications. When the blocking collection is empty, the thread is blocked and cannot be used. While this might not be an issue from a single consumer thread's perspective, it becomes an issue when you look at the process as a whole, potentially containing multiple such consumers.

I cannot judge whether you as maintainers are willing to make those trade-offs for the argument of simplicity. I would not.
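Roughly, the contrast being drawn here, using the hypothetical WorkItem type from the earlier sketch:

    // Existing style: the thread returns to the pool while awaiting.
    static async Task DispatchLoopAsync(ConcurrentQueue<WorkItem> queue, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            while (queue.TryDequeue(out var item))
                item.Execute();
            // No thread is held here; cancellation surfaces as an exception.
            await Task.Delay(50, token);
        }
    }

    // PR style: the calling thread is parked on a wait handle while empty.
    static void DispatchLoopBlocking(BlockingCollection<WorkItem> queue, CancellationToken token)
    {
        foreach (var item in queue.GetConsumingEnumerable(token))
            item.Execute();
    }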
@danielmarbach thanks for your feedback. I'd like to take a closer look at where this collection is used, since most concurrent systems are snowflakes, even though they are built from common primitives. I have a couple of things I'd like to clarify. In this PR, a blocking collection is used in the consumer work service to "batch" pending operation dispatch in a loop. Can that thread be reused for something else in practice? Good question; I assume the answer is "yes", but I don't know enough about the .NET runtime or TPL task scheduling.

Currently this PR doesn't seem to be an obvious improvement even with 5 consumers: you win some on some metrics, you lose some in terms of latency. @YulerB can you please add a few more versions of your benchmark that have 100, 250, 500 and 1000 "consumers" (tasks) sharing a pool? Those numbers may seem unusual, but on a system with 16, 32 or a greater number of cores, having that many consumers no longer seems crazy. I expect the results can be quite different.

Optimizing for a single consumer is not something our team usually does (even though we see arguments for that from time to time), both in client libraries and in RabbitMQ itself.
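A many-consumers benchmark along these lines might be set up as below; everything here (the pool type, the counter, the names) is a hypothetical sketch, not the attached bench code, and it reuses the usings and types from the earlier sketch:

    using System.Diagnostics;
    using System.Linq;

    static async Task BenchAsync(int consumers, int itemsPerConsumer)
    {
        var pool = new BlockingWorkPool(); // swap in the implementation under test
        pool.Start();
        int done = 0;
        var sw = Stopwatch.StartNew();

        // N "consumers" (tasks) sharing one pool, as suggested above.
        var producers = Enumerable.Range(0, consumers).Select(_ => Task.Run(() =>
        {
            for (int i = 0; i < itemsPerConsumer; i++)
                pool.Enqueue(new WorkItem { Execute = () => Interlocked.Increment(ref done) });
        })).ToArray();
        await Task.WhenAll(producers);

        // Wait until every scheduled item has actually executed.
        while (Volatile.Read(ref done) < consumers * itemsPerConsumer)
            await Task.Delay(10);
        sw.Stop();
        Console.WriteLine($"{consumers} consumers: {done} items in {sw.ElapsedMilliseconds} ms");
    }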
So, if the items in the queue are allowed to complete when Stop is called, then we are queueing an action on a ConcurrentQueue only to queue it again on the thread pool queue, and we only need to stop queueing when Stop is called. If this is the case, we could skip the work pool and queue directly on the thread pool:

    internal class AsyncConsumerWorkService : ConsumerWorkService
    {
        // Models added here are treated as stopped and no longer scheduled.
        readonly SynchronizedList<IModel> workPools = new SynchronizedList<IModel>();
        bool go = true;

        public void Schedule<TWork>(ModelBase model, TWork work) where TWork : Work
        {
            // Dispatch straight to the thread pool unless the service
            // or this particular model has been stopped.
            if (go && !workPools.Contains(model)) Task.Run(() => work.Execute(model));
        }

        public void Stop(IModel model)
        {
            workPools.Add(model);
        }

        public void Stop()
        {
            go = false;
        }
    }
If it's truly async, then you cannot guarantee ordering. I reckon none of the operations require ordering, except maybe message delivery, and only for some applications. If that is the case, we should add all entries directly to the thread pool except message delivery, which will be performed sequentially on the background thread. So, there really are 3 use cases.

I've also added await/async for read operations to our version. The BinaryReader is more trouble than it's worth; BitConverter is a great class.
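Sketching the split @YulerB describes: deliveries stay ordered through one sequential consuming loop while everything else goes straight to the thread pool (hypothetical types again, reusing WorkItem from the earlier sketch):

    class SplitDispatcher
    {
        // Deliveries keep their relative order by flowing through a single
        // long-running consuming loop.
        readonly BlockingCollection<WorkItem> _deliveries = new BlockingCollection<WorkItem>();

        public SplitDispatcher()
        {
            Task.Factory.StartNew(() =>
            {
                foreach (var d in _deliveries.GetConsumingEnumerable())
                    d.Execute();
            }, TaskCreationOptions.LongRunning);
        }

        public void Dispatch(WorkItem work, bool isDelivery)
        {
            if (isDelivery)
                _deliveries.Add(work);          // sequential, ordered
            else
                Task.Run(() => work.Execute()); // concurrent, no ordering
        }
    }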
Async != concurrent
Async != single thread

The code above would offload an inherently IO-bound problem to the worker pool, which is not desirable. When the async consumer service was introduced, it was meant as a first step, an enabler for asynchronous consumer code. We knew that the current model code is still IO-bound yet blocking, but the consumer service would at least allow existing asynchronous third-party code to be executed in an asynchronous way and to be combined more naturally with the new async-enabled APIs. Happy to talk this through, but I think we need to clarify a few terms first so that we are all talking about the same thing (no offence meant).
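For readers following the thread: the async consumer service @danielmarbach refers to is what allows application code like the following to await freely (API shape as of the 5.x client; check the version you use, and note that SomeIoBoundWorkAsync is a placeholder for application logic):

    var factory = new ConnectionFactory { DispatchConsumersAsync = true };
    using (var connection = factory.CreateConnection())
    using (var channel = connection.CreateModel())
    {
        var consumer = new AsyncEventingBasicConsumer(channel);
        consumer.Received += async (sender, ea) =>
        {
            // Genuinely asynchronous third-party code can be awaited here.
            await SomeIoBoundWorkAsync(ea.Body); // placeholder
            channel.BasicAck(ea.DeliveryTag, multiple: false);
        };
        channel.BasicConsume(queue: "my-queue", autoAck: false, consumer: consumer);
        Console.ReadLine(); // keep the connection open while consuming
    }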
@michaelklishin, yes it would create a race condition if ordering was required. My use case doesn't require ordering for any of my consumers. I'm processing tons of messages concurrently now (thanks @danielmarbach).
@YulerB I don't subscribe to the idea that "if it's truly async you cannot guarantee ordering". You can guarantee per-channel dispatch ordering. Concurrently running consumer operations that require synchronisation are the application developer's concern, and libraries cannot fully avoid concurrency hazards.
@YulerB thanks for your time on this, but unless we have benchmarks that prove this is more efficient with a very large number of "consumers", this PR has no chance of getting in. It's not an obvious improvement according to the basic benchmarks we have, and the subtle behaviour changes that were discussed and tested in @danielmarbach's async dispatcher PR are completely overlooked here.
Also keep in mind that there are plans to develop a new .NET client from scratch targeting only .NET Core and the most recent C# version available. We expect that the work will start in Q1 next year. Many async/await-related design ideas should go there.
Thank you very much for everyone's effort on this PR. I'm going to close it, as it has been looked over enough.
Thank you, Vajda. We definitely appreciate your interest in improving the client. Your colleague's other PR is looking good so far; we just need to add a few new integration tests and let dependent projects give it a try before merging.
Further discussion on this PR:
#350 (comment)