-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make the outbox batch size configuable. #1209
Comments
Thanks @JakkeFiejest, this is a good idea. I actually peeked at the code and found that the query is already parameterized, we just have not exposed an API for setting that value. I'm not sure when we'll get around to doing it, as it would probably need to be grouped with other improvements to persistence stuff. As a workaround, I actually would encourage you to disable Outbox cleanup entirely within the endpoint and implement the cleanup as a SQL Agent task to run on a schedule. The query is pretty simple and aside from being tweakable to the batch size that works for you, it also has the advantage of having one canonical place to run. When an endpoint is scaled out, the individual endpoint instances don't have any way to coordinate and so they end up competing to clean the outbox, which is arguably even worse for the DTU model. |
Will this be tackled soon? small side remark, on azure database with a dtu model, you cannot schedule tasks (we found a way around it with a cron-job) |
For reference, the batch size for the outbox cleanup was changed from 10_000 to 4_000 (along with other lock related improvements) in: |
FYI: https://github.com/Particular/NServiceBus.Persistence.Sql/releases/tag/7.0.3 was released. Does that bring any benefits? Using larger values will be more likely to result in deadlocks. Even lower values would result in each cleanup interval to even require more sequential cleanup queries. @JakkeFiejest Even if we expose the batchsize making it smaller will only increase the number of queries for each cleanup interval. Could you please clarify to what value you would want to set it and what behavior requirements you would have when using that value? |
Hi,
At the moment we run the cleanup in a different application (because the
Azure DTU model does not support scheduling tasks)
But there we see benefits from the rowlock (we also use it in the cleaup)
and also the benifit from better using the query plan executer.
However, the big reason why the setting the cleanup interval size could be
useful, is because the DTU model allows you to scale the database to your
throughput, etc...
however deleting 4000 records from a heap table will screw up the dtu
model, because the transaction log that you have, is to small.
So basically you want to set your delete size to match what the dtu's allow.
(but the changes do improve the issues with the application crashing due to
outbox delete timeouts, because not all apps in our microservices landscape
have the outbox cleanup out of process, only the business critical ones)
KInd regards,
Jan
Op do 16 nov 2023 om 17:40 schreef Ramon Smits ***@***.***>:
… FYI:
https://github.com/Particular/NServiceBus.Persistence.Sql/releases/tag/7.0.3
was released. Does that bring any benefits?
Using larger values will be more likely to result in deadlocks. Even lower
values would result in each cleanup interval to even require more
sequential cleanup queries.
@JakkeFiejest <https://github.com/JakkeFiejest> Even if we expose the
batchsize making it smaller will only increase the number of queries for
each cleanup interval. Could you please clarify to what value you would
want to set it and what behavior requirements you would have when using
that value?
—
Reply to this email directly, view it on GitHub
<#1209 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJGG3KDNA6EMJVLPDJ5G2B3YEY6V7AVCNFSM6AAAAAAYHOUTWGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJUHAZDAMRTHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@JakkeFiejest So you would want to increase the size? The problem with increasing is that this likely will result in lock escalation. You likely also suggest to then not only increase the batch size but also the cleanup interval duration so that is runs less? Do you have example size for different DTU's or have a link to documentation on this? |
Describe the suggested improvement
At the moment the batch size when the outbox is cleaned up, is set to 10000.
Changing this value is not explosed in the public configuration.
This however causes issues when using Azure SQL-servers in a DTU model.
These DTU models are a super nice fit in for altering commands and we can easiliy grow the application and database (more load = more dtu's and more logical message processors)
It also allows for a good implementation to have one datasource per service.
But all those benifits gow out the window as soon as the outbox starts cleaning up and needs to clean more then 10000 messgaes in one cycle.
This is because cleaning that amount of messages takes up all DTU's when creating the log to rollbak the query.
And it can have bad results causing even a complete crash of the messaging pipeline when we have to much consecutive fails.
At the moment we only have two workarrounds: massivly overscaling the database or disabling the cleanup and moving it to a dedicated process/sql job. which are both lesser options.
It would be better if the batch size could be exposed via the endpoint configuration, escpecially because the value can be set via the internal constructor.
This whay the outbox can be also configured/tweaked with the expected load of the service.
Additional Context
No response
The text was updated successfully, but these errors were encountered: