Improve keepalive performance in mongo connection pool #4517
Comments
The mentioned code corresponds to pingConnection(), which is used only at CB startup, so its impact is very limited.
fiware-orion/src/lib/mongoBackend/MongoGlobal.cpp Lines 228 to 239 in a0d75b3
The getOrionDatabases() function is invoked from subCacheRefresh(). This is done with a frequency of @rg2011 to confirm this theory... could you check if the "slow query" log regarding
Ref https://www.mongodb.com/docs/manual/reference/command/listDatabases/ Use
Yes, it's every 60 seconds.
PR #4530
The PR has been merged, but let's keep this issue open until it can be tested in the same environment where @rg2011 detected the problem.
This has been included in Orion 4.0.0. Pending a test in that environment before closing the issue.
Deployed 4.0.0 in the prod environment and confirmed a decrease in slow queries. Thanks!
Is your feature request related to a problem / use case? Please describe.
It is related to a performance problem. We have noticed a high rate of slow queries in our mongo deployment, regarding the listDatabases command:
The log has been redacted for brevity, but I've left the part about the locks in. It seems that the command acquires 139 - 140 locks, which might be the reason why it is so slow.
The source IP addresses of these requests belong to Orion servers. We have several of them in our multi-tenant deployment. It seems that fiware-orion uses the listDatabases command as a keepalive:
fiware-orion/src/lib/mongoDriver/mongoConnectionPool.cpp Lines 105 to 120 in a0d75b3
Deployed at scale, we are seeing around 300 - 500 ms per listDatabases request, as shown in the log. We would like to propose changing to a lighter command for keepalive, instead of listDatabases.
Describe the solution you'd like
Stop using listDatabases for keepalive in the mongo pool. Replace it with a less expensive command.
Describe alternatives you've considered
Really not much besides increasing the resources of the mongo servers or splitting the mongo databases across different replicasets, but both options seem much more costly than changing the keepalive method.
Describe why you need this feature
Slow queries have an overall impact on the cluster performance and might be degrading some of the actual work the cluster has to do.
Currently the listDatabases queries are not the only slow queries we have, but they amount to roughly 40% - 50% of all the slow queries in the replicaset.
Additional information
Do you have the intention to implement the solution
I can help with choosing a new command to use as keepalive in the pool. For instance, getParameter might be a good candidate, e.g. db.adminCommand({ getParameter:1, logLevel:1}).
I can also help evaluate the impact on performance once the command is changed.
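To make the proposal concrete, here is a minimal sketch of what a keepalive based on getParameter could look like. It is written in Python for illustration only (Orion itself is C++), and the names keepalive, AdminCommand and KEEPALIVE_COMMAND are hypothetical, not part of Orion or of any driver. The only thing taken from this issue is the command document itself.

```python
from typing import Any, Callable, Mapping

# Hypothetical type: any callable that sends an admin command to the server
# and returns the reply document (for example, a driver's "run command" method).
AdminCommand = Callable[[Mapping[str, Any]], Mapping[str, Any]]

# Command suggested in this issue as a cheaper keepalive than listDatabases.
# The command name must be the first key of the document.
KEEPALIVE_COMMAND = {"getParameter": 1, "logLevel": 1}


def keepalive(run_admin_command: AdminCommand) -> bool:
    """Return True if the server answers the keepalive command with ok == 1."""
    try:
        reply = run_admin_command(KEEPALIVE_COMMAND)
    except Exception:
        # Any driver or network error means the connection is not usable.
        return False
    return reply.get("ok") == 1
```

Assuming a PyMongo MongoClient named client, this could be called as keepalive(client.admin.command). Passing the sender in as a callable keeps the sketch independent of any particular driver, so the same check can be exercised against a stub when measuring the impact.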