Default terminate_after for autocomplete requests to unlimited #38073
Pinging @elastic/kibana-app
I'm wary of changing the default. We generally don't want to ship features that don't scale out of the box. If we make the setting configurable, I think the question we have to ask ourselves is which is worse:

A. Killing a cluster unexpectedly, so the admin has to figure out why, find this setting, change it, and get everything running again.

B. A user gets inaccurate results, complains to the admin and gives them the details of the warning message, and the admin carefully increases the setting and makes their user happy.
Do we have a lot of users actually complaining about this? I haven't heard of many, but maybe I'm out of the loop.
Not just resource usage but also usability. Autocomplete needs to be fast in order to be useful. If every update takes a full second I think it will be annoying.
I might be wrong, or my knowledge might be out of date, but I think the timeout may not actually kill the task on the ES side. If …
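For context, here is a rough sketch (hypothetical index and field, not Kibana's exact query) of how the two mechanisms discussed here interact on an ES terms request. As far as I know, `timeout` only makes ES return the partial results collected so far (flagged by `timed_out: true` in the response), while `terminate_after` deterministically stops collection after N documents per shard (flagged by `terminated_early: true`):

```
GET /my-logs/_search
{
  "timeout": "1s",
  "terminate_after": 100000,
  "size": 0,
  "aggs": {
    "suggestions": {
      "terms": { "field": "host.keyword", "include": "web.*" }
    }
  }
}
```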
Thanks for your input @Bargs
Does the current default setting scale? On clusters with a lot of data it won't provide accurate terms lists.
You are right, but we also have to consider the frequency of these events. If 5000 people have to go through the effort of reconfiguring their cluster, it might be worth having one user with a cluster crash where support can help figure it out. This might be my missing experience, but the chance of these requests killing the server seems pretty small in contrast to big dashboards, which fire lots of requests with 30s timeouts. I'm aware there will be multiple autocomplete requests per user, but not multiple requests per second per user.
There were at least two support cases I've heard of in the last two weeks and the issue #17388 got some reactions and repeated questions.
Not having complete results is also annoying. I might be wrong here, but I think users tolerate a slightly longer waiting time more than not finding the option they are looking for. My point is: if we recommend users increase the limit as soon as it gets hit, why bother setting it to this value at all? It's then just an extra step for them to configure.
You are right, the 1s timeout is no hard upper limit; the actual request might take a bit longer.
If the request times out instead of terminating early because of `terminate_after`, I totally see your point. The thing which isn't clear to me, and which I might have underestimated, is how resource-intensive and/or slow these kinds of queries get on big clusters without the `terminate_after` limit.
We have a 100ms debounce on the suggestions update, so technically there could be 10 requests per user per second.
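For illustration, a minimal sketch of how a 100ms debounce bounds the request rate (hypothetical names, not Kibana's actual implementation):

```ts
// Each keystroke resets the timer, so a user who types continuously
// triggers at most ~10 suggestion requests per second.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Args) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical stand-in for the real suggestions call.
const fetchSuggestions = (query: string): void => {
  console.log(`GET /suggestions?q=${encodeURIComponent(query)}`);
};

const debouncedFetch = debounce(fetchSuggestions, 100);

// Rapid keystrokes collapse into a single request after 100ms of quiet.
for (const q of ["w", "we", "web"]) debouncedFetch(q);
```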
This I can't answer for certain. @lukasolson and @jpountz helped with these optimizations so they might have some input on that.
These requests can certainly be very resource-intensive on large datasets; we need some protection anyway in my opinion.
Awesome, thanks for your input, @jpountz
I've been going through some issues to understand the current state of this issue and others related to it. As a user who is encountering the notice to adjust autocomplete settings in Kibana in all my clusters, I'd love to be able to address it. From my understanding there is a 1s timeout for populating the fields that go into the control visualization. Is this still a hardcoded value, or can it be changed (as far as I can tell the PR to allow this still hasn't made it to master)? Sorry if this isn't the best way to talk about this.
The value can be configured via the kibana.yml setting `kibana.autocompleteTerminateAfter`.
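For reference, a kibana.yml sketch with both related settings (names as documented for 7.x; defaults shown in the comments — verify against your own version):

```yaml
# Max number of documents each shard examines when generating
# autocomplete suggestions (default: 100000).
kibana.autocompleteTerminateAfter: 1000000

# Time in milliseconds to wait for suggestions from ES (default: 1000).
kibana.autocompleteTimeout: 2000
```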
Guess I'm running into another "Not yet in Cloud" situation: is there a way of seeing which options are available for cloud deployments?
cc @elastic/cloud-pms
Also had a similar issue - the only workaround was setting `kibana.autocompleteTerminateAfter`.
Was that in Cloud, @maggieghamry?
@0xtf ECE (Elastic Cloud Enterprise)
Closing as it seems to be possible to configure `kibana.autocompleteTerminateAfter`.
Currently, ES queries to fetch terms for autocomplete suggestions in filter controls and the query bar have a hard-coded `terminate_after` limit to keep server load low. With #17388 a setting for this value is introduced to be able to get better results in some circumstances.

IMHO it makes sense to default to no `terminate_after` limit in these requests, but provide a setting to set one. This setting basically comes down to a tradeoff between accuracy and resource usage. With a default of 100000 for `terminate_after`, the requests are very fast but often not accurate, which is very annoying for users because they are not able to select terms they know are present in their dataset. I'm always in favor of preventing a "Killbana" situation, but I think a more sensible default would be to be as accurate as possible and provide the option to regain speed in clusters where this becomes a problem.

In addition to `terminate_after` there is also a timeout of 1 second, so the request wouldn't run for 30s like requests from visualizations, even without `terminate_after`. So this change doesn't seem like a cluster killer, and if it does become one in certain situations, the warning label in #37643 pops up and hints at the problem.

Because this is a breaking behavior change, we should introduce it starting with 8.0 and label it as a breaking change.
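To make the proposal concrete, a hedged sketch (hypothetical index and field, same shape as the earlier example but without the `terminate_after` cap) of what the suggestion request would look like with this change, keeping the 1s `timeout` as the only backstop:

```
GET /my-logs/_search
{
  "timeout": "1s",
  "size": 0,
  "aggs": {
    "suggestions": {
      "terms": { "field": "host.keyword", "include": "web.*" }
    }
  }
}
```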
There are some assumptions in this and I could be totally wrong about them. What do you think, @timroes, @nreese, @Bargs?