Support large saved object indices consuming 10s of GBs #147852
Pinging @elastic/kibana-core (Team:Core)
First, I'd like to understand what reasonable maximum size we would expect to cover 99.9% of our customers' usage. If we're talking about 100GB, my gut feeling is that increasing the shard count to 2 or 3 by default could be an acceptable and pragmatic compromise. Also, do we have any data on the per-type distribution for large saved object installations (a sketch of how we could measure that is below)? Because if it's not just one type taking 90% of the total size, splitting our indices per group of types (as we're currently discussing) would also help here.
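As a rough illustration, the per-type distribution could be measured with a terms aggregation on the saved object `type` field. A minimal sketch using the @elastic/elasticsearch client (the node address and client setup are assumptions):

```ts
import { Client } from '@elastic/elasticsearch';

// Assumes a locally reachable cluster; adjust node/auth for a real deployment.
const client = new Client({ node: 'http://localhost:9200' });

async function savedObjectTypeDistribution() {
  // Count saved objects per type; `type` is a keyword field in the .kibana index.
  const res = await client.search({
    index: '.kibana',
    size: 0,
    aggs: {
      types: { terms: { field: 'type', size: 50 } },
    },
  });
  const buckets = (res.aggregations as any)?.types?.buckets as Array<{
    key: string;
    doc_count: number;
  }>;
  for (const { key, doc_count } of buckets) {
    console.log(`${key}: ${doc_count} documents`);
  }
}

savedObjectTypeDistribution().catch(console.error);
```

Document counts are only a proxy for on-disk size; a closer per-field picture would need the index disk usage API.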
What about environments where downtime is not acceptable? I feel like that single constraint makes this option a no-go, wdyt?
I agree that if nothing else works, it may be something we would have to look at. The implications are so significant, though, at various levels of the SOR and migration systems, that we would need a very strong reason to consider the option worth it imho.
I'm nervous about this approach mainly because it sounds so similar to the "100K saved objects migration takes 10 minutes" capacity-guessing problem that led to us wanting to restrict migrations. My question with this kind of thing would always be: "what happens if a customer has 10x the upper limit of our expectations?" Is there a simple workaround for that scenario?
I agree with Pierre that (1) would at least buy us some time. Manually resharding the index would always be a last-resort workaround (for 10x or 100x the data size), but the downtime might be non-negotiable for some users. Still, 1-3 aren't really good long-term options. I've added (4) and (5), which are options we're exploring with the Elasticsearch team.
While #144035 will reduce the upgrade downtime of clusters with millions of saved objects, large indices with GBs of data introduce new challenges. The general Elasticsearch guidance is to keep shards between 10GB and 50GB in size, while the saved object indices always use a single shard. We currently only have a handful of customers with .kibana indices larger than 10GB, but this number is likely to grow.
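For reference, a quick way to check whether an existing `.kibana` index is approaching that guidance is to list its shards and their store sizes. A minimal sketch with the @elastic/elasticsearch client (the node address is an assumption):

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

async function main() {
  // List all .kibana* shards with their on-disk store size, so oversized
  // (>50GB) primary shards stand out.
  const shards = await client.cat.shards({
    index: '.kibana*',
    format: 'json',
    h: 'index,shard,prirep,store',
  });
  console.table(shards);
}

main().catch(console.error);
```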
There are a few options to mitigate this problem:

1. Increase the default number of primary shards for the saved object indices. This consumes unnecessary shards for small clusters but improves the scalability for larger clusters.
2. Re-shard an existing index by reindexing it into a new index with more shards. Because this requires a reindex, it will cause downtime; given that the reason for a re-shard is a large cluster, such downtime would be significant (a sketch of this procedure follows below).
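To make the downtime concern concrete, a reindex-based re-shard would look roughly like the following. This is a sketch only (index and alias names, shard count, and client setup are assumptions; the real saved object index names are versioned), and writes to the old index must be blocked for its full duration:

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Hypothetical names for illustration.
const OLD_INDEX = '.kibana_old';
const NEW_INDEX = '.kibana_resharded';
const ALIAS = '.kibana';

async function reshardViaReindex() {
  // 1. Block writes (index.blocks.write) so no documents are lost while
  //    copying. This is the start of the downtime window.
  await client.indices.putSettings({
    index: OLD_INDEX,
    settings: { blocks: { write: true } },
  });

  // 2. Create the target index with more primary shards.
  //    (Mappings omitted for brevity; a real re-shard must copy them too.)
  await client.indices.create({
    index: NEW_INDEX,
    settings: { number_of_shards: 3 },
  });

  // 3. Copy every document; for a multi-GB index this is the slow part.
  await client.reindex({
    source: { index: OLD_INDEX },
    dest: { index: NEW_INDEX },
    wait_for_completion: true,
  });

  // 4. Atomically point the alias at the new index, ending the downtime.
  await client.indices.updateAliases({
    actions: [
      { remove: { index: OLD_INDEX, alias: ALIAS } },
      { add: { index: NEW_INDEX, alias: ALIAS } },
    ],
  });
}

reshardViaReindex().catch(console.error);
```

Alternatively, the index `_split` API can grow the shard count without copying every document through a full reindex, though it too requires a write block on the source index.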