[DataViews] Cache field attributes in SOs #135898
Pinging @elastic/kibana-app-services (Team:AppServicesSv)
Cache invalidation is a hard problem.
The cache does not last longer than a single request on the server. It's not clear to me what problem you're attempting to solve by caching the field list. Field caps API calls should be fast.
We recently learned that this statement is not entirely true (at least on 7.17). Field caps calls can even be costly enough to grind ES to a halt with OOMs.
The change was implemented with the assurance from ES engineers that field caps API calls are fast. As it turns out, that claim was not properly substantiated. They've since done work to make it accurate, although I'm not sure how much the 7.17 branch will benefit. Meanwhile, the caching of index pattern fields was quickly falling apart as solution field lists grew: it was possible to produce saved objects too large to save, which happened at around 4-5k fields. While performance problems are possible, the vast majority of our users have a better experience with an uncached field list, and the performance problems should be resolvable.
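A back-of-the-envelope sketch of why a 4-5k field list can blow past saved-object size limits. The field shape below loosely mirrors what field caps returns per field; the exact attribute names and the field count are assumptions for illustration, not Kibana's actual SO schema:

```typescript
// Illustrative only: estimate the serialized size of a cached field list.
interface CachedField {
  name: string;
  type: string;
  esTypes: string[];
  searchable: boolean;
  aggregatable: boolean;
}

function estimateFieldListBytes(fieldCount: number): number {
  const fields: CachedField[] = [];
  for (let i = 0; i < fieldCount; i++) {
    fields.push({
      // Solution field names tend to be long and deeply nested.
      name: `some.deeply.nested.solution.field_${i}`,
      type: 'keyword',
      esTypes: ['keyword'],
      searchable: true,
      aggregatable: true,
    });
  }
  // Byte length of the JSON payload that would land in the saved object.
  return new TextEncoder().encode(JSON.stringify(fields)).length;
}

console.log(`${estimateFieldListBytes(5000)} bytes for 5000 fields`);
```

Even with a conservative per-field shape this lands in the hundreds of kilobytes, and real mappings carry more attributes per field.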
Even if the fields were not indexed? Wow! 🤯
I totally trust your expertise in this matter. If you think it's not worth applying this change, I'm happy to close this issue. At least we'll have the reasons for not doing it documented :) One follow-up question, though: I noticed many apps (dashboard, discover, visualization, apm, ml) call
Field caps only returns indexed fields, as best I know.
FWIW I'm more than happy to have a discussion about this even if I'm vigorously defending the current state of things. Generally speaking, I'm happy to reconsider anything about data views. If a good field list caching strategy was presented, I'd be happy to use it.
Huh, I had no idea this was being used as widely as it is. I have no idea why it is, either. Are you familiar with any particular usage? As best I can tell, this functionality should probably be private.
I meant, the SO we store: we could store the fields returned by field caps in the
I don't think I have the knowledge to provide a good caching strategy for this. But we could probably have a Spacetime to explore options? 😇
When I was looking at the original SDH that resulted in this issue, I made a quick search in our codebase and found out that those apps are calling
I'm not following.
Wherever it's called, although I think it's only being called in the browser.
In this particular example, 20 separate field caps API requests are necessary due to field-level security. Each user might get a different field list.
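A quick sketch of why field-level security defeats a shared field-list cache: a correct cache would have to key on the user (or their FLS profile) as well as the data view, multiplying the entries rather than collapsing them. The class and names below are hypothetical, not Kibana APIs:

```typescript
// Sketch: a field-list cache that respects field-level security must be
// keyed per user, so it cannot deduplicate requests across users.
type FieldList = string[];

class PerUserFieldCache {
  private store = new Map<string, FieldList>();

  private key(dataViewId: string, userId: string): string {
    return `${dataViewId}:${userId}`;
  }

  get(dataViewId: string, userId: string): FieldList | undefined {
    return this.store.get(this.key(dataViewId, userId));
  }

  set(dataViewId: string, userId: string, fields: FieldList): void {
    this.store.set(this.key(dataViewId, userId), fields);
  }
}

const cache = new PerUserFieldCache();
// Same data view, different users, different visible fields:
cache.set('logs-*', 'analyst', ['@timestamp', 'message']);
cache.set('logs-*', 'admin', ['@timestamp', 'message', 'user.ssn']);
```

With 20 distinct FLS profiles, a cache like this holds 20 entries per data view, which is exactly the "20 separate field caps requests" situation above.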
You mentioned this in a previous comment: The caching of index pattern fields was quickly falling apart as solution field lists grew. It was possible to produce saved objects too large to save. This was happening with 4-5k fields. I'm wondering what the limitation was: the SO size? the number of fields the SO had? Any limits related to the payload byte size?
I just looked at the code and you are totally right! It is cleared only in the browser.
That is a very good point! I can see in the code that we don't use the server-side cached results to reply to those APIs. This makes sense once we take field-level security into consideration. I don't think I have any additional valid reasons to cache these. Plus, given the field-level security concerns and the recent improvements ES has made to that API's performance, I doubt we need to put more effort into this. I'm happy to close this issue if you think likewise :)
The SO size.
Sounds good. I'm always happy to revisit if something comes up.
At the moment, the DataViews service provides a caching mechanism to avoid reloading them (and their field caps) on every request.
However, this caching mechanism happens in-memory on the server. So restarts would effectively clear the cache.
Large clusters may have multiple Kibana nodes, multiple indices, and many Data Views, which snowballs into a multiplied number of Field Caps API requests.
Should we cache the field attributes in SOs? IMO, it would save us from performing many concurrent requests coming from multiple sources and hitting different Kibana instances. We could look at the SO's `updated_at` value to decide whether to invalidate the cache and fetch the fields again after a little while. What do you think?
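A minimal sketch of the proposed invalidation, assuming a hypothetical `fetchFieldCaps` callback and an arbitrary 5-minute TTL (neither is an existing Kibana API):

```typescript
// Sketch: treat the cached field list as stale once the SO's `updated_at`
// is older than a TTL, then refetch from the field caps API.
interface CachedFieldsSO {
  updated_at: string; // ISO timestamp stored on the saved object
  fields: string[];
}

const TTL_MS = 5 * 60 * 1000; // hypothetical 5-minute freshness window

function isStale(so: CachedFieldsSO, now: number = Date.now()): boolean {
  return now - Date.parse(so.updated_at) > TTL_MS;
}

async function getFields(
  so: CachedFieldsSO,
  fetchFieldCaps: () => Promise<string[]>
): Promise<string[]> {
  if (!isStale(so)) return so.fields; // cache hit: serve from the SO
  const fresh = await fetchFieldCaps(); // cache miss: hit field caps
  so.fields = fresh;
  so.updated_at = new Date().toISOString(); // would be persisted back to the SO
  return fresh;
}
```

Note this only bounds staleness; as discussed above, it would still need a per-user dimension to be correct under field-level security, which is a large part of why the idea was dropped.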