Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TSVB] Index patterns are incorrectly cached #96901

Closed
mattkime opened this issue Apr 12, 2021 · 10 comments · Fixed by #97043
Closed

[TSVB] Index patterns are incorrectly cached #96901

mattkime opened this issue Apr 12, 2021 · 10 comments · Fixed by #97043
Assignees
Labels
Feature:TSVB TSVB (Time Series Visual Builder) Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@mattkime
Copy link
Contributor

mattkime commented Apr 12, 2021

I was looking into a TSVB related performance problem when I came across this - https://github.com/elastic/kibana/pull/92812/files#diff-1b2036375d883a146d1c7e25c3d366dd1cb0ce30574ef4da7ce6d222ca6f4d23 - an endpoint specific index pattern cache. It addresses this issue - #93751

It appears that index patterns are cached by id. Unfortunately ids are not unique across spaces. This is a security vulnerability.

Further, I don't see a method of cache invalidation. Am I missing it?

It looks like the field list is loaded manually, addressing concerns about field level security, although its unclear from the code why the field list is being loaded in a non standard manner.

How should this be addressed?

First I'm curious about the test - are the index pattern saved objects large? Do they have full field lists? If so, it would be good to strip out the field list and try the test again.

Its interesting that caching the index pattern instance resolves the performance concern ALTHOUGH the field list is still loaded. That means the performance concern is related to loading the saved object or instantiating the index pattern instance. The field list should be the bulk of the index pattern size.

Taking a step back - server side index pattern caching should be handled on the service level. Any team that loads index patterns on back end calls would run into this problem.

@mattkime mattkime added bug Fixes for quality problems that affect the customer experience Feature:TSVB TSVB (Time Series Visual Builder) v8.0.0 v7.13.0 labels Apr 12, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Apr 12, 2021
@mattkime mattkime added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Apr 12, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Apr 12, 2021
@wylieconlon
Copy link
Contributor

I think this would be a blocker if we don't fix it before release.

@wylieconlon
Copy link
Contributor

We can probably fix the caching without reverting #92812

@mattkime
Copy link
Contributor Author

I spoke to core - they say that loading saved objects should be fast. Perhaps some of the index pattern initialization code is slow. It starts at src/plugins/data/common/index_patterns/index_patterns/index_patterns.ts with the getSavedObjectAndInit function.

@alexwizp alexwizp self-assigned this Apr 13, 2021
@alexwizp
Copy link
Contributor

alexwizp commented Apr 13, 2021

Not sure if the problem reported is happening on the TSVB side. Yes, we added some caching logic for TSVB, but let me emphasize that this caching works at the request level. This means that for each request, its own cache object is created, which will be automatically cleared by GC.

I want to make it clear that TSVB works with both low-level ES indices and Kibana indices. This explains why we cannot always use the standard approach.

Next I have some comments, @mattkime please correct me if I'm wrong:

It appears that index patterns are cached by id. Unfortunately ids are not unique across spaces. This is a security vulnerability.

It's not a problem, as a told you above our cache associated with user request and recreated per each request. Again we need it for optimizing performance cause some series can use the same index and field list.

Further, I don't see a method of cache invalidation. Am I missing it?

This object should be automatically reclaimed by the GC.

It looks like the field list is loaded manually, addressing concerns about field level security, although its unclear from the code why the field list is being loaded in a non standard manner.

We use the "standard way" to load the list of fields if the user is using "Kibana Index Mode". For low-level ES indexes we have only one way - getFieldsForIndexPattern from the data plugin. Is it wrong?

Taking a step back - server side index pattern caching should be handled on the service level. Any team that loads index patterns on back end calls would run into this problem.

Good idea, but e.g. 'find' method from the indexPatterns service works without caching. This and other prompted us to save the object on the TSVB side in a form that is convenient for us.

I think if we have some performance degradation or wrong caching it happened on data plugin side but we can discuss it....

@alexwizp
Copy link
Contributor

Maybe it's somehow related #87116

@alexwizp
Copy link
Contributor

To be honest, I think that the problem is that we do not have caching between requests. As a result, if the user adds several TSVB panels to the dashboard, then we are sending to request the same data for each panel.

But then again, if this is true, the solution should be service-level ...

@mattkime
Copy link
Contributor Author

@alexwizp It seems the error was in my understanding. Apologizes. Could you add some comments to the code so the intentions are clear in the future?

@mattkime mattkime removed the blocker label Apr 13, 2021
@alexwizp alexwizp removed :KibanaApp/fix-it-week bug Fixes for quality problems that affect the customer experience v7.13.0 v8.0.0 labels Apr 13, 2021
@dmlemeshko
Copy link
Member

To be honest, I think that the problem is that we do not have caching between requests. As a result, if the user adds several TSVB panels to the dashboard, then we are sending to request the same data for each panel.

I recently opened #96629 , just wonder if it describes the case you mention.

@wylieconlon
Copy link
Contributor

wylieconlon commented Apr 13, 2021

@dmlemeshko the problem of duplicate requests is a separate problem. I'd like to keep this discussion focused exclusively on the getCachedIndexPatternFetcher.

I just tested on master, and what I'm seeing is that the cachedIndexPatternFetcher is not caching anything. I added logging to see cache hits and cache misses, and then uninstalled and reinstalled the sample data for ecommerce. What I saw is that there were never any cache hits, only misses. Can we agree that this is the problem that should be solved?

Edit: it looks like the reason nothing is being cached is that the keys don't match. We are doing cache.set and cache.has on different keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:TSVB TSVB (Time Series Visual Builder) Team:Visualizations Visualization editors, elastic-charts and infrastructure
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants