Reasons for not using saved objects for storing kibana data #80912
Comments
@kobelb Thanks for starting this discussion. It's been a while since we decided to go with a dedicated system index over saved objects so I might be forgetting some details, and SO might have changed. Overall I think it boiled down to limitations in querying abilities. For agent configuration we need to filter documents using boolean logic and operators like This is an example of the query we make to retrieve an agent configuration: kibana/x-pack/plugins/apm/server/lib/settings/agent_configuration/search_configurations.ts Lines 27 to 66 in 71f4c08
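For illustration only (this is not the code at the referenced lines), a boolean query of this kind could be built roughly as below. The field names `service.name` and `service.environment` and the helper names are assumptions made for the sketch; the `bool`, `should`, `must_not`, `exists` and `term` clauses are standard Elasticsearch query DSL.

```typescript
// Hypothetical lookup shape; the field names used below are assumptions.
interface ServiceRef {
  name?: string;
  environment?: string;
}

type QueryClause = Record<string, unknown>;

// Match documents whose field equals `value`, or that omit the field entirely.
// With no value given, only documents that omit the field match.
function matchOrMissing(field: string, value?: string): QueryClause {
  const missing: QueryClause = { bool: { must_not: [{ exists: { field } }] } };
  if (value === undefined) {
    return missing;
  }
  return {
    bool: {
      should: [{ term: { [field]: value } }, missing],
      minimum_should_match: 1,
    },
  };
}

// Combine both conditions so that a document has to satisfy each of them.
function buildConfigQuery(service: ServiceRef): QueryClause {
  return {
    bool: {
      filter: [
        matchOrMissing('service.name', service.name),
        matchOrMissing('service.environment', service.environment),
      ],
    },
  };
}
```

The nested `should` / `must_not` combination is the part that is hard to express through a flat key-value filter.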
Is this something that's possible today?
For those following along, I recently added Reporting to the above description. They're using their own system-indices because the report output is stored in base64 encoded fields, which creates large documents.
As of 7.10, Kibana stores session information in the The data is meant to be ephemeral, as Kibana will periodically clean up sessions that are no longer valid. These indices are not meant to be consumed by end-users directly, and the more interesting contents are encrypted anyway.
A key reason we opted to use a data index (.siem-signals-*) instead of a SO for signals/alerting support was that users want to be able to create dashboards and use Discover to query their alerting data, which you cannot do with saved objects at this time. It would be nice to have dashboard/first-class query support for saved objects like we have for data indices.
We wanted to treat annotations like just another index that users can query in Discover, visualize etc. We also wanted to stay ECS compatible (again, to make querying easier).
@FrankHassanabad and @sqren, if we want end-users to be able to query these indices directly, I wouldn't recommend storing them as saved-objects at this time. However, as I've mentioned elsewhere, I'd recommend storing them as
That's correct.
I've added another section, "Reasons to not use the saved objects client". Here are some code references to existing code that works around the limitations I've mentioned; I felt they would bloat the issue description too much: kibana/x-pack/plugins/task_manager/server/task_store.ts Lines 606 to 620 in 5dfa45d
kibana/src/plugins/kibana_usage_collection/server/collectors/kibana/get_saved_object_counts.ts Lines 70 to 72 in 9ca2238
kibana/x-pack/plugins/task_manager/server/task_store.ts Lines 283 to 292 in 5dfa45d
Updated the issue now that saved objects supports paging through more than 10k saved objects. I kept the "There are too many saved-objects" section, but changed it to be about the scalability of migrations and export. |
Pinging @elastic/kibana-core (Team:Core)
A majority of Kibana's entities are persisted in saved-objects. However, there's a growing number of non-saved-object Elasticsearch indices that are being used to store Kibana specific entities. The following are the ones that I'm currently aware of:
.kibana-event-log-*
.apm-agent-configuration
.apm-custom-link
.siem-signals-*
.lists and .values
.reporting-*
I've started this discuss issue to determine what other Elasticsearch indices are being used to store Kibana-specific entities, and to enumerate the reasons why they aren't being stored as saved-objects. Saved-objects provide a number of features, including migrations, authorization, audit logging, export/import, space awareness, and encrypted attributes, that developers forgo when using non-saved-object ES indices.
I'd like to perform this exercise to ensure that there aren't limitations that should be addressed with saved-objects to make them applicable to other use-cases or figure out which current saved-object specific features should be made available when using non-saved-object ES indices.
Reasons we haven't used saved-objects
End-users should be able to query the indices directly
Saved-objects are stored in a "system index", and as such, end-users will not be able to query these indices directly starting in 8.0. Even if end-users could theoretically query system-indices, we treat the ES document format as an implementation detail of saved-objects, and it is prone to change between minor versions in a non-backward-compatible manner, so end-users shouldn't be querying them directly.
Applies to: Alerting's event log, Detection engine signals
There are too many saved-objects
The SIEM team has outlined a few of the issues that they experienced when trying to model their lists using saved-objects in #64715. Notably, SavedObjectsClient#find's paging implementation doesn't function properly when there are more than 10k results, which is being tracked by #77961.
Applies to: Security solution lists
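The 10k limit matches Elasticsearch's default `index.max_result_window` of 10,000 for offset-based paging. The usual way past it is cursor-style paging, where each request passes back the sort value of the last hit (Elasticsearch's `search_after`). The sketch below shows that loop against an in-memory stand-in; `FetchPage`, `collectAll` and `fetchFromMemory` are hypothetical names, and this is not a description of how `SavedObjectsClient#find` is implemented.

```typescript
// A hit carries a sort value that the next request passes back as its cursor.
interface Hit<T> {
  sort: number;
  doc: T;
}

// Hypothetical page fetcher: returns up to `size` hits in ascending sort order,
// strictly after `searchAfter` when one is given.
type FetchPage<T> = (size: number, searchAfter?: number) => Hit<T>[];

// Keep requesting pages until an empty one comes back, moving the cursor forward each time.
function collectAll<T>(fetchPage: FetchPage<T>, size: number): T[] {
  const out: T[] = [];
  let cursor: number | undefined;
  for (;;) {
    const hits = fetchPage(size, cursor);
    if (hits.length === 0) {
      return out;
    }
    for (const hit of hits) {
      out.push(hit.doc);
    }
    cursor = hits[hits.length - 1].sort;
  }
}

// In-memory stand-in for an index holding more than 10,000 documents.
const stored: Hit<string>[] = [];
for (let i = 0; i < 25000; i++) {
  stored.push({ sort: i, doc: `doc-${i}` });
}

const fetchFromMemory: FetchPage<string> = (size, searchAfter) => {
  // Sort values equal array positions in this stand-in, so the cursor maps to an offset.
  const start = searchAfter === undefined ? 0 : searchAfter + 1;
  return stored.slice(start, start + size);
};
```

Calling `collectAll(fetchFromMemory, 1000)` walks all 25,000 documents in 1,000-document pages without ever asking for an offset beyond the window.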
Documents are too large
Reporting is using its own dedicated .reporting-* indices because they include base64 encoded data for the generated CSVs, PDFs and PNGs. Since these documents are generally so large, they can't be migrated using saved-object migrations, and they're created on a weekly basis.
Applies to: Reporting
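As a rough sense of scale, base64 encodes every 3 bytes as 4 characters, so the stored field is about a third larger than the binary report. A small sketch of that arithmetic (the 10 MiB figure is an arbitrary example, not a measured report size):

```typescript
// Base64 maps every 3 input bytes to 4 output characters, padding the final group.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Arbitrary example size for a generated report.
const tenMiB = 10 * 1024 * 1024;
const encodedSize = base64Length(tenMiB); // 13981016 characters, roughly 33% larger
```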
Aggregations
Plugins wanting to run aggregations cannot use the saved objects client (we have made good progress in #64002 but it might take some time for plugins to adopt it).
In addition, it will not be possible to use a query to limit the documents to aggregate over. One workaround is to use a KQL filter, but this impacts performance and is discouraged by the ES team (#69172).
Applies to: APM Agent Configuration
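For comparison, when a plugin talks to its own index it can pair a query with an aggregation in a single search body. The sketch below builds such a body; `buildCountByFieldBody` is a hypothetical helper, and the `type` field and `<type>.<attribute>` layout are assumptions about the stored document format, which the issue itself calls an implementation detail.

```typescript
type Clause = Record<string, unknown>;

interface AggSearchBody {
  size: number;
  query: Clause;
  aggs: Clause;
}

// Build a search body that first narrows the documents with a query,
// then buckets the remaining ones by the values of one attribute.
function buildCountByFieldBody(type: string, attribute: string, filters: Clause[]): AggSearchBody {
  const typeFilter: Clause = { term: { type } };
  return {
    size: 0, // only aggregation results are needed, no hits
    query: { bool: { filter: [typeFilter].concat(filters) } },
    aggs: { by_value: { terms: { field: `${type}.${attribute}` } } },
  };
}
```

The `query` section is what restricts the set of documents the `terms` aggregation runs over.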
Filtering on update / delete queries
It's not possible to efficiently delete or update many documents without doing these operations over all documents of a certain saved object type.
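With direct index access, the equivalent operation is Elasticsearch's `_delete_by_query` (or `_update_by_query`) with a query that selects only the affected documents. A minimal sketch of such a request body follows; `buildDeleteByQueryBody` and the `task.runAt` field are hypothetical, and the `type` field is again an assumption about the stored format.

```typescript
type Clause = Record<string, unknown>;

// Body for a delete-by-query request limited to one saved object type
// plus any extra filter clauses supplied by the caller.
function buildDeleteByQueryBody(type: string, extraFilters: Clause[]): { query: Clause } {
  const typeFilter: Clause = { term: { type } };
  return { query: { bool: { filter: [typeFilter].concat(extraFilters) } } };
}

// Hypothetical usage: select task documents scheduled before a cutoff date.
const deleteBody = buildDeleteByQueryBody('task', [
  { range: { 'task.runAt': { lt: '2020-01-01T00:00:00Z' } } },
]);
```

Only documents matching every filter clause are touched, instead of loading and rewriting all documents of the type.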
Filtering on nested fields
Filter validation fails when writing a KQL query for nested field types (#81009).