Deprecate _field_stats endpoint #23914
`_field_stats` has evolved quite a lot to become a multi-purpose API capable of retrieving both the field capabilities and the min/max values for a field. In the meantime, a more focused API called `_field_caps` has been added. This endpoint is a good replacement for `_field_stats` since it can retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures). Also, the recent improvement made to range queries makes the `_field_stats` API obsolete, since these queries are now rewritten per shard based on the min/max found for the field. This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently. For these reasons this change deprecates `_field_stats`. The deprecation should happen in 5.4, but we won't remove this API in 6.x yet, which is why this PR is made directly to 6.0. The REST tests have also been adapted to not throw an error while this change is backported to 5.4.
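To make the per-shard rewrite mentioned above concrete, here is a minimal sketch in Python. It is not the actual Elasticsearch or Lucene code: the function name `classify_range_for_shard` and the three labels it returns are hypothetical, and it assumes inclusive bounds on a single numeric or date field whose shard-level min/max are already known.

```python
def classify_range_for_shard(shard_min, shard_max, gte, lte):
    """Compare an inclusive range [gte, lte] with a shard's [min, max].

    Hypothetical helper for illustration only. Returns one of:
      "disjoint"   - no document in the shard can match
      "within"     - every value in the shard falls inside the range
      "intersects" - the range has to be evaluated against the documents
    """
    # A shard with no values for the field cannot match anything.
    if shard_min is None or shard_max is None:
        return "disjoint"
    # The query range lies entirely below or above the shard's values.
    if lte < shard_min or gte > shard_max:
        return "disjoint"
    # The query range covers every value present in the shard.
    if gte <= shard_min and shard_max <= lte:
        return "within"
    return "intersects"
```

In this sketch, a shard classified as "disjoint" is the case the description refers to: the query can return immediately without touching the documents, and that empty result is cheap to cache.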
@@ -59,7 +59,14 @@ setup:

---
"Basic field stats":
  - skip:
      version: " - 5.99.99"
      reason: Deprecation was added in 6.0.0
I am a bit confused: are we deprecating in 5.4 or 6.0?
Is it to avoid breaking tests until this is backported to 5.4, as you mention in the PR description?
Yes, sorry for the confusion. I should have added a note here.
LGTM
Kinda late to the party on this one, but can we reconsider the name
https://www.elastic.co/blog/managing-time-based-indices-efficiently mentions using _field_stats to find indices that are old. With _field_stats gone, is there a way to quickly get the list of indices that are older than a certain point? I think the min/max aggregation would require querying every single index (or at least all the potentially old indices) to see if it is old enough.
We are using this to get the min and max for a field without having to do a search. Unless the old version was already implemented by doing a search, isn't changing our code to an aggregation just going to make it slower?
The field stats API was just looking at index stats, while aggs do a search all the time, so aggs will indeed be slower. We could look into making aggs look at index statistics too when the query matches all docs and there are no deletions.
Kibana 5.4+ is using the In our case, running an aggregation is not desirable because we have one billion documents each day. Is
@trevan Hi, I want to get the list of indices in the specified time range by timestamp field in Elasticsearch 6.3. How can I do this without
@ncepuwanghui, the only workaround I have is to do a search request with an aggregation on the indices and the min/max of the timestamp in each index. It is a LOT slower than _field_stats, so we are actively trying to get away from that workflow.
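For anyone landing here, a rough sketch of that workaround in Python follows. It only builds the request body and filters a response that has already been parsed into a dict; it does not talk to a cluster. The helper names, the aggregation names (`by_index`, `min_ts`, `max_ts`), the `@timestamp` default and the `max_indices` cap are all assumptions made for illustration. The sketch relies on a `terms` aggregation over the `_index` metadata field with `min`/`max` sub-aggregations, and on date min/max values coming back as epoch milliseconds.

```python
def build_index_bounds_request(timestamp_field="@timestamp", max_indices=1000):
    """Body for a search that returns min/max of a field per index."""
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "aggs": {
            "by_index": {
                # One bucket per index that holds matching documents.
                "terms": {"field": "_index", "size": max_indices},
                "aggs": {
                    "min_ts": {"min": {"field": timestamp_field}},
                    "max_ts": {"max": {"field": timestamp_field}},
                },
            }
        },
    }


def indices_overlapping(response, start_ms, end_ms):
    """Names of indices whose [min, max] overlaps [start_ms, end_ms]."""
    names = []
    for bucket in response["aggregations"]["by_index"]["buckets"]:
        lo = bucket["min_ts"]["value"]
        hi = bucket["max_ts"]["value"]
        if lo is None or hi is None:
            continue  # the index has no values for the field
        if hi >= start_ms and lo <= end_ms:
            names.append(bucket["key"])
    return names
```

The body would be sent to `_search` across the candidate indices. As noted above, this runs a real search on every one of them, so it is expected to be much slower than `_field_stats` was.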
We were in need of this functionality as well and added it back via a plugin: https://github.com/sematext/elasticsearch-field-stats Adding the link here in case it's useful for someone.