Aggregating _ignored field values #59946
Comments
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
We had an initial discussion and agreed that it seems useful to be able to aggregate over the _ignored field.
|
Thanks for responding!
What's the cost difference for enabling doc_values?
I mean at scale there could be lots of _ignored fields, could the cost be
amplified hugely?
…On Mon, 3 Aug 2020 at 23:50, Julie Tibshirani ***@***.***> wrote:
We had an initial discussion and agreed that it seems useful to be able to
aggregate over the _ignored field. We didn't reach a final conclusion but
plan to continue the discussion. Other notes:
- We'd like to avoid adding another option to the mapping like "doc_values":
true. However we weren't sure if it was worth the cost to always
enable doc_values.
- One idea was to always enable doc_values, but stop adding the field
as a stored field. The fetch phase would switch to retrieve _ignored
from doc values.
|
We recently started looking at using |
@jimczi I addressed it. I hope I didn't misunderstand you. |
This came up again as we are using the |
I came across this issue while looking at the '_ignored' field mapper for #78981 and was wondering if we still want to pursue it. The following ideas came up while chatting:
If we find a solution that uses the existing way that '_ignored' is stored using "stored fields" and aggregate using runtime fields, this would have the additional benefit of working with existing indexes. |
I played around a bit more with runtime field scripts and think I missed an option. Using:
I was now able to aggregate over at least one value from the '_ignored' field. I didn't get multiple values working (which I think would be necessary for an aggregation), but that might already be possible with the right scripting magic. |
I think
We'd talked about a version of emit that took a list but never built it. It'd be more complex than it sounds. |
Pinging @elastic/es-search (Team:Search) |
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
When dealing with lots of data and lots of users, having the ignore_malformed option is great! And using it in combination with the _ignored field can give a lot of information about what is wrong with the data.
But unfortunately we can't aggregate on it, so it's hard to give an overview of which fields had issues in the last X hours.
I think it would be very useful to add .keyword for _ignored field, or make the _ignored field tokenised.
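Until something like that exists, a rough client-side workaround is to search for documents that have any ignored field and tally the `_ignored` values from the returned hits. The sketch below is only an illustration, not the requested feature: it assumes each hit carries `_ignored` as a top-level list of field names, that the index has a date field to filter on (the `time_field` argument is a placeholder you would set yourself), and that the result set is small enough to page through on the client. The helper names are made up for this example.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def build_ignored_query(time_field: str, last_hours: int, size: int = 1000) -> Dict[str, Any]:
    """Build a search body matching documents with at least one ignored field
    in the last `last_hours` hours. `time_field` is an assumed date field name."""
    return {
        "size": size,
        "_source": False,  # only hit metadata is needed for the tally
        "query": {
            "bool": {
                "filter": [
                    # _ignored is searchable with exists/term/terms queries
                    {"exists": {"field": "_ignored"}},
                    {"range": {time_field: {"gte": f"now-{last_hours}h"}}},
                ]
            }
        },
    }


def count_ignored_fields(hits: Iterable[Dict[str, Any]]) -> Counter:
    """Count how often each field name appears in the `_ignored` list of the hits.
    Hits without `_ignored` are skipped."""
    counts: Counter = Counter()
    for hit in hits:
        for field_name in hit.get("_ignored", []):
            counts[field_name] += 1
    return counts


if __name__ == "__main__":
    # Hand-written hits standing in for a real search response.
    sample_hits = [
        {"_id": "1", "_ignored": ["price", "created"]},
        {"_id": "2", "_ignored": ["price"]},
        {"_id": "3"},
    ]
    for name, n in count_ignored_fields(sample_hits).most_common():
        print(name, n)
```

This counts on the client rather than in the cluster, so it only gives an overview for as many hits as you are willing to fetch; a real aggregation on _ignored would avoid that limit.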