Cannot use missing in terms aggregation for non string fields #17454

Closed
timroes opened this issue Mar 29, 2018 · 5 comments
Labels
enhancement New value added to drive a business result Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@timroes
Contributor

timroes commented Mar 29, 2018

We currently enable the "missing" and "group other" options in the terms aggregation for any field. Unfortunately, selecting a non-string field throws the following errors, because __missing__ cannot be used as a bucket key for non-string fields.

Numeric field

Error: Request to Elasticsearch failed: {"error":{"root_cause":[{"type":"number_format_exception","reason":"For input string: \"__missing__\""}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"logstash-0","node":"vF6KrqTxREOLVwtP_slGqw","reason":{"type":"number_format_exception","reason":"For input string: \"__missing__\""}}],"caused_by":{"type":"number_format_exception","reason":"For input string: \"__missing__\"","caused_by":{"type":"number_format_exception","reason":"For input string: \"__missing__\""}}},"status":400}
    at http://localhost:5601/ezz/bundles/commons.bundle.js:82597:36
    at Function.Promise.try (http://localhost:5601/ezz/bundles/commons.bundle.js:66218:20)
    at http://localhost:5601/ezz/bundles/commons.bundle.js:66187:25
    at Array.map (<anonymous>)
    at Function.Promise.map (http://localhost:5601/ezz/bundles/commons.bundle.js:66186:28)
    at callResponseHandlers (http://localhost:5601/ezz/bundles/commons.bundle.js:82569:20)
    at http://localhost:5601/ezz/bundles/commons.bundle.js:82091:14
    at processQueue (http://localhost:5601/ezz/bundles/vendors.bundle.js:184286:37)
    at http://localhost:5601/ezz/bundles/vendors.bundle.js:184330:27
    at Scope.$digest (http://localhost:5601/ezz/bundles/vendors.bundle.js:185468:15)

IP field

Error: Request to Elasticsearch failed: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"'__missing__' is not an IP string literal."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"logstash-0","node":"vF6KrqTxREOLVwtP_slGqw","reason":{"type":"illegal_argument_exception","reason":"'__missing__' is not an IP string literal."}}],"caused_by":{"type":"illegal_argument_exception","reason":"'__missing__' is not an IP string literal.","caused_by":{"type":"illegal_argument_exception","reason":"'__missing__' is not an IP string literal."}}},"status":400}
    at http://localhost:5601/ezz/bundles/commons.bundle.js:82597:36
    at Function.Promise.try (http://localhost:5601/ezz/bundles/commons.bundle.js:66218:20)
    at http://localhost:5601/ezz/bundles/commons.bundle.js:66187:25
    at Array.map (<anonymous>)
    at Function.Promise.map (http://localhost:5601/ezz/bundles/commons.bundle.js:66186:28)
    at callResponseHandlers (http://localhost:5601/ezz/bundles/commons.bundle.js:82569:20)
    at http://localhost:5601/ezz/bundles/commons.bundle.js:82091:14
    at processQueue (http://localhost:5601/ezz/bundles/vendors.bundle.js:184286:37)
    at http://localhost:5601/ezz/bundles/vendors.bundle.js:184330:27
    at Scope.$digest (http://localhost:5601/ezz/bundles/vendors.bundle.js:185468:15)

Date field

failed to parse date field [missing] with format [strict_date_optional_time||epoch_millis]

I think we could try to use other keys for non-string fields that match the field type. Until we have figured out whether we can work around these errors, we should disable the settings for non-string fields.
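The interim measure described above (only offering the missing bucket for string fields) could look roughly like the sketch below. It is illustrative only: `FieldType`, `supportsMissingBucket`, and `buildGuardedTermsAgg` are hypothetical names, not Kibana's actual API, and the field type names are assumptions.

```typescript
// Hypothetical field type names; Kibana's real field types may differ.
type FieldType = "string" | "number" | "ip" | "date" | "boolean";

interface TermsAggBody {
  field: string;
  size: number;
  missing?: string;
}

// The placeholder bucket key that appears in the errors above.
const MISSING_KEY = "__missing__";

// Only string fields accept an arbitrary string as a bucket key.
function supportsMissingBucket(type: FieldType): boolean {
  return type === "string";
}

// Build the terms aggregation body, adding "missing" only when it is safe.
function buildGuardedTermsAgg(
  field: string,
  type: FieldType,
  size: number,
  wantMissing: boolean
): TermsAggBody {
  const body: TermsAggBody = { field, size };
  if (wantMissing && supportsMissingBucket(type)) {
    body.missing = MISSING_KEY;
  }
  return body;
}
```

With this guard, a numeric or IP field never receives the string key, so the `number_format_exception` and `illegal_argument_exception` shown above would not be triggered by the option.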

@timroes timroes added bug Fixes for quality problems that affect the customer experience Feature:Visualizations Generic visualization features (in case no more specific feature label is available) labels Mar 29, 2018
@LucaWintergerst
Contributor

My suggestion would be to let the user choose the desired key, since the right choice depends on the data they are looking at. We should also add a warning that this can lead to incorrect results if the key appears in the data.
Another addition could be to "translate" that key into missing if desired (e.g. the user chooses 0 as their missing key, we run the agg with "missing": 0, but then display the result of bucket 0 as missing).
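A minimal sketch of that "translate" idea, assuming the response buckets have the usual `key` and `doc_count` shape. The function names `termsWithUserMissingKey` and `labelMissingBucket` are hypothetical and only illustrate the flow:

```typescript
interface Bucket {
  key: string | number;
  doc_count: number;
}

interface LabeledBucket {
  label: string;
  doc_count: number;
}

// Run the terms aggregation with a user-chosen, type-compatible missing key.
function termsWithUserMissingKey(field: string, missingKey: string | number) {
  return { terms: { field, missing: missingKey } };
}

// When rendering, show the bucket with the chosen key under the missing label.
function labelMissingBucket(
  buckets: Bucket[],
  missingKey: string | number,
  missingLabel: string
): LabeledBucket[] {
  return buckets.map((b) => ({
    label: b.key === missingKey ? missingLabel : String(b.key),
    doc_count: b.doc_count,
  }));
}
```

As noted in the comment above, this is only correct if the chosen key does not occur in the data: documents that really contain 0 would land in the same bucket as documents without a value.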

@timroes
Contributor Author

timroes commented Mar 29, 2018

The problem is that we already set the "missing label", which the user can specify freely, but we would also need a way to specify the real bucket key we want to use. I am not sure the user should be able to specify that, because it is rather technical and requires some ES knowledge. But auto-generating it will also be hard: for example, what valid IP would we generate for the missing IP bucket? Maybe we should just enforce that the label the user specifies is a valid value for that type, and then use that value as the key as well. I would like to wait for @ppisljar for some discussion on this topic, since he built the feature initially and is far more familiar with the code than I am.
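Enforcing that the label is a valid value for the field type could be sketched as below. This is an approximation under stated assumptions: `isValidMissingKey` and `isValidIPv4` are hypothetical helpers, the IP check covers IPv4 only, and `Date.parse` is looser than the `strict_date_optional_time||epoch_millis` format in the error above, so Elasticsearch may still reject some values that pass here.

```typescript
type KeyType = "number" | "ip" | "date";

// Simplified IPv4 check: four dot-separated decimal parts, each 0 to 255.
function isValidIPv4(value: string): boolean {
  const parts = value.split(".");
  return (
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)
  );
}

// Check whether a user-supplied label could also serve as the bucket key.
function isValidMissingKey(value: string, type: KeyType): boolean {
  switch (type) {
    case "number":
      return value.trim() !== "" && isFinite(Number(value));
    case "ip":
      return isValidIPv4(value);
    case "date":
      return !isNaN(Date.parse(value));
  }
}
```

A form could call `isValidMissingKey(label, fieldType)` before sending the request and show a validation message instead of letting the shard failure surface.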

@timroes timroes added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Sep 16, 2018
@ypid-geberit

As a workaround, the "JSON Input" under "Advanced" can be used for now. Example for a boolean field:

{
  "missing": "true"
}

@timroes timroes added enhancement New value added to drive a business result Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) and removed bug Fixes for quality problems that affect the customer experience Feature:Visualizations Generic visualization features (in case no more specific feature label is available) labels Oct 24, 2018
@ghudgins
Contributor

You can re-cast data types using runtime fields to work around this issue.
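One way this workaround might look, as a sketch rather than a confirmed recipe: define a `keyword` runtime field that emits the original value as a string, then run the terms aggregation with a string `missing` key on that field. The helper `termsOnRecastField`, the aggregation name `by_value`, and the field names in the usage note are assumptions for illustration, and the Painless script should be verified against your Elasticsearch version.

```typescript
// Build a search body that re-casts a non-string field to keyword at query time.
function termsOnRecastField(sourceField: string, runtimeField: string) {
  return {
    size: 0,
    runtime_mappings: {
      [runtimeField]: {
        type: "keyword",
        script: {
          // Emit nothing for documents without a value so they stay "missing".
          source:
            `if (doc['${sourceField}'].size() != 0) ` +
            `{ emit(String.valueOf(doc['${sourceField}'].value)); }`,
        },
      },
    },
    aggs: {
      by_value: {
        // A string missing key is acceptable because the field is now a keyword.
        terms: { field: runtimeField, missing: "__missing__" },
      },
    },
  };
}
```

For example, `termsOnRecastField("bytes", "bytes_as_keyword")` produces a body in which documents without `bytes` fall into the `__missing__` bucket. Note that the buckets are then keyed and sorted as strings rather than numbers.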

@ghudgins
Contributor

I don't believe there are plans to address this within Elasticsearch at this time. Closing this issue with the above workaround.


4 participants