Date histogram aggregation throws an exception when aggregated on an open date_range field. #54228

Closed
krlm opened this issue Mar 25, 2020 · 1 comment

krlm commented Mar 25, 2020

Elasticsearch version (bin/elasticsearch --version):
Version: 7.6.0, Build: default/docker/7f634e9f44834fbc12724506cc1da681b0c3b1e3/2020-02-06T00:09:00.449973Z, JVM: 13.0.2

Plugins installed: [kibana]

JVM version:
openjdk 13.0.2 2020-01-14
OpenJDK Runtime Environment AdoptOpenJDK (build 13.0.2+8)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 13.0.2+8, mixed mode, sharing)

OS version:
Linux 712aab0ffb37 4.19.76-linuxkit #1 SMP Thu Oct 17 19:31:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
Executing a date_histogram aggregation over a document with a date_range field that has an open end (neither lt nor lte has a value) results in an exception that is, at the very least, inappropriate:

CircuitBreakingException[[request] Data too large, data for [<reused_arrays>] would be [805344256/768mb], which is larger than the limit of [633785548/604.4mb]]

I'd expect that, in cases where the range is open, either the missing option can be used (as it can for the date type) or the values are counted in all buckets valid for their start date.

Steps to reproduce:
index:

PUT /events
{
  "mappings": {
    "dynamic": "false",
    "properties": { 
      "dateRange": {
        "type": "date_range"
      }
    }
  }
}

doc:

PUT /events/_doc/1
{
  "dateRange": {
    "gte": "2020-02-20"
  }
}

query:

POST /events/_search?pretty=true&error_trace=true
{
  "aggs": {
    "events-histogram": {
      "date_histogram": {
        "field": "dateRange",
        "calendar_interval": "day"
      }
    }
  },
  "size": 0
}

BTW, is it possible to work around this with a script? I've tried to play with it, but doc['dateRange'].value returns some obscure (i.e. "\u0001©p_䠀?\uDFBF\uDFFF\uDFBF\uDFC0") java.lang.String value and I'm a bit confused.

krlm changed the title from "Date histogram query throws an exception when aggregated on an open date range field." to "Date histogram aggregation throws an exception when aggregated on an open date_range field." on Mar 25, 2020
@polyfractal
Contributor

Hi @krlm! I'm going to close this as a duplicate of #50109. That said, this is the third time this issue has come up in the recent past, so I'll raise it at our next team meeting to see if we can fix it sooner rather than later. The current plan is to introduce a flag that allows extended_bounds to act as hard boundaries, truncating infinite ranges and avoiding this issue.
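
For illustration only, here is a rough Python sketch of why an open-ended range can exhaust memory in a daily histogram, and how truncating it at a hard boundary would help. It is not Elasticsearch code. It assumes that an open upper bound behaves like the largest signed 64-bit millisecond timestamp, and the names `utc_ms`, `daily_bucket_count`, `hard_min_ms`, and `hard_max_ms` are hypothetical helpers invented for this example.

```python
from datetime import datetime, timezone

MS_PER_DAY = 24 * 60 * 60 * 1000
# Assumption: an open upper bound acts like the largest signed 64-bit millisecond value.
OPEN_END_MS = 2**63 - 1


def utc_ms(year, month, day):
    """Milliseconds since the epoch for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp() * 1000)


def daily_bucket_count(gte_ms, lt_ms, hard_min_ms=None, hard_max_ms=None):
    """Count the UTC day buckets overlapped by [gte_ms, lt_ms).

    hard_min_ms / hard_max_ms are hypothetical clamps that truncate the range
    before counting, mimicking the idea of hard boundaries.
    """
    if hard_min_ms is not None:
        gte_ms = max(gte_ms, hard_min_ms)
    if hard_max_ms is not None:
        lt_ms = min(lt_ms, hard_max_ms)
    if lt_ms <= gte_ms:
        return 0
    first_day = gte_ms // MS_PER_DAY
    last_day = (lt_ms - 1) // MS_PER_DAY
    return last_day - first_day + 1


start = utc_ms(2020, 2, 20)  # the "gte" value from the reproduction document
unbounded = daily_bucket_count(start, OPEN_END_MS)
clamped = daily_bucket_count(start, OPEN_END_MS, hard_max_ms=utc_ms(2020, 3, 25))
print(unbounded, clamped)
```

Under that assumption the unclamped count is on the order of 10^11 daily buckets for a single document, while clamping at an arbitrary example date of 2020-03-25 leaves only a few dozen.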

> BTW, is it possible to work around this with a script? I've tried to play with it, but doc['dateRange'].value returns some obscure (i.e. "\u0001©p_䠀?\uDFBF\uDFFF\uDFBF\uDFC0") java.lang.String value and I'm a bit confused.

Unfortunately there's not a way to deal with ranges in scripts yet. What you're seeing with that obscure value is essentially the way we store the ranges internally, in a binary format. Scripts don't presently know how to deal with it properly, and so just dump to an unusable string. Adding scripting support to ranges is also on the todo list, but that'll take longer because it's a bit more complicated.

Thanks for raising the bug report, please follow #50109 for updates!
