More flexible script query #21479

nik9000 · 2016-11-10T20:14:07Z

Now that Kibana supports painless, some strange things have come up around filtering. This is a proposal to implement a thing in Elasticsearch so Kibana doesn't have to do upsetting textual manipulation to user generated script fields.

Background: Kibana lets users configure script fields once and reuse them. The specific usage that is troubling is filtering which Kibana implements by creating a query like this:

                "script" : {
                    "script" : {
                        "inline" : "($scriptText) == params.value",
                        "lang"   : "painless",
                        "params" : {
                            "value" : "whatever the user picked"
                        }
                    }
                }

Specifically, the ($scriptText) == params.value part is scary because script fields that are perfectly valid painless most places are not valid in this script context. Scripts that contain return, ;, or variable declarations are in trouble with a construct like this.

I propose a new version of the script query that'd look like this:

                "script" : {
                    "script" : {
                        "inline"  : $scriptText,
                        "lang"    : "painless",
                        "matches" : "whatever the user picked"
                    }
                }

Instead of casting the return of the script to a boolean we cast it to whatever type is sent in the matches field and then check for equality.

The text was updated successfully, but these errors were encountered:

nik9000 · 2016-11-10T20:19:21Z

@clintongormley I'd like to fast track this a bit because I feel like doing text manipulation on the query is a bit fraught.

I also figure that if Kibana does this thing someone else might be doing it as well so it is likely worth fixing in a nice way.

I certainly don't want to rush it though.

I also think it is a thing we could build with our "contextless" scripts today and we could improve with the context work @jdconrad is doing. With "contextless" scripts we'd do a checked on the result from the script before comparing. With script context we'd force the return types to match so the script engine'd make the user perform the appropriate casts and give compile errors if they don't.

Bargs · 2016-11-10T20:36:16Z

I chatted with @nik9000 about this and agreed it would be nice to have first class support for this in the ES API.

We could work around this issue in Kibana by wrapping the user's script in a function, but as Nik said such textual manipulation is fragile. For one it would only work in painless, or we would have to have to hardcode the syntax for functions in every possible language the user might have enabled. We already do that in some cases but it's hacky and a nice side effect of Nik's proposal is that it would allow us to remove those existing conditionals from the code.

Nik also mentioned that the error reporting would be much worse if we wrapped the script in a function.

jdconrad · 2016-11-10T22:34:41Z

Hey @nik9000, would you mind clarifying what you mind by the following: "Specifically, the ($scriptText) == params.value part is scary because script fields that are perfectly valid painless most places are not valid in this script context." What script context is this, and what makes it invalid? Thanks.

nik9000 · 2016-11-10T22:38:15Z

What script context is this, and what makes it invalid?

Say you have a script that looks like return 1;. It is valid painless. But when Kibana builds it as a filter you get a script that looks like (return 1;) == params.value which is all kinds of not going to work.

My point is that you can't do string manipulation to turn a painless script that returns a number into one that compares that number to another number. Sure, it works sometimes, but in general it doesn't. And it is super confusing.

jdconrad · 2016-11-10T22:39:29Z

Got it. Thanks!

rjernst · 2016-11-10T23:17:12Z

Kibana lets users configure script fields once and reuse them

This sounds like stored scripts. Why aren't they using that?

the ($scriptText) == params.value part is scary because script fields that are perfectly valid painless most places are not valid in this script context

But they will return a compile error if the user has something bogus there like return 1;, so what is the problem?

Bargs · 2016-11-11T00:13:25Z

This sounds like stored scripts. Why aren't they using that?

Not sure about that since I wasn't around when kibana scripted fields were implemented but my guess is that either stored scripts didn't exist at the time, or it was simply easier to store the script in .kibana since we needed to maintain a master list of kibana scripted fields and metadata about each one anyway.

But they will return a compile error if the user has something bogus there like return 1;, so what is the problem?

Think of Kibana scripted fields as a separate, higher level of abstraction than scripted fields in Elasticsearch. The user writes a single script, gives it a name, and we display it alongside all the regular fields anywhere they appear in the UI. From the Kibana user's perspective it's almost like a real field - they can search on it, filter on it, display its values, and aggregate on it. Under the hood we're taking that one script they wrote and using it in a bunch of different APIs, along with a little of the above mentioned magical text manipulation to get filtering to work. That worked fine when we only allowed lucene expressions.

Now with a painless (or groovy, etc) script that's more than a single expression, the user is going to get a nasty compile error when they click the little filter icon in the UI. Users expect that functionality to work and we can't limit them to single expression scripts because that would seriously reduce the benefits of having painless in the first place.

rjernst · 2016-11-11T00:18:45Z

Then it should be done as a function in the generated wrapper script. I don't think we should complicate the script apis for this.

jdconrad · 2016-11-11T00:29:52Z

Trying to reuse a script across multiple contexts doesn't really make sense in a lot of cases. I know this is a bummer for our users, but a user should understand what each script does in each context. Though some scripts may be similar, it makes sense to have one per.

nik9000 · 2016-11-11T02:33:22Z

Trying to reuse a script across multiple contexts doesn't really make sense in a lot of cases

Kibana has been doing this for quite a while. It works well in aggregations and script fields, I believe. They do it for filters too but they use this string manipulation hack. Those three cases have fairly similar contexts so long as you stick to doc values.

I want to give Kibana some way so that it can stop this brittle textual manipulation of the scripts. It make filters super brittle with painless. I don't want Kibana to go deeper down the textual manipulation of user script hole because we'll end up with some twisty stuff and it still won't work well.

@rjernst, what do you mean by "generated wrapper script"? Like a script that delegates to another script for one of it's inputs? The sounds fairly complex but might end up being a better solution. Something like:

                "script" : {
                    "script" : {
                        "inline" : "params.dv == params.value",
                        "lang"   : "painless",
                        "params" : {
                            "dv" : {
                              "script": {
                                "inline": $scriptText,
                                "lang": "painless"
                              }
                            },
                            "value" : "whatever the user picked"
                        }
                    }
                }

That'd solve the brittleness problem and be fairly interesting from a compostability perspective. It seems like a fairly heavy, abstract solution to this problem but it'd work. I'm not sure how I feel about it.

jdconrad · 2016-11-11T02:54:58Z

This is exactly the situation I'm talking about. They're using a script here where it doesn't make sense. The user should make a separate script for this that is expected to return a boolean. The value can be passed in automatically as a parameter.

Bargs · 2016-11-14T23:54:17Z

Asking the user to copy paste their script with a slight modification for filtering isn't a viable solution IMO. It'll lead to subtle copy paste errors. It also isn't scalable. What happens when there's another slightly different context we need to support? We ask them to copy/paste 3 times, then 4 times, etc? Not a good user experience.

I've also been playing with the idea of a function wrapper workaround as mentioned in my previous comment and I don't think it's going to work. As far as I can tell Painless requires functions to have return type declarations, and we can't infer that in Kibana so we can't auto generate a wrapper function for the user's script.

I'm not married to the original solution proposed in this ticket but I think we need some way to take a single script which produces a return value and apply it in multiple contexts. From the user's perspective there shouldn't be a difference between a script that has a single expression and one that has some logic and a return statement.

nik9000 · 2016-11-15T17:32:35Z

I talked with @jdconrad about this and I convinced him that the places where Kibana is reusing the same script are similar enough that it is pretty reasonable to reuse the scripts. Places like the script field in the terms aggregation. The script filter is the most weird place where the scripts are reused because it returns the wrong thing.

We want to remove the need for the brittle text manipulation but we aren't yet sure of the nicest way to do it.

Multi-statement scripts are currently incompatible with the way we create filters on scripted fields. We're investigating the possibility of enhancing elasticsearch to fix this issue, but we may have to fallback on showing users a warning if we can't reach an agreement in the near term. See elastic#9024 Related elastic/elasticsearch#21479

Bargs · 2016-11-15T23:15:10Z

It might be cool if we could treat the user's script as a named function, so that we could create the filter like so:

"script" : {
    "script" : {
        "inline" : "userScript(doc) == params.value",
        "lang"   : "painless",
        "params" : {
            "value" : "whatever the user picked"
        }
    }
}

That might give us a lot of flexibility to use the provided user script in interesting ways down the road as well. It would be super swell if it didn't matter what language the user script is written in, but without knowing how this is all implemented I'm probably just spewing crazy talk.

s1monw · 2016-11-16T08:16:30Z

I might miss a thing but would it make sense to specialize that particular usercase userScript(doc) == params.value in ScriptQuery? we already have this:

  leafScript.setDocument(doc);
Object val = leafScript.run();
if(val instanceof Number) {
  return ((Number) val).longValue() != 0;
}

if this is the usecase we can maybe turn it into:

  leafScript.setDocument(doc);
Object val = leafScript.run();
if(val instanceof Number) {
  return ((Number) val).longValue() == userSpecified; // we gotta make it a bit more complicated by bascially that's it...
}

then we can just use the script and compare it internally without any API being more complicated?

nik9000 · 2016-11-16T15:08:26Z

@s1monw I think that fairly similar to what I proposed first....

jpountz · 2016-11-16T15:44:15Z

I don't feel good about this feature: making it easy to match scripts against exact values feels wrong to me since it means that the data was not indexed correctly. I think it is fine for one-time usage, but here we are discussing the ability to search on reusable scripts, which means the user should have an index pipeline that creates a proper field based on this script so that querying it is fast.

I think it's great to have the script query exposed in Kibana for these one-time queries that one did not think about at index time. But I don't think we should aim at making it easy to wire existing reusable scripts and the script query. This looks like an anti-pattern to me.

nik9000 · 2016-11-16T16:50:05Z

I don't feel good about this feature: making it easy to match scripts against exact values feels wrong to me since it means that the data was not indexed correctly.

Indeed the fact that this is just a couple of clicks in Kibana is a bit troubling. It isn't obvious while you are doing it that this might have non-trivial performance cost. Part of what makes this not so bad in Kibana is that to get to the page where you make the selection we've already shown you an aggregation across the entire date range so the user has already waited once. And we've potentially already lit up the disk cache for the field.

alexbrasetvik · 2016-11-17T06:47:45Z

Seconding @jpountz here, this looks like an easy way to make a seemingly innocuous query become super-expensive. In my experience, scripts are a common source of unexpected (to the novice user) CPU-cost, as a user doesn't necessarily think of the massive cost difference in execution of {"term": {"foo": "bar"}} vs. "doc['foo'].value == params.value". If this happens behind the scenes in Kibana, it's even less likely that the user considers (or is even aware) of this.

clintongormley · 2016-11-17T08:08:12Z

Another issue here is that script fields (will) run in a different context to script queries: scripted fields should have access to the _source, while script queries should only have access to doc[], so there is no guarantee that a scripted field would work as a script query in the future.

Multi-statement scripts are currently incompatible with the way we create filters on scripted fields. We're investigating the possibility of enhancing elasticsearch to fix this issue, but we may have to fallback on showing users a warning if we can't reach an agreement in the near term. See #9024 Related elastic/elasticsearch#21479

Backports PR #9090 **Commit 1:** Add a note on multi-statement scripted field limitation Multi-statement scripts are currently incompatible with the way we create filters on scripted fields. We're investigating the possibility of enhancing elasticsearch to fix this issue, but we may have to fallback on showing users a warning if we can't reach an agreement in the near term. See #9024 Related elastic/elasticsearch#21479 * Original sha: b4aaed8 * Authored by Matthew Bargar <[email protected]> on 2016-11-15T22:47:26Z

nik9000 · 2016-11-21T20:25:51Z

After an over video discussion we came the to the conclusion that #21635 is good enough for Kibana and hopefully should be good enough for other users. This is an example of doing a comparison. You use text manipulation to wrap the scripted field in a lambda and run the comparison. This isn't perfect - if the script you are wrapping defines a function then it'll fail, but this is much simpler on our side than any of the other proposed solutions and it has the advantage of making it difficult to do a potentially slow thing. And that is kind of a good thing.

Backports PR elastic#9090 **Commit 1:** Add a note on multi-statement scripted field limitation Multi-statement scripts are currently incompatible with the way we create filters on scripted fields. We're investigating the possibility of enhancing elasticsearch to fix this issue, but we may have to fallback on showing users a warning if we can't reach an agreement in the near term. See elastic#9024 Related elastic/elasticsearch#21479 * Original sha: b4aaed8 * Authored by Matthew Bargar <[email protected]> on 2016-11-15T22:47:26Z Former-commit-id: 24840a9

nik9000 added :Query DSL discuss >enhancement labels Nov 10, 2016

Bargs mentioned this issue Nov 15, 2016

Add a note on multi-statement scripted field limitation elastic/kibana#9090

Merged

elastic-jasper mentioned this issue Nov 17, 2016

[5.x] Add a note on multi-statement scripted field limitation elastic/kibana#9121

Merged

elastic-jasper mentioned this issue Nov 17, 2016

[5.0] Add a note on multi-statement scripted field limitation elastic/kibana#9122

Merged

jdconrad added the :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache label Nov 17, 2016

jdconrad mentioned this issue Nov 17, 2016

Fix Lambdas in Painless to be Able to Use Top-Level Variables Such as params and doc #21635

Merged

nik9000 removed the discuss label Nov 21, 2016

nik9000 closed this as completed Nov 21, 2016

clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Query DSL labels Feb 14, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More flexible script query #21479

More flexible script query #21479

nik9000 commented Nov 10, 2016

nik9000 commented Nov 10, 2016

Bargs commented Nov 10, 2016

jdconrad commented Nov 10, 2016 •

edited

Loading

nik9000 commented Nov 10, 2016

jdconrad commented Nov 10, 2016

rjernst commented Nov 10, 2016

Bargs commented Nov 11, 2016

rjernst commented Nov 11, 2016

jdconrad commented Nov 11, 2016 •

edited

Loading

nik9000 commented Nov 11, 2016

jdconrad commented Nov 11, 2016 •

edited

Loading

Bargs commented Nov 14, 2016

nik9000 commented Nov 15, 2016

Bargs commented Nov 15, 2016

s1monw commented Nov 16, 2016

nik9000 commented Nov 16, 2016

jpountz commented Nov 16, 2016

nik9000 commented Nov 16, 2016

alexbrasetvik commented Nov 17, 2016

clintongormley commented Nov 17, 2016

nik9000 commented Nov 21, 2016

More flexible script query #21479

More flexible script query #21479

Comments

nik9000 commented Nov 10, 2016

nik9000 commented Nov 10, 2016

Bargs commented Nov 10, 2016

jdconrad commented Nov 10, 2016 • edited Loading

nik9000 commented Nov 10, 2016

jdconrad commented Nov 10, 2016

rjernst commented Nov 10, 2016

Bargs commented Nov 11, 2016

rjernst commented Nov 11, 2016

jdconrad commented Nov 11, 2016 • edited Loading

nik9000 commented Nov 11, 2016

jdconrad commented Nov 11, 2016 • edited Loading

Bargs commented Nov 14, 2016

nik9000 commented Nov 15, 2016

Bargs commented Nov 15, 2016

s1monw commented Nov 16, 2016

nik9000 commented Nov 16, 2016

jpountz commented Nov 16, 2016

nik9000 commented Nov 16, 2016

alexbrasetvik commented Nov 17, 2016

clintongormley commented Nov 17, 2016

nik9000 commented Nov 21, 2016

jdconrad commented Nov 10, 2016 •

edited

Loading

jdconrad commented Nov 11, 2016 •

edited

Loading

jdconrad commented Nov 11, 2016 •

edited

Loading