Query expression language support #17
Comments
It just occurred to me that in item queries like
This raises a question of what JSON equality is. The basic types are fairly straightforward, but the complex types are more... complex. JSON.org states that arrays are ordered and objects are key/value pairs, but not necessarily ordered. |
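One possible reading of JSON equality, following the JSON.org description quoted above (arrays ordered, objects unordered), is sketched below. The function name `json_equal` is hypothetical, and treating `1` and `1.0` as equal is an assumption rather than something settled in this thread.

```python
def json_equal(a, b):
    """Deep JSON equality: arrays compare in order, objects compare by key."""
    # bool is a subclass of int in Python, so keep true/false apart from 1/0.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    # Assumption: JSON has a single number type, so 1 and 1.0 are equal.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if isinstance(a, list) and isinstance(b, list):
        # Arrays are ordered: same length and pairwise-equal elements.
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        # Objects are unordered: same key set and equal value for each key.
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    if type(a) is not type(b):
        return False
    return a == b  # strings and null


print(json_equal({"a": 1, "b": [1, 2]}, {"b": [1, 2], "a": 1}))  # key order ignored
print(json_equal([1, 2], [2, 1]))  # element order matters
```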
oops. |
Also probably needs term grouping, i.e. |
Given that I think it makes sense to confine the scope to the innermost context. This raises another question around how to reference the current item of any intermediate contexts. |
Another option is to omit query expressions (but keep filters). I suspect there's not much consensus around query expressions or use of them for that matter. |
Query expressions are what you're calling "filters." What are you suggesting we remove? Without expression support, you can't do something like Also pertains to #16 (comment). You can't have the above if Furthermore, I led this issue with
His post explicitly calls them expressions, too:
Expression support for these kinds of query is integral to JSON Path. You can't just not have it. Half of your query syntax goes away if you don't. |
I think I've not really looked at "container queries" using script expressions because they were so vaguely defined by Gössner. I think it's a good topic for the Working Group to look into. |
Those equivalencies are correct, but without expression support, you can't parse the I think expressions provide the end user (the path author) a better tool. |
I'm thinking of examples like
We don't know what people will want to do with expressions, and I don't like the idea of limiting our support to the simple examples in Goessner's post. The last example really becomes important if we specify that the location of the value can also be returned. |
For me, filters |
I can't say much about usage in the wild, but container queries can do things like Usage stats probably aren't something we're likely to obtain accurate numbers on, either. |
That example is pretty compelling. Someone out there must be using it... |
A need for expression support in the wild: Get root element using jsonpath based on sub elements condition |
The path that I suggested in that SO question is The reasoning behind this is the requirement that paths in expressions should only return a single value. However To that end, alongside Extending this, it may make sense to have all reserved words carry a parameter list, even if that list is empty. For example, (It looks like the Java implementation does this.) |
The Working Group will need to decide whether to extend the syntax or standardise only what's already in use. |
I've been thinking about the container expression syntax If we take I definitely wouldn't suggest this as a feature for the first draft, but it's something to consider. |
Survey of script expression support
The Script expression comparison shows some implementations which support script expressions, or container queries, (
*: script expressions are not clearly defined
In addition, JMESPath (an alternative to JSONPath) includes an interesting set of built-in functions which are probably candidates for including in script expressions. |
Both of my implementations in dotnet support expressions: Manatee.Json & JsonPath.Net |
Just logging another question about expressions in the wild about testing for the absence of a property. Their go-to attempt was to use a |
This is why we should clearly define the expected support. It should be worded so that implementors may augment the syntax as well (while also providing documentation that such support is non-standard and may not be compatible with other systems). |
Coming from a strongly typed language, this does appeal to me, but I'm biased. Favor language agnosticity. |
I raised #70 to cover the details of regular expressions in filters. |
How do we expect filters to work with the following JSON input: [{"k": "1.0"}, {"k": 2}] and JSONPaths such as When using [ ] When using [
{
"k" : 2
}
] While goessner when using $[?(@.k == 1)] [
{
"k" : "1.0"
}
] I think that |
I think Goessner is right.
It's supported in implementations like Goessner that use Javascript for a scripting language. But not in Jayway, which implements its own evaluator. It would look foreign to users coming from a strongly typed background. But with numbers, keep in mind that 1.0 and 1 have the same JSON type. I prefer the JMESPath approach, which only has "==", but defines type requirements for comparators such as equality operators. |
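To make the difference concrete, here is a rough sketch of the two comparison behaviours applied to the input [{"k": "1.0"}, {"k": 2}] and the filter $[?(@.k == 1)]. The helper names (`strict_eq`, `loose_eq`, `select`) are hypothetical, and `loose_eq` only imitates the string-to-number coercion of a JavaScript-style `==`; it is not a full model of any implementation.

```python
def strict_eq(left, right):
    """Type-aware equality: a string never equals a number."""
    if isinstance(left, bool) or isinstance(right, bool):
        return isinstance(left, bool) and isinstance(right, bool) and left == right
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left == right  # 1 and 1.0 share the JSON number type
    return type(left) is type(right) and left == right


def loose_eq(left, right):
    """Rough imitation of a coercing ==, for the string/number case only."""
    def to_number(value):
        if isinstance(value, str):
            try:
                return float(value)
            except ValueError:
                return None
        return value

    # Coerce only when exactly one side is a string.
    if isinstance(left, str) != isinstance(right, str):
        left, right = to_number(left), to_number(right)
        if left is None or right is None:
            return False
    return strict_eq(left, right)


def select(items, key, expected, equals):
    """Stand-in for $[?(@.key == expected)] using the given equality."""
    return [item for item in items if key in item and equals(item[key], expected)]


data = [{"k": "1.0"}, {"k": 2}]
print(select(data, "k", 1, strict_eq))  # type-aware: nothing matches
print(select(data, "k", 1, loose_eq))   # coercing: the string "1.0" matches
```

Under the type-aware rule the string "1.0" is never equal to the number 1, which is the behaviour a path author from a strongly typed background would likely expect.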
I'm leaving here a further comment as a reminder that scope for @ needs to be defined, as discussed in #75. |
early discussion that helped in working out the terms. |
In his post, Goessner indicates that for [(...)] ("container query") and [?(...)] ("item query"), the contained expression should use "the underlying script engine."
This presents a problem for consistency and interoperability between systems. A JSON Path written with a Javascript expression won't work when evaluated using an implementation written in PHP or .Net.
To address this, we should either define that the scripting language is something well-known (e.g. ECMAScript 2015 or C), or we should define our own language (a domain-specific language, or DSL).
Proposal
I like the idea of a simple DSL, and this proposal outlines the rules around such a language.
Exploring the data with @
JSON elements can be explored using JSON Path within the expression, and the values that are returned can be compared using simple comparison operators.
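As a minimal sketch of these semantics, the code below models two kinds of item query: one that asks whether a path yields a value at all, and one that compares the value it yields. All names (`MISSING`, `child`, `exists`, `less_than`, `apply_filter`) and the sample data are hypothetical, and treating a non-numeric value as "no match" is an assumption.

```python
MISSING = object()  # sentinel meaning "the path selected nothing"


def child(item, name):
    """Evaluate @.name (or @['name']) against the current item."""
    if isinstance(item, dict) and name in item:
        return item[name]
    return MISSING


def exists(name):
    """Predicate for ?(@.name): true when the path returns a value."""
    return lambda item: child(item, name) is not MISSING


def less_than(name, limit):
    """Predicate for ?(@.name<limit): the path returns a value below limit."""
    def test(item):
        value = child(item, name)
        # Assumption: missing or non-numeric values simply do not match.
        if value is MISSING or isinstance(value, bool):
            return False
        if not isinstance(value, (int, float)):
            return False
        return value < limit
    return test


def apply_filter(items, predicate):
    """Keep the array elements for which the expression holds."""
    return [item for item in items if predicate(item)]


books = [
    {"title": "A", "price": 8.95},
    {"title": "B", "price": 22.99, "isbn": "0-000-00000-0"},
    {"title": "C", "isbn": "0-000-00000-1"},
]
print([b["title"] for b in apply_filter(books, less_than("price", 10))])
print([b["title"] for b in apply_filter(books, exists("isbn"))])
```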
This alone enables expressions like ?(@.price<10) and ?(@.isbn). These state "the path @.price returns a value and that value is less than 10," and "the path @.isbn returns a value," respectively. And because these are just JSON Paths, implementations will already have the parsing logic for them. Further, it means that indexer syntax for property names will work, so that ?(@['price']<10) and ?(@['isbn']) will also work.
Operators
The basic comparison, mathematical, and boolean operators should be sufficient, at least for the initial revision of the specification. I expect most programmers will be familiar with the C-style operators, so I propose we use those.
I'm open to alternatives, but this gives us a good grounding.
Perhaps we can just use single =, &, and | instead of the doubles? That might open up a ^ for an XOR or a !& for a NAND. (!| for NOR is going to be fun to read.)
Reserved words
I'd also like to propose that we define a number of reserved properties, like length. This enables functionality like in the container query syntax example (@.length-1). If the user wants to reference an object property named length, they would need to specifically use the indexer syntax (@['length'] - 1) which would fetch the value from the length property of an object, subtract one, and return the result.
Additional reserved words that we may need are open for proposal/discussion.
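The distinction between the reserved word and the indexer syntax might look like the following sketch. The `RESERVED` table and the `dot` and `index` helpers are hypothetical names, and applying `length` to both arrays and objects (as a count of child items) is an assumption drawn from the open question in this proposal.

```python
# Assumption: reserved words map to functions over arrays and objects.
RESERVED = {"length": len}


def dot(value, name):
    """@.name: a reserved word takes precedence over a member of that name."""
    if name in RESERVED and isinstance(value, (list, dict)):
        return RESERVED[name](value)
    return value[name]


def index(value, name):
    """@['name']: always a plain member lookup, never a reserved word."""
    return value[name]


array = [10, 20, 30]
last = array[dot(array, "length") - 1]  # container query (@.length-1)
print(last)

box = {"length": 5, "width": 2}
print(dot(box, "length"))        # reserved word: number of members in the object
print(index(box, "length") - 1)  # (@['length'] - 1): the member's value minus one
```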
Open question: Does this enable these reserved keywords outside of the context of a query expression, i.e. does $.length give the number of child items of the root value?
Backtracking?
There aren't any examples in Goessner's post, but it might be desirable to navigate up the JSON structure to fetch a value to be used in an expression.
For example (with Goessner's example data), suppose I wanted to find the books which cost less than the bike. There isn't a way to get from iterating over the book array to outside of the book array where the bicycle is.
Alternatively, I could use the root operator $ to start from the beginning within the expression and do something like $..book[?(@.price<$..bicycle.price)], so maybe this isn't explicitly needed for now.
Restrictions
For these cases, I think it makes sense to require that these internal paths MUST only return a single value. Returning multiple values should remain an explicitly undefined behavior (allows the implementation to decide how to handle it). (However doing something like ?(@..price.length>4) to get "are there more than 4 objects that contain a price property?" does make sense. The full path still just returns a single value, even though the @..price portion returns multiple.)
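The root-anchored comparison and the single-value restriction can be sketched together. The data below is a trimmed, illustrative version of the bookstore example, and `descendants` and `single` are hypothetical helpers. Raising an error when a path yields several values is just one choice an implementation could make, since the proposal leaves that case undefined.

```python
def descendants(value, name):
    """Collect every member called `name` at any depth, like ..name."""
    found = []
    if isinstance(value, dict):
        if name in value:
            found.append(value[name])
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        children = []
    for item in children:
        found.extend(descendants(item, name))
    return found


def single(values):
    """Enforce the restriction: an inner path must yield exactly one value."""
    if len(values) != 1:
        # One possible choice; the proposal leaves this case undefined.
        raise ValueError(f"expected exactly one value, got {len(values)}")
    return values[0]


store = {
    "store": {
        "book": [
            {"title": "A", "price": 8.95},
            {"title": "B", "price": 22.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}

# Books cheaper than the bicycle: the inner path starts again from the root.
limit = single(descendants(store, "bicycle"))["price"]
books = single(descendants(store, "book"))
cheaper = [b["title"] for b in books if "price" in b and b["price"] < limit]
print(cheaper)

# ?(@..price.length>4): the path matches many values, but length is one value.
price_count = len(descendants(store, "price"))
print(price_count > 4)
```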