JEP-11 Lexical Scoping #32
I have an implementation in Python. Also, I think I still have that script to convert the test JSON files to YAML. Do you care to write up the rest of the YAML?
Sure. And I will add a reference to the pull request in the test repository.
96b1d5c to 427423b
I converted the JSON to YAML, stubbing the let.yml file. I have also rebased the branch off of #69.
427423b to 2f2cd86
Thank you for the legwork. I did not find how to have at least the first example displayed before the "show all". I noticed the examples for function. I have also pushed a pull request to use a monospace font instead, in an attempt at making it easier to read.
The examples will display before "show all" if the input data is shorter than 60 characters. We might need to add some trivial examples just for that purpose.
3779239 to 00691ef
Lexical Scoping
Abstract
This JEP proposes a new function `let()` (originally proposed by Michael Dowling) that allows for evaluating an expression with an explicitly defined
lexical scope. This will require some changes to the lookup semantics in
JMESPath to introduce scoping, but provides useful functionality such as being
able to refer to elements defined outside of the current scope used to evaluate
an expression.
Motivation
As a JMESPath expression is being evaluated, the current element, which can be
explicitly referred to via the `@` token, changes as expressions are evaluated.
Given a simple sub expression such as `foo.bar`, first the `foo` expression is
evaluated with the starting input JSON document, and the result of that
expression is then used as the current element when the `bar` element is
evaluated. Conceptually we’re taking some object, and narrowing down
its current element as the expression is evaluated.
Once we’ve drilled down to a specific current element, there is no way, in the
context of the currently evaluated expression, to refer to any elements outside
of that element. One scenario where this is problematic is being able to refer
to a parent element.
For example, suppose we had this data:
Let’s say we wanted to get the list of cities of the state corresponding to our
`first_choice` key. We’ll make the assumption that the state names are unique
in the `states` list. This is currently not possible with JMESPath.
In this example we can hard code the state `WA`, but it is not possible to base
this on a value of `first_choice`, which comes from the parent element. This JEP
proposes a solution that makes this possible in JMESPath.
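To make the limitation concrete, here is a minimal Python sketch. The sample document is an assumption built around the names used above (`first_choice`, `states`, `WA`); the `name` and `cities` keys and the second state entry are hypothetical.

```python
# Hypothetical sample document: only `first_choice`, `states` and "WA" are
# named in the text, the remaining keys and values are assumed.
data = {
    "first_choice": "WA",
    "states": [
        {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
        {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
    ],
}


def cities_hard_coded(doc):
    # Comparable to filtering `states` against the literal "WA".
    return [city for state in doc["states"] if state["name"] == "WA"
            for city in state["cities"]]


def cities_for_first_choice(doc):
    # Plain Python can keep the parent value in a variable; a filter over
    # `states` has no such handle on `first_choice` once it has drilled down.
    choice = doc["first_choice"]
    return [city for state in doc["states"] if state["name"] == choice
            for city in state["cities"]]


print(cities_hard_coded(data))        # ['Seattle', 'Bellevue', 'Olympia']
print(cities_for_first_choice(data))  # ['Seattle', 'Bellevue', 'Olympia']
```

The second function is the behaviour that the proposal aims to make expressible in JMESPath itself.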
Specification
There are two components to this JEP, a new function, `let()`, and a change to the way that identifiers are resolved.
The let() Function
The `let()` function is heavily inspired by the `let` function commonly seen in the Lisp family of languages:
https://clojuredocs.org/clojure.core/let
http://docs.racket-lang.org/guide/let.html
The let function is defined as follows:
`let` is a function that takes two arguments. The first argument is a JSON
object. This hash defines the names and their corresponding values that will
be accessible to the expression specified in the second argument. The second
argument is an expression reference that will be evaluated.
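A minimal Python sketch of these semantics, not a reference implementation. It assumes an expression reference is modelled as a callable that receives the current context and a tuple of scopes; the names `Scopes` and `ExpRef`, the extra `context` and `scopes` parameters, and the `TypeError` on a non-object first argument are illustrative choices.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed representation: scopes are kept outermost-first in a tuple, and an
# expression reference is a callable of (current_context, scopes).
Scopes = Tuple[Dict[str, Any], ...]
ExpRef = Callable[[Any, Scopes], Any]


def let(scope: Dict[str, Any], expref: ExpRef,
        context: Any, scopes: Scopes = ()) -> Any:
    """Evaluate `expref` with `scope` pushed on top of the existing scopes."""
    if not isinstance(scope, dict):
        raise TypeError("let(): first argument must be a JSON object")
    # The current context is passed through unchanged; only the scope grows.
    return expref(context, scopes + (scope,))


# This expression reference simply reports what it was given.
seen = let({"a": "x"}, lambda ctx, scopes: (ctx, scopes), {"b": "y"})
print(seen)  # ({'b': 'y'}, ({'a': 'x'},))
```

Passing the scopes explicitly keeps the sketch free of global state; an implementation could equally keep them on the interpreter object.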
Resolving Identifiers
Prior to this JEP, identifiers are resolved by consulting the current context
in which the expression is evaluated. For example, using the same `search`
function as defined in the JMESPath specification, the evaluation of:
will result in the `foo` identifier being resolved in the context of
the input object `{"foo": "a", "bar": "b"}`. The context object defines `foo`
as `a`, which results in the identifier `foo` being resolved as `a`.
In the case of a sub expression, where the current evaluation context
changes once the left hand side of the sub expression is evaluated:
The identifier `b` is resolved with a current context of `{"b": "y"}`, which results in a value of `y`.
This JEP adds an additional step to resolving identifiers. In addition
to the implicit evaluation context that changes based on the result
of continually evaluating expressions, the `let()` command allows
for additional contexts to be specified, which we refer to by the common
name scope. The steps for resolving an identifier are:
Attempt to look up the identifier in the current evaluation context.
If this identifier is not resolved, look up the value in the current scope provided by the user.
If the identifier is not resolved and there is a parent scope, attempt to resolve the identifier in the parent scope. Continue doing this until there is no parent scope, in which case, if the identifier has not been resolved, the identifier is resolved as `null`.
Parent scopes are created by nested `let()` calls.
Below are a few examples to make this more clear. First, let’s
examine the case where the identifier can be resolved from the
current evaluation context:
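A minimal Python sketch of the lookup order, applied to this case and to the two that follow. The helper names `resolve`, `identifier` and `let` are hypothetical, expression references are modelled as callables, and treating a key that is present in the context as resolved (whatever its value) is an assumption of this sketch.

```python
def resolve(name, context, scopes):
    """Look up `name` in the context, then in scopes from innermost outwards."""
    if isinstance(context, dict) and name in context:
        return context[name]
    for scope in reversed(scopes):  # the last scope pushed is the current one
        if name in scope:
            return scope[name]
    return None  # no parent scope left: the identifier resolves as null


def identifier(name):
    """Build an expression reference for a bare identifier."""
    return lambda context, scopes: resolve(name, context, scopes)


def let(scope, expref, context, scopes=()):
    """Evaluate `expref` with `scope` as the new innermost scope."""
    return expref(context, scopes + (scope,))


# Scope {"a": "x"}, expression `b`, input {"b": "y"}: found in the context.
print(let({"a": "x"}, identifier("b"), {"b": "y"}))  # y
# Scope {"a": "x"}, expression `a`, input {"b": "y"}: falls back to the scope.
print(let({"a": "x"}, identifier("a"), {"b": "y"}))  # x
# Without let(), `a` is not defined anywhere.
print(resolve("a", {"b": "y"}, ()))  # None
```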
In this scenario, we are evaluating the expression `b`, with the
context object of `{"b": "y"}`. Here `b` has a value of `y`,
so the result of this function is `y`.
Now let’s look at an example where an identifier is resolved from
a scope object provided via `let()`:
Here, we’re trying to resolve the `a` identifier. The current
evaluation context, `{"b": "y"}`, does not define `a`. Normally,
this would result in the identifier being resolved as `null`:
However, we now fall back to looking in the provided scope object
`{"a": "x"}`, which was provided as the first argument to `let`. Note here that
`a` has a value of `"x"`, so the identifier is resolved as `"x"`, and the
return value of the `let()` function is `"x"`.
Finally, let’s look at an example of parent scopes. Consider the
following expression:
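A minimal Python sketch of such a nested call, assuming an outer scope of `{"a": "x"}`, an inner scope of `{"b": "y"}` and an input document of `{"c": "z"}`. The helpers `resolve`, `let` and `multiselect_hash` are hypothetical names, and expression references are modelled as callables.

```python
def resolve(name, context, scopes):
    """Look up `name` in the context, then in scopes from innermost outwards."""
    if isinstance(context, dict) and name in context:
        return context[name]
    for scope in reversed(scopes):
        if name in scope:
            return scope[name]
    return None


def let(scope, expref, context, scopes=()):
    """Evaluate `expref` with `scope` as the new innermost scope."""
    return expref(context, scopes + (scope,))


def multiselect_hash(names):
    """Expression reference for a hash such as {a: a, b: b, c: c}."""
    return lambda context, scopes: {n: resolve(n, context, scopes) for n in names}


def inner(context, scopes):
    # Second let(): pushes {"b": "y"}; the outer scope becomes its parent.
    return let({"b": "y"}, multiselect_hash(["a", "b", "c"]), context, scopes)


# First let(): pushes {"a": "x"} and evaluates the inner let() call.
result = let({"a": "x"}, inner, {"c": "z"})
print(result)  # {'a': 'x', 'b': 'y', 'c': 'z'}
```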
Here we have nested let calls, and the expression we are trying to
evaluate is the multiselect hash `{a: a, b: b, c: c}`. The `c`
identifier comes from the evaluation context `{"c": "z"}`.
The `b` identifier comes from the scope object in the second `let`
call: ``{b: `y`}``. And finally, here’s the lookup process for the `a` identifier:
Is `a` defined in the current evaluation context? No.
Is `a` defined in the scope provided by the user? No.
Is there a parent scope? Yes.
Does the parent scope, ``{a: `x`}``, define `a`? Yes, `a` has the value of `"x"`, so `a` is resolved as the string `"x"`.
Current Node Evaluation
While the JMESPath specification defines how the current node is determined,
it is worth explicitly calling out how this works with the `let()` function
and expression references. Consider the following expression:
Given the input data:
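A minimal Python sketch of this situation. It assumes an expression shaped like `a.let(..., &b.let(..., &c))` and an input document of `{"a": {"b": {"c": "foo"}}}`; the scope contents and the helper names `resolve`, `let` and `sub` are hypothetical.

```python
def resolve(name, context, scopes):
    """Look up `name` in the context, then in scopes from innermost outwards."""
    if isinstance(context, dict) and name in context:
        return context[name]
    for scope in reversed(scopes):
        if name in scope:
            return scope[name]
    return None


def let(scope, expref, context, scopes=()):
    """Evaluate `expref` with `scope` as the new innermost scope."""
    return expref(context, scopes + (scope,))


def sub(name, then):
    """Sub expression `name.then`: resolve `name`, then evaluate `then` there."""
    def expref(context, scopes):
        return then(resolve(name, context, scopes), scopes)
    return expref


seen_contexts = []


def c(context, scopes):
    seen_contexts.append(context)  # record the current node when `c` runs
    return resolve("c", context, scopes)


# Assumed shape: a.let({a: "x"}, &b.let({b: "y"}, &c))
inner = sub("b", lambda ctx, scopes: let({"b": "y"}, c, ctx, scopes))
outer = sub("a", lambda ctx, scopes: let({"a": "x"}, inner, ctx, scopes))

data = {"a": {"b": {"c": "foo"}}}  # assumed input document
print(outer(data, ()))   # foo
print(seen_contexts)     # [{'c': 'foo'}]
```

The expression reference is only evaluated once each enclosing sub expression has narrowed the current node, which is why `c` sees the innermost object.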
When the expression `c` is evaluated, the current evaluation context is
`{"c": "foo"}`. This is because this expression isn’t evaluated until
the second `let()` call evaluates the expression, which does not
occur until the first `let()` function evaluates the expression.
Motivating Example
With these changes defined, the expression in the “Motivation” section can be
written as:
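A minimal Python sketch of the idea, assuming the value of `first_choice` is bound in a `let()` scope before filtering `states`. The sample document, its `name` and `cities` keys, and the helper names are hypothetical.

```python
def resolve(name, context, scopes):
    """Look up `name` in the context, then in scopes from innermost outwards."""
    if isinstance(context, dict) and name in context:
        return context[name]
    for scope in reversed(scopes):
        if name in scope:
            return scope[name]
    return None


def let(scope, expref, context, scopes=()):
    """Evaluate `expref` with `scope` as the new innermost scope."""
    return expref(context, scopes + (scope,))


# Hypothetical sample document.
data = {
    "first_choice": "WA",
    "states": [
        {"name": "WA", "cities": ["Seattle", "Bellevue", "Olympia"]},
        {"name": "CA", "cities": ["Los Angeles", "San Francisco"]},
    ],
}


def cities_of_first_choice(context, scopes):
    """Filter `states` on name == first_choice and collect the cities."""
    result = []
    for state in resolve("states", context, scopes):
        # The current node is a single state, which has no `first_choice`
        # key, so that lookup falls back to the scope supplied by let().
        if resolve("name", state, scopes) == resolve("first_choice", state, scopes):
            result.extend(resolve("cities", state, scopes))
    return result


# Bind first_choice while the top-level document is still the current node.
scope = {"first_choice": resolve("first_choice", data, ())}
print(let(scope, cities_of_first_choice, data))
# ['Seattle', 'Bellevue', 'Olympia']
```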
This evaluates to `["Seattle", "Bellevue", "Olympia"]`.
Rationale
If we just consider the feature of being able to refer to a parent element,
this approach is not the only way to accomplish this. We could also allow
for explicit references using a specific token, say `$`.
The original example in the “Motivation” section would be:
While this could work, this has a number of downsides, the biggest one being
that you’ll need to always keep track of the parent element. You don’t know
ahead of time if you’re going to need the parent element, so you’ll always need
to track this value. It also doesn’t handle nested lexical scopes. What if
you wanted to access a value in the grandparent element? Requiring an
explicit binding approach via `let()` handles both these cases, and doesn’t
require having to track parent elements. You only need to track additional
scope when `let()` is used.
Implementation Survey
C#
JMESPath.NET implements this proposal.
To this end, the project authors had to introduce a new abstraction to the AST object that implements function calls.
This abstraction is actually only used for the implementation of the `let()` function itself.
The `IContextEvaluator` abstraction encapsulates context evaluation logic required to extract the proper value
from the stack of lexical scopes. The implementation follows the specification requirements:
The lexical scope stack contains a series of JSON objects referred to by the `JToken` type in C#.
When evaluating a JMESPath expression, `identifier` expressions are evaluated. That’s where scope
evaluation must take place.
When evaluating an `identifier` against the current JSON object, the implementation first uses the current context
which is specified as an argument of the corresponding expression. If the `identifier` does not refer to an existing
value, the `identifier` switches to using the `IContextEvaluator` abstraction referred to above to find the required
value out of the stack of lexical scopes.
No dependencies were required to implement this JEP.
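The design described above can be approximated in Python as follows. This is a loose analogy and not JMESPath.NET code: the class names `ScopeStack` and `Identifier` and their methods are hypothetical, and plain dictionaries stand in for `JToken`.

```python
class ScopeStack:
    """Hypothetical stand-in for the context evaluator: a stack of scopes."""

    def __init__(self):
        self._scopes = []

    def push(self, scope):
        self._scopes.append(scope)

    def pop(self):
        return self._scopes.pop()

    def lookup(self, name):
        # Search from the most recently pushed scope outwards.
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        return None


class Identifier:
    """AST node for an identifier expression."""

    def __init__(self, name, scopes):
        self.name = name
        self.scopes = scopes

    def evaluate(self, context):
        # First consult the current context passed in as an argument...
        if isinstance(context, dict) and self.name in context:
            return context[self.name]
        # ...then fall back to the stack of lexical scopes.
        return self.scopes.lookup(self.name)


stack = ScopeStack()
node = Identifier("a", stack)
stack.push({"a": "x"})             # what a let() call would do on entry
print(node.evaluate({"b": "y"}))   # x
stack.pop()                        # and on exit
print(node.evaluate({"b": "y"}))   # None
```

Only the identifier node and the `let()` implementation need to know about the stack, which mirrors the observation that the abstraction is used in just one place.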
Other languages
Given that most object-oriented languages support the concept of abstractions via interfaces (or prototypes) and that
an expected implementation would map grammar constructs to some form of AST, it seems reasonable to believe that a
similar implementation as the one shown here for C# could be achieved with the following languages:
Although I have no experience with other languages, there is no reason to believe it would be any different or even harder
than the simple implementation shown here.