Feature request: Select payload subset without JMESPath #1644

rjmackay · 2023-08-02T02:56:45Z

Use case

Running JMESPath on every request just to select part of the payload seems like overkill. It would be easier to express this selector as a simple function, which would also avoid the overhead of JMESPath.

Solution/User Experience

Allow passing a function to select the idempotency key instead of using jmespath

The example from the docs could be easily expressed as

const config = new IdempotencyConfig({
  eventKeyFn: payload => payload?.headers?.['X-Idempotency-Key'],
});

Ideally make jmespath an optional dependency too.

Alternative solutions

No response

Acknowledgment

This feature request meets Powertools for AWS Lambda (TypeScript) Tenets
Should this be considered in other Powertools for AWS Lambda languages? i.e. Python, Java, and .NET

Future readers

Please react with 👍 and your use case to help us understand customer demand.


boring-cyborg · 2023-08-02T02:56:47Z

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #typescript channel on our Powertools for AWS Lambda Discord: Invite link

dreamorosi · 2023-08-02T11:47:46Z

Hi @rjmackay thank you for taking the time to open this issue.

We opted for JMESPath expressions because JMESPath is one of the standard query languages for JSON structures, and we believe it can cover both simple and more complex use cases effectively and efficiently.

Most JMESPath implementations, including the one we currently use (jmespath), the one you suggested in the linked issue (#1645), and those for other languages such as Python, follow a similar call order that lets them cache parsed expressions and speed up repeated evaluations.

For example, here’s a diagram of how JMESPath implementations evaluate a request:

flowchart LR
    A[Take expression] --> B{Has cached expression}
    B -->|Yes| C[Apply expression]
    B -->|No| D[Create AST for expression]
    D --> C
    C --> E[Return result]

Whenever you provide a JMESPath expression (e.g. foo.bar) and a payload (e.g. the request), the module parses the expression into an AST (Abstract Syntax Tree) using an implementation of a Pratt parser (aka top-down operator precedence parser), and then uses that AST to visit the payload. The AST is cached in memory so that all subsequent evaluations of the same expression can reuse it and only need to extract the data.
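To make the flow in the diagram concrete, here is a minimal sketch of the cache-then-apply pattern. It is not the code of any JMESPath library: the names `parse`, `getAst`, `apply`, `search`, and `parseCount` are made up for illustration, and the toy parser only understands dot-separated identifiers such as foo.bar.

```typescript
// Hypothetical sketch; real JMESPath parsers support far more syntax.
type Ast =
  | { type: 'field'; value: string }
  | { type: 'subexpression'; children: Ast[] };

// Counts how often the (comparatively expensive) parsing step runs.
let parseCount = 0;

// Toy parser: turns `foo.bar` into a subexpression of field nodes.
function parse(expression: string): Ast {
  parseCount += 1;
  const children = expression
    .split('.')
    .map((value): Ast => ({ type: 'field', value }));
  return { type: 'subexpression', children };
}

const astCache = new Map<string, Ast>();

function getAst(expression: string): Ast {
  const cached = astCache.get(expression); // "Has cached expression?"
  if (cached !== undefined) return cached; // Yes: reuse the AST
  const ast = parse(expression); // No: create the AST for the expression
  astCache.set(expression, ast); // Remember it for later evaluations
  return ast;
}

// Applies the AST to the data; missing or non-object values become null.
function apply(ast: Ast, data: unknown): unknown {
  if (ast.type === 'field') {
    return typeof data === 'object' && data !== null
      ? (data as Record<string, unknown>)[ast.value] ?? null
      : null;
  }
  return ast.children.reduce<unknown>(
    (current, child) => apply(child, current),
    data
  );
}

function search(data: unknown, expression: string): unknown {
  return apply(getAst(expression), data);
}
```

With this sketch, calling `search` repeatedly with the same expression runs `parse` only once; every later call goes straight to `apply`.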

On top of that, many implementations also let you parse an expression beforehand so that the corresponding AST is already in the cache when the first request is processed. Doing this outside of the Lambda handler ensures that it happens during the function's initialization.
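As a rough illustration of moving that work to initialization, the sketch below prepares a selector at module scope and only applies it inside the handler. Everything here is an assumption made for the example: `compileSelector`, `compileCalls`, `selectKey`, and the body.orderId path are hypothetical, and the toy compiler stands in for whatever pre-compilation entry point your JMESPath library exposes.

```typescript
// Hypothetical sketch; the toy compiler only understands dotted paths like `a.b`.
type Selector = (data: unknown) => unknown;

// Counts how many times the expression is compiled.
let compileCalls = 0;

function compileSelector(expression: string): Selector {
  compileCalls += 1; // stands in for the cost of building the AST
  const path = expression.split('.');
  return (data) =>
    path.reduce<unknown>(
      (current, key) =>
        typeof current === 'object' && current !== null
          ? (current as Record<string, unknown>)[key] ?? null
          : null,
      data
    );
}

// Module scope: runs once per execution environment, during initialization.
const selectKey = compileSelector('body.orderId');

// Handler scope: runs on every invocation and only applies the prepared selector.
// (In a real Lambda module this function would be exported.)
const handler = async (event: unknown): Promise<unknown> => selectKey(event);
```

No matter how many times `handler` is invoked in the same execution environment, `compileCalls` stays at 1.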

With these considerations in mind, the main difference in terms of operations between using a JMESPath library and bringing your own function lies purely in the respective implementations.

As mentioned earlier, JMESPath uses the abstract syntax tree created from the expression to visit the payload and extract the corresponding data.

Let’s take a high-level look at how the parser works and how the object is visited, using your example and assuming a payload that looks like this:

{
  "headers": {
    "X-Idempotency-Key": "foo"
  }
}

With this payload, the corresponding JMESPath expression to extract the header you want would be headers."X-Idempotency-Key", which, once parsed, produces this AST:

{
  "type": "subexpression",
  "children": [
    {
      "type": "field",
      "children": [],
      "value": "headers"
    },
    {
      "type": "field",
      "children": [],
      "value": "X-Idempotency-Key"
    }
  ]
}

Using this abstract tree, the module visits the payload and extracts the data in almost the same way that you described, which corresponds to the following (simplified) pseudo execution stack:

1. starting payload:
{
  "headers": {
    "X-Idempotency-Key": "foo"
  }
}

2. apply field.headers expression:
{
  "X-Idempotency-Key": "foo"
}

3. apply field."X-Idempotency-Key" expression:
"foo"

You can see that this is the case in all the implementations I mentioned:

jmespath (JS) - see here
jmespath-ts (JS) - see here
jmespath (Python) - see here
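The three-step execution stack above can be sketched as a small tree visitor. This is a simplified illustration rather than the code of any of the linked libraries: the node shape mirrors the AST printed earlier, while the names `AstNode` and `visit` are hypothetical and only the two node types from the example are handled.

```typescript
// Hypothetical sketch of visiting a payload with the AST from the example.
interface AstNode {
  type: 'subexpression' | 'field';
  children: AstNode[];
  value?: string;
}

// AST for the expression headers."X-Idempotency-Key".
const ast: AstNode = {
  type: 'subexpression',
  children: [
    { type: 'field', children: [], value: 'headers' },
    { type: 'field', children: [], value: 'X-Idempotency-Key' },
  ],
};

function visit(node: AstNode, data: unknown): unknown {
  if (node.type === 'field') {
    // A field lookup on anything other than a plain object yields null.
    if (typeof data !== 'object' || data === null || Array.isArray(data)) {
      return null;
    }
    const value = (data as Record<string, unknown>)[node.value ?? ''];
    return value === undefined ? null : value;
  }
  // Subexpression: feed the result of each child into the next, left to right.
  return node.children.reduce<unknown>(
    (current, child) => visit(child, current),
    data
  );
}

const payload = { headers: { 'X-Idempotency-Key': 'foo' } };

// Step 1 is the payload, step 2 selects `headers`, step 3 selects the key.
const result = visit(ast, payload); // evaluates to "foo"
```

For this payload the visitor ends up doing the same two property lookups as the optional-chaining arrow function from the original request.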

To sum up: even for simple cases like the one you describe, the two implementations are equivalent, and if performance on the first request is a concern, you can pre-compile the expression so that the AST is generated during the initialization phase.

Having the support of an expressive query language like JMESPath, however, also allows you to easily adapt to different use cases as your workload evolves.

While an arrow function might be tempting for a simple field extraction, if your payload becomes more complex, or you want to query data in a more involved way (for example, extracting multiple fields or applying logical operators), you don’t have to implement your own parsing function: you can just write a JMESPath expression that does all of that for you.

I hope this clarifies why we stand behind the choice of using JMESPath.

github-actions · 2024-02-21T11:45:41Z

⚠️ COMMENT VISIBILITY WARNING ⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

rjmackay added triage This item has not been triaged by a maintainer, please wait feature-request This item refers to a feature request for an existing or new utility labels Aug 2, 2023

github-project-automation bot added this to Powertools for AWS Lambda (TypeScript) Aug 2, 2023

rjmackay mentioned this issue Aug 2, 2023

Maintenance: jmespath utility #1645

Closed

2 tasks

dreamorosi moved this to Ideas in Powertools for AWS Lambda (TypeScript) Aug 2, 2023

dreamorosi added idempotency This item relates to the Idempotency Utility discussing The issue needs to be discussed, elaborated, or refined and removed triage This item has not been triaged by a maintainer, please wait labels Aug 2, 2023

dreamorosi self-assigned this Aug 2, 2023

dreamorosi closed this as not planned Won't fix, can't repro, duplicate, stale Feb 21, 2024

github-project-automation bot moved this from Ideas to Coming soon in Powertools for AWS Lambda (TypeScript) Feb 21, 2024

dreamorosi added rejected This is something we will not be working on. At least, not in the measurable future and removed discussing The issue needs to be discussed, elaborated, or refined labels Feb 21, 2024

dreamorosi moved this from Coming soon to Closed in Powertools for AWS Lambda (TypeScript) Feb 21, 2024

dreamorosi mentioned this issue Jul 30, 2024

Feature request: set correlation ID in Logger #2863

Open

2 tasks
