Feature request: Select payload subset without JMESPath #1644

Closed
1 of 2 tasks
rjmackay opened this issue Aug 2, 2023 · 3 comments
Assignees
Labels
feature-request This item refers to a feature request for an existing or new utility idempotency This item relates to the Idempotency Utility rejected This is something we will not be working on. At least, not in the measurable future

Comments

@rjmackay

rjmackay commented Aug 2, 2023

Use case

Executing JMESPath on every request just to select part of the payload seems like overkill. It would be easier to express this selector as a simple function, which would also avoid the overhead of JMESPath.

Solution/User Experience

Allow passing a function to select the idempotency key instead of using jmespath

The example from the docs could be easily expressed as

const config = new IdempotencyConfig({
  eventKeyFn: payload => payload?.headers?.['X-Idempotency-Key'],
});

Ideally make jmespath an optional dependency too.

Alternative solutions

No response

Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

@rjmackay rjmackay added triage This item has not been triaged by a maintainer, please wait feature-request This item refers to a feature request for an existing or new utility labels Aug 2, 2023
@boring-cyborg

boring-cyborg bot commented Aug 2, 2023

Thanks for opening your first issue here! We'll come back to you as soon as we can.
In the meantime, check out the #typescript channel on our Powertools for AWS Lambda Discord: Invite link

@dreamorosi dreamorosi added idempotency This item relates to the Idempotency Utility discussing The issue needs to be discussed, elaborated, or refined and removed triage This item has not been triaged by a maintainer, please wait labels Aug 2, 2023
@dreamorosi dreamorosi self-assigned this Aug 2, 2023
@dreamorosi
Contributor

dreamorosi commented Aug 2, 2023

Hi @rjmackay thank you for taking the time to open this issue.

We opted for JMESPath expressions because JMESPath is one of the standard query languages for JSON structures, and we believe it can cover both simple and more complex use cases effectively and efficiently.

Most JMESPath implementations, including the one we are currently using (jmespath), the one you suggested in the linked issue (#1645), and others in other languages like Python, follow a similar call order that allows them to cache operations and speed up repeated expressions.

For example, here’s a diagram of how JMESPath implementations evaluate an expression:

flowchart LR
    A[Take expression] --> B{Has cached expression}
    B -->|Yes| C[Apply expression]
    B -->|No| D[Create AST for expression]
    D --> C
    C --> E[Return result]

Whenever you provide a JMESPath expression (e.g. foo.bar) and a payload (e.g. the request), the module parses the expression into an AST (Abstract Syntax Tree) using an implementation of a Pratt parser (aka top-down operator precedence), and then uses that AST to visit the payload. The AST generated for an expression is cached in memory, so all subsequent evaluations can reuse it and only need to extract the data.

On top of this, many implementations also allow you to parse an expression beforehand, so that the corresponding AST is already in the cache when the first request is processed. Doing this outside of the Lambda handler ensures that the parsing happens during the function's initialization.
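The caching and warm-up flow described above can be sketched roughly as follows. This is a minimal illustration, not the API of any JMESPath package: `parse`, `getAst`, `astCache`, `parseCount`, and `KEY_EXPRESSION` are hypothetical names, and the toy parser only understands dot-separated identifiers such as foo.bar.

```typescript
// Hypothetical AST node shape, loosely modeled on the JSON shown in this thread.
type AstNode = {
  type: 'field' | 'subexpression';
  value?: string;
  children: AstNode[];
};

// In-memory cache keyed by the raw expression string.
const astCache = new Map<string, AstNode>();
let parseCount = 0; // counts how often parsing actually runs

// Toy parser (assumption): handles only dot-separated identifiers.
// A real JMESPath parser supports a much larger grammar.
const parse = (expression: string): AstNode => {
  parseCount += 1;
  const fields = expression
    .split('.')
    .map((value): AstNode => ({ type: 'field', value, children: [] }));
  return fields.length === 1
    ? fields[0]
    : { type: 'subexpression', children: fields };
};

// Return the cached AST if there is one, otherwise parse and cache it.
const getAst = (expression: string): AstNode => {
  const cached = astCache.get(expression);
  if (cached !== undefined) return cached;
  const ast = parse(expression);
  astCache.set(expression, ast);
  return ast;
};

// Warm the cache at module scope, i.e. outside of the handler,
// so the parsing cost is paid during initialization.
const KEY_EXPRESSION = 'foo.bar';
getAst(KEY_EXPRESSION);

const handler = async (_event: unknown): Promise<AstNode> => {
  // Cache hit: no parsing happens on the request path.
  return getAst(KEY_EXPRESSION);
};
```

In this sketch the expression is parsed exactly once, at module load, and every later call to `getAst` with the same string returns the same cached object.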

With these considerations in mind, the main difference in terms of operations between using a JMESPath library and bringing your own function lies purely in the respective implementations.

As mentioned earlier, JMESPath uses the abstract syntax tree created from the expression to visit the payload and extract the corresponding data.

Let’s take a high level look at how the parser works, and how the object is visited using your example and assuming a payload that looks like this:

{
  "headers": {
    "X-Idempotency-Key": "foo"
  }
}

With this payload, the corresponding JMESPath expression to extract the header you want would be headers."X-Idempotency-Key", which, once parsed, produces this AST:

{
  "type": "subexpression",
  "children": [
    {
      "type": "field",
      "children": [],
      "value": "headers"
    },
    {
      "type": "field",
      "children": [],
      "value": "X-Idempotency-Key"
    }
  ]
}

Using this abstract tree, the module visits the payload and extracts the data in almost the same way that you described, which corresponds to the following (simplified) pseudo execution stack:

1. starting payload 
{
  "headers": {
    "X-Idempotency-Key": "foo"
  }
}

2. apply field.headers expression
{
  "X-Idempotency-Key": "foo"
}

3. apply field."X-Idempotency-Key" expression:
"foo"
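The steps above can be sketched as a small tree walk. This is a simplified illustration under stated assumptions, not the code of any of the libraries mentioned: `visit` and the `AstNode` type are hypothetical, and only the two node types that appear in the AST above are handled.

```typescript
// Hypothetical AST node shape matching the JSON shown earlier in this comment.
type AstNode = {
  type: 'field' | 'subexpression';
  value?: string;
  children: AstNode[];
};

// The AST for headers."X-Idempotency-Key", copied from the comment.
const ast: AstNode = {
  type: 'subexpression',
  children: [
    { type: 'field', children: [], value: 'headers' },
    { type: 'field', children: [], value: 'X-Idempotency-Key' },
  ],
};

// Apply a node to the current value and return the result.
const visit = (node: AstNode, current: unknown): unknown => {
  if (node.type === 'field') {
    // A field lookup only makes sense on a plain object; anything else yields null.
    if (
      current === null ||
      typeof current !== 'object' ||
      Array.isArray(current) ||
      node.value === undefined
    ) {
      return null;
    }
    const value = (current as Record<string, unknown>)[node.value];
    return value === undefined ? null : value;
  }
  // subexpression: feed the result of each child into the next one.
  return node.children.reduce<unknown>(
    (acc, child) => visit(child, acc),
    current
  );
};

const payload = { headers: { 'X-Idempotency-Key': 'foo' } };
const key = visit(ast, payload); // "foo"
```

Each `field` node amounts to one property lookup, which is the same work the optional-chaining arrow function from the original request performs.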

You can see that this is the case in all the implementations I mentioned:

  • jmespath (JS) - see here
  • jmespath-ts (JS) - see here
  • jmespath (Python) - see here

So to sum up: even for simple cases like the one you describe, the two implementations are equivalent. If performance on the first request is a concern, you can pre-compile an expression so that the AST is generated during the initialization phase.

Having the support of an expressive query language like JMESPath also allows you to adapt easily to different use cases as your workload evolves.

While it’s true that an arrow function might be tempting for a simple field extraction, things change if your payload becomes more complex or you want to query data in a more involved way, like extracting multiple fields or applying logical operators. In that case you don’t have to implement your own parsing function: you can just write a JMESPath expression that does all of that for you.

I hope this clarifies why we stand behind the choice of using JMESPath.

@dreamorosi dreamorosi closed this as not planned Feb 21, 2024
@dreamorosi dreamorosi added rejected This is something we will not be working on. At least, not in the measurable future and removed discussing The issue needs to be discussed, elaborated, or refined labels Feb 21, 2024
@dreamorosi dreamorosi moved this from Coming soon to Closed in Powertools for AWS Lambda (TypeScript) Feb 21, 2024
Contributor

⚠️ COMMENT VISIBILITY WARNING ⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.
