JSONPath API #992
Comments
I'll add an

As far as I can see, JSONPath isn't an actual standard, it's just described as a post on some guy's blog in 2007. XPath is way more legit - it has RFCs, a wikipedia page, 3 different standardized versions, etc... From the 3 libraries you linked above, the first one's README says that it follows "the majority rules in http://goessner.net/articles/JsonPath/ but also with some minor differences" and the last one has a "Differences from Original Implementation" section. The middle one looks robust, but its readme is just a few sentences, so I can't know how strictly it follows the "standard". I'm mentioning this since it's not immediately clear to me what the benefits over the currently used GJSON library are.

The second reason I want to further evaluate this before we implement it is that it will be available as a standalone functionality, not just for HTTP responses. Thus, when implementing it, we'll likely want to do it with support for execution segments in mind, so that users would be able to operate on large JSON files in a distributed manner. Consequently, the JSONPath implementation we'll use would have to support streaming JSON parsing, i.e. being able to work and resolve queries without having the whole JSON file parsed or in memory at once. Neither GJSON, nor the two libraries you linked seem to support this... The Go standard library handles streaming via

(I still haven't created an actual github issue for execution segments from that wall of text in the Slack chat, I'll link it here when I do so) |
Here's another Go library that, although not exactly streaming, seems to support jsonpath (like the two @robingustafsson linked), but requires only

btw it also has a Limitations and deviations section in the README 😄 |
Right, JSONPath isn't a formalized standard 🙂 Regarding the need for streaming, I'm thinking that's a later iteration. So agree, we need to take that future direction into account, but I'd argue we don't need it from the start. #532 is more important to implement IMO, to make sure we only load the file once into memory. With "execution segments" are you thinking we should implement support for data partitioning in a first iteration or just keep it in mind for future extendability as with streaming? |
Seems like I have to convert that slack wall-of-text to an actual execution segments github issue sooner rather than later 😄

To answer the question: no, data partitioning will come as a second iteration, simply because execution segment support in the new schedulers will be done very soon (™® 😊 ), while we haven't yet started working on shared memory/data partitioning/data "streaming"/etc.

My desire to use a streaming JSON parser in the beginning is mainly due to the fact that all of those JSONPath parsers are slightly different, each with their own quirks and "limitations and deviations" from the original "spec". So if we start using one of those in the beginning and we have to switch to another when we add data partitioning, we'd be introducing unnecessary backwards incompatibility. And during my work on k6, I've developed an aversion to writing code that I know we're going to throw away in the future. It will require more effort initially, but it will also save us from the annoyance of spending tons of time refactoring these things and working around old user-facing APIs before every new feature that we want to add...
Agree, though I'm hesitant to go ahead with #532, knowing the limitations of the approach and its UX problems when a load test is executed on multiple k6 instances. If we already "know" a way to avoid those issues and get the same benefits, regardless of whether the test is executed on one or more machines (i.e. execution segments), shouldn't we proceed directly with that? If we go ahead with #532 without execution segments, it will take less time, but when we decide to add proper data partitioning, it will take much longer since we'll have to work around the old code, the API will likely be inconsistent, and there might be breaking changes... |
I too agree that this should be done after we evaluate the libraries in question, and that we should get streaming working from the get-go. I would not like to have to explain to everybody why our jsonpath, which was already different from all other jsonpaths, is now different from itself as well ... 👍 to the posting of the execution segments proposal. I would also like to see something like https://github.com/dchester/jsonpath tested first and benchmarked against the future k6 internal implementation (this one also has subtle differences, but reading through them, all of them seem to be okay, completely sane, and for the best).
🤣 |
👍 ok, let's evaluate our options in regards to streaming, data partitioning and #532 before we start implementing this. |
Added the execution segments mega-issue: #997 |
@robingustafsson, here I've described some of the issues that we need to solve before we implement the JSONPath feature: #1021 (comment) I think all of those apply to the JSONPath implementation as well, plus the fact that with JSONPath we also have data filtering and counting, operations that also have to be considered through the prism of segmentation and data sharing and partitioning. |
We discussed the topic with the maintenance team as part of our effort to make our backlog of issues more consistent. There is no apparent demand to switch to JSONPath, and as k6 already supports GJSON, we've decided to close this issue. We might come back to it later if there's demand for it, or if the team judges that the switch would provide a better developer experience. |
k6 currently has the http.Response.json(selector) API to extract individual pieces of a JSON structure. This "selector" API is not JSONPath, but uses a similar syntax and is based on the GJSON Go library.

I propose we switch this API to be JSONPath compatible using a library like [1] or [2].
We should also expose the JSONPath API as a standalone API, not tied to the HTTP response object, with an API like [3]:
The API should support the full JSONPath syntax, including filter expressions. The following APIs should be implemented:
jsonpath.query(objOrStr, pathExpression[, count])
[1] - https://github.com/oliveagle/jsonpath
[2] - https://github.com/PaesslerAG/jsonpath
[3] - https://www.npmjs.com/package/jsonpath