[RFC] Remove internal parser API used by atdgen #151
Comments
We would first need to modify atdgen's code generation and see how it goes in practice. I'm rather optimistic. Then, we'd have to run benchmarks and poll the users to see if they can accept the new performance.
The XML world has the notion of a Simple API for XML (SAX). SAX parsers are event-based: they parse the XML and notify the upper level that a node or an attribute has been encountered, without constructing any tree in memory. Possibly something similar could be done with Yojson; events could be lower level than
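A minimal sketch of what SAX-style events for JSON could look like, assuming a hypothetical `event` type (none of these names exist in Yojson). The consumer sums every integer stored under a given field name in a single pass, without building a tree:

```ocaml
(* Hypothetical event type for illustration; not part of Yojson's API. *)
type event =
  | Null
  | Bool of bool
  | Int of int
  | Float of float
  | String of string
  | Start_object
  | Field of string
  | End_object
  | Start_array
  | End_array

(* Sum every integer that directly follows the field [name].
   The accumulator is (running total, "previous event was the wanted field"). *)
let sum_field (name : string) (events : event Seq.t) : int =
  let step (total, armed) ev =
    match ev with
    | Field f -> (total, String.equal f name)
    | Int n when armed -> (total + n, false)
    | _ -> (total, false)
  in
  fst (Seq.fold_left step (0, false) events)

(* Events a parser might emit for
   {"id": 1, "items": [{"id": 2}, {"id": 3, "name": "x"}]} *)
let example : event Seq.t =
  List.to_seq
    [ Start_object; Field "id"; Int 1; Field "items"; Start_array;
      Start_object; Field "id"; Int 2; End_object;
      Start_object; Field "id"; Int 3; Field "name"; String "x"; End_object;
      End_array; End_object ]

let () = Printf.printf "%d\n" (sum_field "id" example)
```

Because the events arrive as a lazy `Seq.t`, a real parser could produce them on demand and memory use would stay independent of document size.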
Hm, if we have that already, why can't Yojson be built on top of |
Yojson is pretty popular in the ecosystem; it would be awesome to expose a streaming mode, and an AST mode built on top of the streaming mode, to offer a full-stack experience for JSON parsing/serialization. Some kind of streaming mode is already in place and is secretly used by atdgen, no?
As an atd user, I find it handy to have the AST at times: when I want some opaque JSON data in my record to be processed manually, or when I want to serialize an atd type specifically to the AST (currently this is not possible and we have to resort to hacks like parsing a JSON string to the AST with Yojson, acceptable in tests but still ugly). But maybe using jsonm for parsing/serialization for performance reasons does not prevent getting an AST representation when the caller requests it, with the associated performance penalty. @mjambon what do you think about this?
Yes, but if you need streaming mode, why not just use
Yes, but it uses the internal representation of Yojson and would therefore break if Yojson changed anything in there. That is not a good way to design an API that is supposed to be used by other programmers, unless you want people to perpetually fix their code or never want to change the implementation again. Note that the streaming mode of With regard to the AST representation: in ppx_deriving_yojson you specify your types and the deriver tries to unpack
Makes sense overall. Following this logic atdgen should not use Yojson for parsing in streaming mode, but should use
I fully agree. Just thought that if that secret API is like 95% of a somewhat usable public streaming-like API, why not bite the bullet :)
Hi. Sorry for chiming in a little late. I started considering a replacement for atdgen called atdml. Check out the currently proposed list of features for details. The aspects that concern Yojson are:
Current state & Rationale
Yojson currently has a semi-secret, undocumented API that allows controlling the parser: when you know the structure of the JSON, you can tell the parser what you expect and it will give you the values in a somewhat more streaming way.
The downside is that this exposes the internals of Yojson, which locks us into our current implementation. Another issue is that this code is under-used and, most importantly, under-tested, so improvements can break it without the breakage being detected before a release.
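To illustrate the general idea of "telling the parser what you expect", here is a self-contained sketch. The `token` type, the `reader` record and all function names are hypothetical and are not Yojson's internal API; the point is only that a reader that knows the target type can fill a record directly from the token stream, with no intermediate JSON value:

```ocaml
(* Hypothetical tokens and reader; illustrative names only. *)
type token =
  | Lbrace | Rbrace | Colon | Comma
  | Str of string
  | Num of int

type point = { x : int; y : int }

exception Unexpected of string

(* Stand-in for lexer state: a mutable cursor over already-lexed tokens. *)
type reader = { mutable rest : token list }

let next r =
  match r.rest with
  | [] -> raise (Unexpected "end of input")
  | t :: tl -> r.rest <- tl; t

let expect r tok =
  if next r <> tok then raise (Unexpected "token")

let read_int r =
  match next r with Num n -> n | _ -> raise (Unexpected "expected int")

(* Reads a field name followed by ':' and returns the name. *)
let read_key r =
  match next r with
  | Str s -> expect r Colon; s
  | _ -> raise (Unexpected "expected field name")

(* Reads {"x": <int>, "y": <int>} straight into a record.
   Fields may come in either order; unknown fields are rejected. *)
let read_point r : point =
  expect r Lbrace;
  let x = ref None and y = ref None in
  let rec fields () =
    (match read_key r with
     | "x" -> x := Some (read_int r)
     | "y" -> y := Some (read_int r)
     | k -> raise (Unexpected ("unknown field " ^ k)));
    match next r with
    | Comma -> fields ()
    | Rbrace -> ()
    | _ -> raise (Unexpected "expected ',' or '}'")
  in
  fields ();
  match !x, !y with
  | Some x, Some y -> { x; y }
  | _ -> raise (Unexpected "missing field")

(* Tokens for {"y": 2, "x": 1} *)
let p =
  read_point
    { rest = [ Lbrace; Str "y"; Colon; Num 2; Comma; Str "x"; Colon; Num 1; Rbrace ] }
```

The sketch also shows the coupling problem: `read_point` depends on the exact shape of `token` and `reader`, so any change to those breaks every caller written this way.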
Proposal
The proposal is to remove the internal code. Initially this can be just hidden from the interface and then we can shake down and rewrite the unused code.
A possible replacement for the API could be to use the Yojson.Safe.t API. Given there are a number of consumers that convert a predefined structure from JSON to OCaml (like atdgen, ppx_deriving_yojson, ppx_yojson_conv) we could consider adding some optimized API that avoids creating a full AST but rather allows libraries to get JSON to parse with lower overhead while not exposing the internals too much.
Impact
This is under-researched at the moment, but the major consumer of this API is atdgen. mjambon suggests that atdgen could be changed to use the Yojson.Safe.t AST.
Roadmap
This is another breaking change, so would need a dependency bump. Given a lot of code uses atdgen, a number of packages would be impacted, but once atdgen is adjusted it could generate new code that would Just Work(TM).
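As a rough illustration of what AST-based generated code could look like, the sketch below decodes a record from a tree. It is an assumption-laden example, not atdgen's actual output: the `json` type is a local stand-in mirroring a subset of the constructors of Yojson.Safe.t so the code compiles without the yojson library, and `user`, `field`, and the `to_*` helpers are hypothetical names.

```ocaml
(* Local stand-in mirroring a subset of Yojson.Safe.t's constructors. *)
type json =
  [ `Null
  | `Bool of bool
  | `Int of int
  | `Float of float
  | `String of string
  | `Assoc of (string * json) list
  | `List of json list ]

type user = { name : string; age : int; tags : string list }

(* Look up a field in an object's association list. *)
let field name fields =
  match List.assoc_opt name fields with
  | Some v -> Ok v
  | None -> Error ("missing field " ^ name)

let to_string : json -> (string, string) result = function
  | `String s -> Ok s
  | _ -> Error "expected a string"

let to_int : json -> (int, string) result = function
  | `Int n -> Ok n
  | _ -> Error "expected an int"

(* Convert a JSON list of strings, failing if any element is not a string. *)
let to_string_list : json -> (string list, string) result = function
  | `List items ->
    List.fold_right
      (fun item acc ->
         match acc, to_string item with
         | Ok tl, Ok s -> Ok (s :: tl)
         | (Error _ as e), _ -> e
         | _, Error e -> Error e)
      items (Ok [])
  | _ -> Error "expected a list"

(* Hypothetical shape of a generated reader: walk the AST, return a result. *)
let user_of_json (j : json) : (user, string) result =
  let ( let* ) = Result.bind in
  match j with
  | `Assoc fields ->
    let* name = Result.bind (field "name" fields) to_string in
    let* age = Result.bind (field "age" fields) to_int in
    let* tags = Result.bind (field "tags" fields) to_string_list in
    Ok { name; age; tags }
  | _ -> Error "expected an object"
```

The trade-off discussed in this issue is visible here: the decoder only depends on the public tree type, at the cost of allocating the whole tree before any field is read.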