Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

toJson, jsonTo, json (de)serialization for custom types; remove dependency on strtabs thanks to a hooking mechanism #14563

Merged
merged 14 commits into from
Jun 8, 2020

Conversation

timotheecour
Copy link
Member

@timotheecour timotheecour commented Jun 5, 2020

  • toJson, jsonTo, for json (de)serialization from custom types; all reasonable types are supported recursively.
    object,ref,ptr,pointer,tuple,numbers,string,seq,array,ordinals,distinct,tables
  • much simpler and compact implementation compared to the other things in json (detectIncompatibleType,assignObjectImpl,initFromJson etc)
    => all self contained in 1 single proc for toJson and fromJson; no need for forward declaration or any boilerplate
  • provide a hooking mechanism to allow (de) serialization from custom types (eg strtabs.StringTable)
  • remove dependency on strtabs (and its dependents strtabs,os,pathnorm,osseps,posix,times) thanks to hooking mechanism (dependency was added recently in Improve JSON serialisation of strtabs #14549 and was bad as argued here Improve JSON serialisation of strtabs #14549 (comment) )

example

it now works with roundtrip serialization/deserialization: see tests/stdlib/tjsonmacro.nim

    let a = (1.1, "fo", 'x', @[10,11], [true, false], [t,newStringTable()], [0'u8,3'u8], (foo: 0.5'f32, bar: A(a1: "abc"), bar2: A.default))
    testRoundtrip(a):
      """[1.1,"fo",120,[10,11],[true,false],[{"mode":"modeCaseSensitive","table":{"y":"Y","z":"Z"}},{"mode":"modeCaseSensitive","table":{}}],[0,3],{"foo":0.5,"bar":{"a1":"abc"},"bar2":null}]"""

future work

  • remove json=>options dependency using same hooking approach, and perhaps other

@PMunch
Copy link
Contributor

PMunch commented Jun 5, 2020

This wouldn't really work as it is now. Take the strtabs example, with this simple serialisation routine you won't be able to have a key called "mode" as that name is taken up by the strtabs mode field. It is of course possible to build an ever more complex serialisation routine structure, but at some point it would've been easier to just have it import the JSON module and do it manually (apart from that this also would allow serialisation to other formats).

@timotheecour
Copy link
Member Author

timotheecour commented Jun 5, 2020

technically, duplicate keys are valid in json however, so the format i used would still allow for a mode key (https://stackoverflow.com/questions/21832701/does-json-syntax-allow-duplicate-keys-in-an-object)

but as I mentioned in https://github.com/nim-lang/Nim/pull/14563/files#diff-62bc00948f1e69981aabd6de006e25fbR426
you can adapt the code to produce this:

"""{"mode":"modeCaseSensitive","table":{"city":"Monaco","name":"John"}}"""

@timotheecour
Copy link
Member Author

timotheecour commented Jun 5, 2020

there, it now serializes as

"""{"mode":"modeCaseSensitive","table":{"city":"Monaco","name":"John"}}"""

and it technically is json agnostic, eg would work just as well with bson/protobuf/msgpack

@PMunch
Copy link
Contributor

PMunch commented Jun 5, 2020

This is what I was talking about when I said "It is of course possible to build an ever more complex serialisation routine structure, but at some point it would've been easier to just have it import the JSON module and do it manually". This still doesn't handle arrays for example, adding that would make it even more complex. I do like the fact that this is JSON agnostic, but I feel that building a sufficiently complex serialisation structure might tie it to hard to JSON and JSON-esque formats to really be viable for any other usages. That is just a hunch though..

@timotheecour
Copy link
Member Author

timotheecour commented Jun 5, 2020

just have it import the JSON module and do it manually"

that dependency would be bad, especially since json imports a lot itself.

This still doesn't handle arrays for example, adding that would make it even more complex

That argument is invalid, array support is not needed for strtabs, it's only needed for toJson to be feature complete so it handles arbitrary types (yes, including tuple[a: seq[StringTableRef], b: float]). I didn't included it because it's straightforward but I've implemented it in my own code, this was just to illustrate how you implement this 2-body serialization. Of course array, tuple, seq would be needed in a finished PR.

@PMunch
Copy link
Contributor

PMunch commented Jun 5, 2020

I know the import would be bad, that's the whole point. But building all the same functionality into a serialisation procedure structure is very messy.

The argument isn't invalid, I wasn't saying it was required for strtabs. I was saying that as a general approach this would have to be expanded into something that could easily become too complex and unwieldy for common use. I'd love to be proven wrong however, so please show an implementation of this, preferably with usage examples for various types around the stdlib to gauge how simple/complex this really is.

Don't get me wrong, I like the idea of this, but I feel that as a feature it still needs a bit of work in order to prove itself.

@timotheecour
Copy link
Member Author

timotheecour commented Jun 5, 2020

the only complex part is the custom hook if you wanted to support custom types other than strtabs (until we have symbol alias, then it'll be simple too), the rest is straightforward recursion.
here, I made it more feature complete so that this works:

proc testToJson() =
  var t = {"z": "Z", "y": "Y"}.newStringTable
  type A = ref object
    a1: string
  let a = (1.1, "fo", 'x', @[10,11], [true, false], [t,newStringTable()], (foo: 0.5'f32, bar: A(a1: "abc"), bar2: A.default))
  let j = a.toJson
  doAssert $j == """[1.1,"fo",120,[10,11],[true,false],[{"mode":"modeCaseSensitive","table":{"y":"Y","z":"Z"}},{"mode":"modeCaseSensitive","table":{}}],{"foo":0.5,"bar":{"a1":"abc"},"bar2":null}]"""

it's already handling most cases (and the reverse direction (parsing T from JsonNode) is equally doable, I've implemented it in my own code.

A full featured solution would work like that by default but would also allow user to customize behavior for things like ptr-like types (eg whether to serialize the dereference or the address), and handle cycles correctly (to avoid infinite recursion). Likewise, it's not hard and I've implemented this.

@PMunch
Copy link
Contributor

PMunch commented Jun 5, 2020

This is looking better and better. Not loving the declaration of serialise with a single argument to "register" the serialiser, but not really sure how to otherwise implement that in an easy way.

Not quite sure what you mean by "the only complex part is the custom hook if you wanted to support custom types other than strtabs", wouldn't this simply be the same when compiles(serialize(t)): serialise(t, fun, fun1, fun2)?

Also the different funs should have some more informative names, but that's just nit-pick.

@timotheecour
Copy link
Member Author

timotheecour commented Jun 6, 2020

PTAL.
I've found a better way using mixin.
I've reverted #14549 to avoid the issues mentioned in #14549 (comment)
what's missing is the reverse direction, but can be done with similar approach as this PR via mixin => DONE

Not loving the declaration of serialise with a single argument to "register" the serialiser

fixed

Not quite sure what you mean by "the only complex part is the custom hook if you wanted to support custom types other than strtabs", wouldn't this simply be the same when compiles(serialize(t)): serialise(t, fun, fun1, fun2)?

found a better way using mixin

Also the different funs should have some more informative names, but that's just nit-pick.

not relevant anymore

EDIT: now allows roundtrip, see updated top post

@timotheecour timotheecour marked this pull request as ready for review June 6, 2020 02:36
@timotheecour timotheecour changed the title json custom serialization; application for strtabs toJson, jsonTo, json (de)serialization from custom types; remove dependency on strtabs, roundtrip serialization for complex recursive types, tuples etc Jun 6, 2020
@timotheecour timotheecour changed the title toJson, jsonTo, json (de)serialization from custom types; remove dependency on strtabs, roundtrip serialization for complex recursive types, tuples etc toJson, jsonTo, json (de)serialization for custom types; remove dependency on strtabs thanks to a hooking mechanism Jun 6, 2020
@timotheecour timotheecour requested a review from Araq June 6, 2020 07:56
lib/pure/json.nim Outdated Show resolved Hide resolved
lib/pure/json.nim Outdated Show resolved Hide resolved
lib/pure/json.nim Outdated Show resolved Hide resolved
lib/pure/json.nim Outdated Show resolved Hide resolved
@timotheecour
Copy link
Member Author

PTAL, you can just look at last few commits; added testing for js + vm

@Araq
Copy link
Member

Araq commented Jun 8, 2020

Superb work!

@Araq Araq merged commit c7a1a7b into nim-lang:devel Jun 8, 2020
@timotheecour timotheecour deleted the exp_json_strtabs branch June 8, 2020 08:44
@PMunch
Copy link
Contributor

PMunch commented Jun 8, 2020

Hmm, I went back to NimBot to try and implement it with these changes and noticed that it only works when you call toJson and not with %, is this on purpose? If so, why?

@timotheecour
Copy link
Member Author

timotheecour commented Jun 8, 2020

see long answer here: #14563 (comment); using toJson is the way to go

@PMunch
Copy link
Contributor

PMunch commented Jun 8, 2020

Still not quite sure why though? Just backwards compatibility? % has been Nims "to JSON" operator for quite a while, it feels weird to replace it with toJson.

@timotheecour
Copy link
Member Author

timotheecour commented Jun 8, 2020

several reasons as mentioned:

  • no breaking change: code that uses % will keep working; but % and %* IMO are the ones that feel weird; it's a non-standard notation only found in nim and clashes with several other modules (jsffi, json, ropes, strtabs, strutils all use %, with different meanings), most other languages usually use json in the name to convert to/from json; short descriptive names are self-documenting instead of cryptic nonstandard operators
  • avoid further bloating std/json which is already more bloated than it should; note that toJson/fromJson may get further enhancements in future as described in this PR
  • toJson/fromJson uses a much simpler + DRY implementation (which also means: easier to adapt, extend etc), supports (de)serialization with more types (tuple, distinct, char, ptr, pointer etc), custom hooks and in-place deserialization for efficiency

maybe %, %*, to could in future be re-implemented in terms of toJson using a bit of refactoring (or moving the "core json" to a separate std/jsoncore) but not sure it's worth it; one thing that would be worth would be removing std/options dependency from std/json as noted in PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants