Kotlin serialization compiler plugin #149
Comments
This is more of a meta note and I don't want to derail comments on this issue (and as a relatively satisfied consumer of multiplatform serialization), but this problem needs a better general solution. Not every dynamic programming requirement should have to be solved by a first-party compiler plugin and IDE plugin. This use case is a perfect example of something that shouldn't have to be baked into Kotlin and it's a failure of the language tooling that it currently must be. |
@JakeWharton I absolutely agree! Moreover, let me add here that we have a general idea on how this can be done. The full description of this imagined mechanism is out of the scope of this concrete serialization plugin proposal, but the name of the idea already tells a lot -- we call it "compile-time reflection". The basic idea is that you should be able to perform reflection on the static structure of your code during compile time from a regular Kotlin code. This is somewhat similar to the proposed meta-programming facilities for C++, but I believe that we can do it in a much more type-safe and way more toolable fashion by leveraging the concept of Kotlin inline functions. You can think of it as "inline functions on steroids". The key observation here is that you can already do serialization via run-time reflection on Kotlin/JVM. Doing it in compile-time is just a performance optimization. A sufficiently advanced compiler can figure out that the static structure of the code (like a list of class members) does not change at run-time and perform an advanced "constant propagation" and "loop unrolling" during compile-time to turn a code that is written using run-time reflection into a code that is fully templated for each specific class. (P.S. but first we need to finish house-keeping and switch all Kotlin compiler backends to an IR-based implementation to enable this kind of stuff) |
As my friend points out, reflection is defined as "The ability of a computer program to examine, introspect and modify its own structure and behavior at runtime". Also, this sounds a lot like some kind of macro functionality, like Scala has? I don't know how they work, but just asking if it's similar. It sounds interesting. |
@Dico200 We don't plan any macros in Kotlin. The idea is that you should write code using the runtime reflection API (and you can debug it as such), but since the structure of the program is statically known at compile time, the corresponding values (property names, etc.) can be inlined during compilation to the point where no reflective calls are left at run time. |
@Dico200 If I understand @elizarov correctly, here is my interpretation of what's going on. There are already two different levels of code generation in the compiler:
What @elizarov is proposing is to extend this compilation intelligence even further: make the compiler understand reflection code and replace it with equivalent, but static, code. |
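To make the idea above concrete, here is a minimal sketch on Kotlin/JVM. It is only an illustration under assumptions: `Point`, `toTextReflective`, and `toTextUnrolled` are hypothetical names, and the "unrolled" function is written by hand to show what a sufficiently advanced compiler might derive. No such compiler feature is claimed to exist.

```kotlin
import java.lang.reflect.Modifier

// Hypothetical example class; its structure is fully known at compile time.
data class Point(val x: Int, val y: Int)

// Run-time reflection: walk the instance fields of any object and render "name=value" pairs.
// Fields are sorted by name because reflection does not guarantee declaration order.
fun toTextReflective(value: Any): String =
    value.javaClass.declaredFields
        .filter { !it.isSynthetic && !Modifier.isStatic(it.modifiers) }
        .sortedBy { it.name }
        .joinToString(prefix = "{", postfix = "}") { field ->
            field.isAccessible = true
            "${field.name}=${field.get(value)}"
        }

// Hand-written equivalent for Point: the loop over fields is "unrolled"
// and the property names are inlined as constants, so no reflective call remains.
fun toTextUnrolled(p: Point): String = "{x=${p.x}, y=${p.y}}"

fun main() {
    val p = Point(1, 2)
    println(toTextReflective(p))
    println(toTextUnrolled(p))
}
```

Both functions produce the same text for a `Point`; the second one simply has nothing left to discover at run time.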
That is true. But I am really curious how real code should look when you need both serialization (for example, for request/response parsing) and Parcelable (to save a screen's instance state or pass data between Android components). It should be a very common case. Annotate with @Parcelize and @Serializable at the same time? That should probably work even now. Even if two annotations work, it would be good to have an integrated solution that at least does not generate very similar bytecode twice, or that even allows generating a Parcelable implementation for classes with @Serializable in the same way as @Parcelize does (maybe with a compiler plugin flag). |
@gildor Serialization plugin generates code which is abstract over a storage; one may write the implementation of |
That's a nice and simplified explanation. Thank you cbeust, elizarov. |
@sandwwraith Yes, that's exactly what I meant: it's pretty straightforward to write an encoder for Parcel.
Yes, this can be a problem sometimes; for example, you cannot use such a class, without a Parcelable implementation, as a property of another Parcelable class.
I don't think it's a big problem in general; it can probably also be solved with a meta-annotation, so that one annotation covers both use cases. I just would like to have an integrated solution and the same generated bytecode. |
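As a rough illustration of "code which is abstract over a storage", here is a minimal sketch. All names (`FieldSink`, `writeTo`, `JsonLikeSink`, `PositionalSink`) are hypothetical and are not the kotlinx.serialization API; `PositionalSink` is only a stand-in for a Parcel-style container, since the real Android `Parcel` is not available outside Android.

```kotlin
// Hypothetical storage-agnostic sink; NOT the kotlinx.serialization encoder interface.
interface FieldSink {
    fun putString(name: String, value: String)
    fun putInt(name: String, value: Int)
}

data class User(val name: String, val age: Int) {
    // The kind of code a plugin might generate once: it only describes the structure.
    fun writeTo(sink: FieldSink) {
        sink.putString("name", name)
        sink.putInt("age", age)
    }
}

// One storage: a JSON-like text form that keeps the field names.
class JsonLikeSink : FieldSink {
    private val parts = mutableListOf<String>()
    override fun putString(name: String, value: String) { parts += "\"$name\":\"$value\"" }
    override fun putInt(name: String, value: Int) { parts += "\"$name\":$value" }
    fun result(): String = parts.joinToString(",", "{", "}")
}

// Another storage: a positional, Parcel-style container that keeps values only, in write order.
class PositionalSink : FieldSink {
    val values = mutableListOf<Any>()
    override fun putString(name: String, value: String) { values += value }
    override fun putInt(name: String, value: Int) { values += value }
}

fun main() {
    val user = User("Ann", 30)
    val json = JsonLikeSink().also { user.writeTo(it) }
    val positional = PositionalSink().also { user.writeTo(it) }
    println(json.result())
    println(positional.values)
}
```

The same `writeTo` body serves both storages, which is the property that would let one annotation cover both the wire format and the Parcel-style case.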
Implementing If the private methods are generated for the |
@LouisCAD
Yes, but cannot do that for classes annotated with |
Is this necessary for interop with some existing Android APIs? Because that should not be a problem in most use cases – just use everywhere @LouisCAD , |
Scanned the document and couldn't find this (but maybe I missed how to do it other way): List all available KClass with Use-case: Having a real-time communication client with arbitrary messages. For example: a game that uses websockets for a bidirectional communication. This communication channel has arbitrary messages. To differentiate messages, there is a wrapper Right now there is a workaround that is to keep a list of |
Nope, I don't know of such APIs. Another problem is that you need some method that converts an object to a Parcel/Bundle in order to put it into a Bundle, and the same to read it back. If you decide that those points are not critical and use The solution when Serializable used but Parcelize is the best IMO, because it covers all existing use cases of Parcelize and is backward compatible, plus it allows using different encoders/decoders as a bonus. |
@soywiz I think that approach with 'global map' is quite ad-hoc and for different message types, you probably want to use a polymorphic serialization, discussed in the 'appendix' section. There was mentioned a |
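For the websocket use case, a closed message hierarchy with a type discriminator can replace a hand-maintained global list of classes. The sketch below is plain Kotlin under assumptions: `Message`, `encode`, `decode`, and the `type|payload` wire form are all hypothetical, and this is not how kotlinx.serialization implements polymorphic serialization.

```kotlin
// Hypothetical closed set of messages; the compiler knows every subclass of a sealed class.
sealed class Message {
    data class Chat(val text: String) : Message()
    data class Move(val x: Int, val y: Int) : Message()
}

// Write a discriminator first, then the payload. The `when` is exhaustive,
// so adding a new subclass without handling it is a compile error.
fun encode(message: Message): String = when (message) {
    is Message.Chat -> "chat|${message.text}"
    is Message.Move -> "move|${message.x},${message.y}"
}

// Read the discriminator back and pick the matching subclass.
fun decode(wire: String): Message {
    val (type, payload) = wire.split("|", limit = 2)
    return when (type) {
        "chat" -> Message.Chat(payload)
        "move" -> {
            val (x, y) = payload.split(",").map { it.toInt() }
            Message.Move(x, y)
        }
        else -> throw IllegalArgumentException("Unknown message type: $type")
    }
}

fun main() {
    val wire = encode(Message.Move(3, 4))
    println(wire)
    println(decode(wire))
}
```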
I've read the document and have to say that it is overall quite comprehensive and seems to have taken into account many considerations. I found one thing that I feel is an error. In the While the current interface is implementable and can provide the same information, it is more cumbersome to access and requires the instantiation of "nullable descriptor" variants for all classes. Ensuring that you only create single copies of these nullable variants is cumbersome. |
@pdvrieze The idea behind this is that, while optionality is an attribute of a property solely inside the structure, nullability is an attribute of an unbound type (use-site, indeed). A particular serializer works with a type; therefore, support for nullability is an attribute of the serializer, expressed in its descriptor. By default, serializers work with non-nullable types; special Also, root-type (where you start serialization or schema writing) can be nullable, but can't be optional. Replacing |
Hi guys, I'd like to suggest a better name for "Serialization", for me it should be Wire. I mean, something like
I took this name from the great Chronicle-Wire project. I've written my minimal "wire" library (with similar serialization-encoding separation) and this name is a pleasure to use in application code. |
@sandwwraith I see the point about the nullability being a property of the serializer. It requires the usage of the NullableSerializer "wrapper" not used currently, but that would also avoid the serializer copy issue. Based on my experiments it seems that using the wrappers provides for an overall better architecture with much less duplication. In that sense, perhaps the |
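A minimal sketch of the wrapper approach discussed here, under assumptions: `TextSerializer`, `IntTextSerializer`, and `NullableWrapper` are hypothetical, simplified shapes and not the actual kotlinx.serialization interfaces. The point is only that a single non-nullable serializer per type is enough, with nullability added at the use site.

```kotlin
// Hypothetical, simplified serializer shape; not the kotlinx.serialization interfaces.
interface TextSerializer<T> {
    val isNullable: Boolean
    fun write(value: T): String
    fun read(text: String): T
}

// By default a serializer handles the non-nullable type only.
object IntTextSerializer : TextSerializer<Int> {
    override val isNullable = false
    override fun write(value: Int): String = value.toString()
    override fun read(text: String): Int = text.toInt()
}

// Wrapper that adds null handling on top of any non-nullable serializer,
// so no per-class "nullable variant" has to be created or cached.
class NullableWrapper<T : Any>(private val inner: TextSerializer<T>) : TextSerializer<T?> {
    override val isNullable = true
    override fun write(value: T?): String = if (value == null) "null" else inner.write(value)
    override fun read(text: String): T? = if (text == "null") null else inner.read(text)
}

fun main() {
    val nullableInt = NullableWrapper(IntTextSerializer)
    println(nullableInt.write(null))
    println(nullableInt.read("7"))
}
```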
For your information, I've actually done a port of the new architecture on top of the existing code. I've used that to refactor my xml serialization library (https://github.com/pdvrieze/xmlutil). It actually works quite well and keeps things a lot more rational. The port has some smaller warts (things that cannot be determined with the old compiler plugin - missing info is missing info). One big advantage is that there is normally sufficient information to actually customize the serialization/deserialization as needed by the format. |
Aside from specific API issues, I believe that this issue is more general than simply being about serialisation. Because, in languages such as Java, JavaScript, and Kotlin, objects construct a graph, not a tree. But to serialise them, we need a tree. (I can imagine there are other use cases, e.g. to display an object and its parts - or is that just a kind of serialisation to a screen!) Because programmers don't want to think about 'by-value' or 'by-reference'. (Like we used to have to do when all we had was C++.) But to convert an object graph into a tree, we have to do one of two things, either
The use of annotations as suggested, is kind of making the developer think about containment. But I think it is a flawed approach, or at least has significant problems.
I would like to propose/suggest/require the following things,
I.e for point three I can see different options, depending whether or not backwards compatibility a) Introduce new key words for containment by reference, i.e. One could then get the compiler to warn (Or error) about by value cycles. b) Have an additional keyword, I'm not sure whether a default as composite or reference makes most sense, |
@dhakehurst It is already possible to do this. You would have to use a custom encoder. Depending on your implementation, it may need to be multi-pass. Basically, it is an encoder that delegates to another encoder.
It may be worthwhile to create a set of standard encoders/decoders for this (and other) purposes, but they wouldn't be part of the core serialization library. (Other options would be serialization based equals and hashcode implementations). |
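One way such a graph-aware encoder could behave is sketched below, in a single pass. This is an assumption-laden illustration: `Node`, `writeGraph`, and the `node(...)`/`ref(...)` text form are hypothetical, and a real implementation would delegate to an underlying format encoder instead of building a string.

```kotlin
import java.util.IdentityHashMap

// Hypothetical graph node that may take part in cycles.
class Node(val name: String) {
    val links = mutableListOf<Node>()
}

// First visit of an object emits its content and assigns it an id;
// any later visit emits only a back-reference, which turns the graph into a tree.
fun writeGraph(root: Node): String {
    val ids = IdentityHashMap<Node, Int>() // identity, not equals(), decides "same object"
    val out = StringBuilder()

    fun visit(node: Node) {
        val known = ids[node]
        if (known != null) {
            out.append("ref(").append(known).append(")")
            return
        }
        val id = ids.size
        ids[node] = id
        out.append("node(").append(id).append(",").append(node.name).append(",[")
        node.links.forEachIndexed { index, child ->
            if (index > 0) out.append(",")
            visit(child)
        }
        out.append("])")
    }

    visit(root)
    return out.toString()
}

fun main() {
    val a = Node("a")
    val b = Node("b")
    a.links += b
    b.links += a // cycle: a -> b -> a
    println(writeGraph(a))
}
```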
@dhakehurst I agree that modern programmers don't want to think about 'by-value' or 'by-reference', and they shouldn't have to, thanks to the GC, which can collect cycles, and the fact that almost everything in a Java (Kotlin) program is a reference. Therefore, it would be harmful to distinguish between 'reference' and 'composite' members. I disagree about removing annotation support and effectively marking every class as serializable. First of all, there are requirements for a class to be serializable. Secondly, classes can encapsulate internal data/state that should not be visible to external clients, or such classes can be an implementation detail entirely. Thirdly, there is a security concern: if a malicious client can get our serialized state, then what? For using and serializing third-party libraries, there is the concept of an external serializer, which does not break encapsulation, since it uses only the class's public API. Such serializers are already supported in the framework. Regarding the mentioned problem with circular references and building a tree from the object graph: we've intentionally left it out of scope. Popular formats like JSON do not have a standard representation for such references. At some point in the future, we can probably come up with an internal Kotlin serialization format (like Java Serialization) which can support such references. I believe this is only a matter of the correct encoder/decoder, as @pdvrieze suggests. |
@dhakehurst: I agree with the goal of avoiding annotations. This seems related to schema saving: if a set of SerialDescriptors can itself be serialized and saved (and loaded for future use), then what capability does annotation enable that isn't supported by the schema? Runtime_usage says: That seems backwards - a Protobuf (or Thrift, or ASN.1, or Avro, or ... ) schema has tags; they are not unique to Protobuf serialization, and they are also useful for JSON and CBOR. Tags should be an intrinsic part of SerialDescriptor, not a format-specific annotation. Ordered types (list) have element positions, but unordered types (map, enum) do not.
You might want to serialize instances of this type as the index/tag instead of the name even in JSON, and in CBOR you'd always want to serialize simple map keys as tags instead of names. (Note: you mentioned uni-directional graphs, but minimal spanning trees are for un-directed graphs. The equivalent algorithm for directed graphs is: https://en.wikipedia.org/wiki/Edmonds%27_algorithm.) @sandwwraith: perhaps I'm missing something, but if a schema is autogenerated from a class, then it is serializable - if a class is not serializable then a schema cannot be autogenerated from it. Eliminating annotation decorators does not require every class to be serializable, but it might require setters that operate on SerialDescriptors, and/or the ability to delete SerialDescriptors for classes for which a schema can be autogenerated but attempts to serialize should for policy reasons throw an error. |
There are a lot of good ideas in your proposal, but I don't understand why you would want to rely on annotations on the target classes. You say that your proposal is serialization-format agnostic, but I don't see how (or I don't understand it). As others like @dhakehurst and @davaya pointed out, annotations do not really help us, here. You cannot add Knowing that a class can be serialized is format-agnostic. It just means that it does not contain any non-transient (and not marked so) field like a database connection. That's why the JVM solution of having a A class is typically unaware of the different formats it can be serialized to or from, so specializations should definitely belong to the serializers themselves. That's why here the JVM choice of the Basic data types, along with data classes, would already implement the Serializable interface. Standard containers should also do so, but I did not further investigate if/how the compiler magic would help us detect non-seriability problems at compile time, and the relationship to co/contra-variance (the idea would be to dispense us of the need to declare things like Now for transient fields, I guess we could either have a Also, since extension functions are statically resolved, we would much better have the |
@arkanovicz, my two cents here. These two seem to contradict each other:
The current framework allows you to declare external serializer to library classes which you cannot modify, but the idea with marker interface breaks the possibility to attach serializer entirely.
No, it doesn't.
They do in
This is where
Also present in |
@r4zzz4k Thanks for those clarifications. Is this KEEP still pertinent, then? |
@arkanovicz I'm not sure, let's wait for the clarification by @elizarov. Regardless of that, please check quite extensive |
@arkanovicz This KEEP should match the 1.0 implementation of kotlinx.serialization framework. Can you spot any contradictions? |
This is an issue to discuss the proposal for the Kotlin serialization compiler plugin.