Developers' apps using JSON serialization start up and run faster #1568
Comments
In theory, any startup-only reflection/delegate initialization can be done AOT. Popular scenarios include:
Please consider building some infrastructure to let libraries provide AOT generation. |
The existing design depends on either manual storage of the However there is a first-time perf hit of initializing the options for each new Type encountered; this involves using reflection to look up the properties and various attributes. See issue #1562, which could help facilitate custom converters per POCO type and collection type; for performance, these will likely be generated IL (run-time or ahead-of-time) and/or Roslyn-generated source, pending requirements/design. This wouldn't require the reflection hit. |
@mconnew again thank you for insights and experience here. One note as @layomia also pointed out is that the current code-gen is not about generating self-contained "serializers" but about generating metadata and callbacks:
This achieves the primary goals of fast startup and minimizing private bytes, both achieved by avoiding reflection and reflection emit. A secondary goal of increased throughput is met when the generated callbacks for serialize()/deserialize() can be used. Another secondary goal is to support the ILLinker in reducing the size of STJ.dll.
The current design and constraints of Roslyn source generators mean:
The new "context class" programming model is pay-to-play, meaning generated code is directly called at run-time for each type, and thus only that code is JITTed. If there are 1,000 generated types, for example, only the ones accessed at run-time (by calling the appropriate member on the context class) should be JITTed (along with any dependent generated types). This lazy JIT assumption should of course be verified. |
Can we update the issue description to make sure this item tracks both perf improvements and trimming? IOW, we need to make it clear that the path towards making JSON serializable types trimmable is via source generation. |
Thanks @terrajobst. I've added notes about goals to facilitate more trimming (removing unused converters, reflection code-paths) and be linker friendly (due to avoiding run-time reflection) and action items (#36782, https://github.com/dotnet/runtimelab/projects/1#card-49468644). |
This issue was originally created to track multiple goals achievable with AOT source generation including:
I created #45441 to track the user story "Developers can safely trim their apps which use System.Text.Json to reduce the size of their apps" which depends on the work in this issue. |
@layomia we're trying to title all our User Stories in terms of customer benefit (WHO gets WHAT), so we focus on the result we are aiming for. Stories can depend on each other, but the actual work is in the issues parented by the stories. So your title for #45441 is perfect, and I've retitled this story in that format as well. Feel free to adjust. |
@layomia looking a little more here, I think this user story is missing the child issues that encompass the work required to achieve it. We should have issues for the various parts of the source generator work -- I assume you have an idea what those parts are, you will want to create them at some point and parent them under this story. I suggest something like
does that seem reasonable? |
Thanks @danmosemsft that makes sense. I created #45448 (to be further fleshed out) to track the source generation work items which should satisfy these user stories
|
With the JSON source generator checked in, we can consider this work done. Please see "Try the new System.Text.Json source generator" for an end-to-end overview of how the generator works and its benefits. |
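For readers who want a concrete picture, here is a minimal sketch of what consuming the checked-in generator can look like. It assumes the API shape that shipped in .NET 6, where JsonSerializableAttribute is applied to a partial class deriving from JsonSerializerContext; this differs from the attribute-on-type pattern described in the original proposal text further down. WeatherForecast and AppJsonContext are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical POCO used only for this sketch.
public class WeatherForecast
{
    public int TemperatureC { get; set; }
    public string Summary { get; set; } = "";
}

// The source generator fills in this partial class at compile time,
// producing a JsonTypeInfo<WeatherForecast> exposed as Default.WeatherForecast.
[JsonSerializable(typeof(WeatherForecast))]
internal partial class AppJsonContext : JsonSerializerContext
{
}

public static class Program
{
    public static void Main()
    {
        var forecast = new WeatherForecast { TemperatureC = 25, Summary = "Hot" };

        // These overloads take the generated metadata instead of
        // building it through reflection on first use.
        string json = JsonSerializer.Serialize(forecast, AppJsonContext.Default.WeatherForecast);
        var roundTripped = JsonSerializer.Deserialize(json, AppJsonContext.Default.WeatherForecast);

        Console.WriteLine(json);
        Console.WriteLine(roundTripped?.Summary);
    }
}
```

Only the metadata for types requested through the context is touched at run-time, which is the pay-to-play behavior discussed earlier in the thread.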
Original proposal by @jkotas
The generation of Json serializers via reflection at runtime has non-trivial startup costs. This has been identified as a bottleneck during prototyping of fast small cloud-first micro-services:
Repro: https://gist.github.com/jkotas/b0671e154791e287c38a627ca81d7197
The Json serializer generated using reflection at runtime has a startup cost of ~30ms. The manually written Json serializer has a startup cost of ~1ms.
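To illustrate the contrast, here is a minimal sketch of a hand-written serializer next to the reflection-based call. This is not the code from the linked gist; Person and ManualPersonSerializer are hypothetical names, and the sketch makes no timing claims of its own.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical POCO used only for this sketch.
public sealed class Person
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public static class ManualPersonSerializer
{
    // Writes the object directly with Utf8JsonWriter, so no reflection
    // over Person's properties is needed at startup.
    public static string Serialize(Person p)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStartObject();
            writer.WriteString("Name", p.Name);
            writer.WriteNumber("Age", p.Age);
            writer.WriteEndObject();
            writer.Flush();
        }
        return Encoding.UTF8.GetString(stream.ToArray());
    }
}

public static class Program
{
    public static void Main()
    {
        var person = new Person { Name = "Ada", Age = 36 };

        // Hand-written path: no per-type metadata is built.
        Console.WriteLine(ManualPersonSerializer.Serialize(person));

        // Reflection path: metadata for Person is built on the first call.
        Console.WriteLine(JsonSerializer.Serialize(person));
    }
}
```

Both calls produce the same JSON for this type; the difference the proposal measures is the one-time cost paid before the first reflection-based call completes.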
Edited by @kevinwkt and @layomia :
Background
There are comprehensive documents detailing the needs and benefits of generating JSON serializers at compile time. Some of these benefits are improved startup time, reduction in private memory usage, faster throughput for serialization and deserialization, and being ILLinker-friendly due to avoiding reflection at run-time. There is also an opportunity to reduce the size of the trimmed System.Text.Json.dll after source generation and linker trimming due to code-paths that use reflection being potentially removed, and also unused built-in JsonConverter<T>s such as Uri, Ulong64, etc.
After discussing some approaches and the pros/cons of some of them, we decided to implement this feature using Roslyn source generators. Implementation details and code/usage examples can be seen in the design document. This document will outline the roadmap for the initial experiment and highlight actionable items.
This project requires numerous API changes and the design is being iterated on which is why we will be using the dotnet/runtimelab repository instead of dotnet/runtime. The main goal of this project is to get something up and running while changing implementation and iterating on public API without committing to dotnet/runtime master. We hope to share the project and get feedback for potential release on .NET 6.0. The project will be consumable through a prerelease package until then. Progress can be tracked through the JSON Code Gen project board in dotnet/runtimelab.
Approach
There are 3 main points in this project: type discovery, source code generation, and generated source code integration (with user applications).
Type discovery
Type discovery can be thought of in two ways, an implicit model (where the user does not have to specify which types to generate code for) and an explicit model (user specifies through code or configuration which types to generate code for).
Various implicit approaches have been discussed, such as source generating for all partial classes or scanning for calls into the serializer using Roslyn syntax trees. These models can be revisited in the future as the value/feasibility of the approach becomes clearer based on user feedback. It is important to note that some downsides to such a model include missing types to generate source for, or generating source for types when not needed, due to a bug or edge cases we didn’t consider.
The proposed approach for type discovery requires an explicit indication of serializable types by the user. This model supports indicating both owned and non-owned types. A new JsonSerializableAttribute will be used to detect these types. There are two patterns for JsonSerializableAttribute. The first consists of applying the attribute on a type that the user owns, and the second consists of the user passing into the constructor of the attribute a non-owned serializable type.
We believe that an explicit model using attributes would be a simple first approach to the problem. Within the Roslyn source generator, we parse the syntax tree to find usages of the JsonSerializableAttribute. The output of this phase would be a list of input types for the generator in order to code-gen recursively for each type in all the object graphs.
Source code generation
The design for the generated source focuses mainly on performance gains and extensibility to existing JsonSerializer functionality. Performance is improved in two ways. The first is first-time/warm-up performance for both CPU and memory, by avoiding costly reflection to build up a Type metadata cache during runtime and moving it to compile time. This type metadata is then represented as JsonTypeInfo classes that can be used for (de)serialization at runtime. The second is throughput improvement by avoiding the initial metadata-dictionary lookup on calls to the serializer by generating an instance of the type’s JsonTypeInfo (metadata). These instances will be passed to new (de)serialize overloads.
We will use the types discovered in the type discovery phase and recurse through the type graph in order to source generate the functions mentioned above within each JsonTypeInfo and register them inside the user-facing wrapper JsonSerializerContext.
Generated source code integration
There are discussions regarding integration of generated metadata source code with user apps. The proposed approach consists of the generator creating a context class (JsonSerializerContext) which takes an options instance and contains references to the generated JsonTypeInfos for each type seen above. This relies on the creation of new overloads to the current serializer mentioned before that can be retrieved from the context. An example of the overload and usage can be seen here, while examples and details of the end to end approach can be seen in the design document.
Action items
Progress of this effort can be observed through the JSON Code Gen project board in dotnet/runtimelab.
The source generator (System.Text.Json.SourceGeneration.dll) and updated System.Text.Json.dll can be consumed via an experimental NuGet package. Issues can be logged at https://github.com/dotnet/runtimelab/issues/new?labels=area-JsonCodeGen with the area-JsonCodeGen label.
cc @jkotas @davidfowl @stephentoub @mjsabby @terrajobst @pranavkm @ericstj @layomia @steveharter @chsienki