Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union deserialiser cannot handle the internal tag not being the first property #93

Closed
Happypig375 opened this issue Feb 23, 2021 · 14 comments · Fixed by #95
Closed

Union deserialiser cannot handle the internal tag not being the first property #93

Happypig375 opened this issue Feb 23, 2021 · 14 comments · Fixed by #95

Comments

@Happypig375
Copy link
Contributor

{"CommonProp":1,"Type":"4","Type4Prop":2}
type [<JsonFSharpConverter(JsonUnionEncoding.InternalTag ||| JsonUnionEncoding.UnwrapRecordCases, unionTagName="Type")>] DU =
| [<JsonPropertyName "3">] T3 of {| CommonProp:int; Type3Prop:string |}
| [<JsonPropertyName "4">] T4 of {| CommonProp:int; Type4Prop:int |}
System.Text.Json.JsonException: Failed to parse type DU: expected "Type", found PropertyName
@xperiandri
Copy link
Contributor

Same issue

@xperiandri
Copy link
Contributor

This is how I usually implement the reader

override __.Read (reader, _, _) =
    if reader.TokenType <> JsonTokenType.StartObject then raise <| JsonException ()
    let properties = Dictionary<string, obj>(4)
    reader.Read () |> ignore
    while reader.TokenType <> JsonTokenType.EndObject do
        let propertyName = reader.GetString ()
        reader.Read () |> ignore
        match propertyName with
        | "type" -> properties.Add (propertyName, reader.GetString () |> box)
        | "code" -> properties.Add (propertyName, reader.GetInt64 () |> box)
        | "description" -> properties.Add (propertyName, reader.GetString () |> box)
        | "imageUrl" -> properties.Add (propertyName, reader.GetString () |> box)
        | _ -> ()
        reader.Read () |> ignore

    match properties.TryGetValue "type" with
    | true, value ->
        match value with
        | :? string as value when value = "Dana" ->
            let code = Dana.create (properties.Item ("code") :?> int64)
            let description = properties.TryFind ("description") |> ValueOption.mapOption (fun d -> d :?> string)
            let imageUrl = properties.TryFind ("imageUrl") |> ValueOption.mapOption (fun d -> d :?> string |> Url.ofStringOrFail)
            DanaCode (code, description, imageUrl)
        | :? string as value when value.StartsWith "ISBN" ->
            let isbn =
                match value with
                | "ISBN10" -> (properties.Item ("code") :?> int64).ToString ("D10")
                | "ISBN13" -> (properties.Item ("code") :?> int64).ToString ("D13")
                | _ -> raise <| NotSupportedException ()
                |> ISBN.createOrFail
            let description = properties.TryFind ("description") |> ValueOption.mapOption (fun d -> d :?> string)
            let imageUrl = properties.TryFind ("imageUrl") |> ValueOption.mapOption (fun d -> d :?> string |> Url.ofStringOrFail)
            ISBNCode (isbn, description, imageUrl)
        | _ -> raise <| NotSupportedException ()
    | _, _ -> raise <| NotSupportedException ()

@xperiandri
Copy link
Contributor

What you can do for the general case is to read the whole object into a JsonDocument

let readObjectAsString (reader: byref<Utf8JsonReader>) =
    use document = JsonDocument.ParseValue (&reader)

Get the discriminator from it and then deserialize into a right type

@xperiandri
Copy link
Contributor

@Tarmil, what do you think about this option?

@Tarmil
Copy link
Owner

Tarmil commented Mar 30, 2021

This code makes an assumption that is unfortunately not true in general; namely, that a field with a given name always has the same type. You couldn't write a parser in this style for a type like this:

type DU =
  | A of x: T1
  | B of x: T2

because if you find an "x" before the tag field, you don't know whether to read it as a T1 or a T2.

Now, that being said, we could figure out while constructing the JsonConverter whether a union type is readable out of order or not. If it is, then in the parser, if the first field isn't the tag, then we can switch to an alternate implementation.

JsonDocument isn't an option unfortunately, because we can't read a field with its appropriate JsonConverter from a JsonElement.

@xperiandri
Copy link
Contributor

But you can put that JsonDocoment into a MemoryStream and return back to ability to use converters

@Happypig375
Copy link
Contributor Author

Here is my workaround:

    // https://github.com/Tarmil/FSharp.SystemTextJson/issues/93
    type [<Struct; JsonConverter(typeof<Issue93QuickFixConverter>)>] Issue93QuickFix<'T>(quickFix:'T) =
        member _.QuickFix = quickFix
        override _.ToString() = sprintf "Issue93QuickFix %+A" quickFix
    and Issue93QuickFixConverter<'T>() =
        inherit JsonConverter<Issue93QuickFix<'T>>()
        let nameToLookFor =
            typeof<'T>.GetCustomAttributes(typeof<JsonFSharpConverterAttribute>, true) |> function
            | [||] -> "Case"
            | attributes ->
            attributes.[0] :?> JsonFSharpConverterAttribute
            |> typeof<JsonFSharpConverterAttribute>.GetField("fsOptions", Reflection.BindingFlags.NonPublic ||| Reflection.BindingFlags.Instance).GetValue
            :?> JsonFSharpOptions
            |> fun x -> x.UnionTagName
        override _.Read(reader, _, options) =
            let originalCount = reader.BytesConsumed
            let d = JsonDocument.ParseValue &reader
            let mutable v = Unchecked.defaultof<_>
            for p in d.RootElement.EnumerateObject() do if p.Name = nameToLookFor then v <- p
            if v = Unchecked.defaultof<_> then failwithf "Union tag name %s not found." nameToLookFor
            use stream = new IO.MemoryStream(reader.BytesConsumed - originalCount |> int)
            use w = new Utf8JsonWriter(stream)
            w.WriteStartObject()
            v.WriteTo w
            for p in d.RootElement.EnumerateObject() do if p.Name <> nameToLookFor then p.WriteTo w
            w.WriteEndObject()
            w.Flush()
            Issue93QuickFix <| JsonSerializer.Deserialize<'T>(ReadOnlySpan<_>(stream.GetBuffer(), 0, int stream.Length), options)

@xperiandri
Copy link
Contributor

xperiandri commented Mar 30, 2021

@Happypig375

let discriminator =
    match d.RootElement.TryGetValue nameToLookFor with
    | true, value -> 
    | false, _ -> failwithf "Union tag name %s not found." nameToLookFor

And document has a method to write it into a stream.
No need to enumerate an object at all.
And deserialize allows passing a type as 2nd parameter instead of <'T>.

@Happypig375
Copy link
Contributor Author

The document method is not applicable here since we don't want to write the tag twice, so there is a filter here. TryGetValue may help simplify the code though.

@Tarmil
Copy link
Owner

Tarmil commented Apr 1, 2021

The JsonDocument approach is sure to be a big performance hit, since it means parsing, writing and re-parsing the object. So I'd rather make it an explicit choice from the user. I think a good approach would be:

  • If the first field is the tag, then use the current reader.
  • Else if the type has no ambiguities (ie fields with same name and different types), then use @xperiandri's first proposed algorithm.
  • Else if a new JsonUnionEncoding flag is set then use JsonDocument.
  • Else fail.

@xperiandri
Copy link
Contributor

xperiandri commented Apr 1, 2021

I suppose that you can avoid memory copying by using slices. That way there will be no performance hit, only a lookup of a union tag

@Tarmil
Copy link
Owner

Tarmil commented Apr 2, 2021

Looking up the tag in the JsonDocument isn't the performance worry; creating the JsonDocument, then writing it, then Deserializing the written result is.

@xperiandri
Copy link
Contributor

So my proposal is:
From Utf8JsonReader you get the System.Buffers.ReadOnlySequence<byte> https://docs.microsoft.com/en-us/dotnet/api/system.text.json.utf8jsonreader.valuesequence?view=net-5.0#System_Text_Json_Utf8JsonReader_ValueSequence
That sequence you parse with https://docs.microsoft.com/en-us/dotnet/api/system.text.json.jsondocument.parse?view=net-5.0

This way no copy happens

@xperiandri
Copy link
Contributor

@Tarmil could you try my solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants