Pickler derivation #3134

Merged: 52 commits merged into master from feature/pickler-derivation on Sep 19, 2023

Conversation

@kciesielski kciesielski commented Aug 24, 2023

  • Basic Pickler derivation + autoderivation (see the usage sketch after this list)
  • Support toEncodedName from global derivation Configuration (snake case, etc)
  • Support discriminator (key) from Configuration
  • Support toDiscriminatorValue from Configuration
  • Support encoding discriminator with oneOfUsingField, similar to Schema.oneOfUsingField
  • Support providing Pickler[Option[T]]
    • serialize None as null - same as default behavior in Circe
  • Support arrays, iterables, maps, eithers, etc.
    • Either[A, B] is encoded as [0, valueA] or [1, valueB]
  • Support generating nice Writers/Readers for case classes extending AnyVal
  • Support encoding discriminator with oneOfWrapped (btw this is how Circe encodes by default)
  • Support simple derivation for enums
  • Support derivation for enums with custom value encoding
    • Only as String, as uPickle's limitations are pretty hard to overcome here. It may be quite difficult to achieve an "enumValue": 3 type of encoding; the best we can get for now is "enumValue": "3".
  • Support first-successful deserialization strategy
  • Support per-field encoded name, read from schema @encodedName annotation
  • Support per-field default value, read from schema @default annotation
  • Prioritize @default over case class defaults; also, always serialize fields, even if they are equal to default values
  • Support for @customise annotation on case classes (SchemaDerivation macro does not support this ATM)
  • Support for @description annotation on case classes
    // -------------- TODO
  • Better errors (https://github.com/softwaremill/tapir/pull/3134/files#r1324224536)
  • Possibility to skip nulls (representing Option.None) entirely
  • Support per-field encoded name, read from a dynamic config provided to the pickler
  • Support per-field default value, read from a dynamic config provided to the pickler
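
A minimal usage sketch of the derived pickler (illustrative only: the Book case class is made up and the import path is an assumption; Pickler.derived, toCodec and encode mirror the test snippets later in this thread):

// Illustrative sketch; package name assumed for the new module
import sttp.tapir.json.pickler.*

case class Book(title: String, year: Int)

// Semi-automatic derivation: a Pickler bundles the uPickle ReadWriter with the tapir Schema
val bookPickler: Pickler[Book] = Pickler.derived[Book]

// The pickler can be converted to a tapir JSON codec and used directly...
val codec = bookPickler.toCodec
val json = codec.encode(Book("Programming in Scala", 2016))
// expected: {"title":"Programming in Scala","year":2016}

// ...or, brought into scope as a given, it backs jsonBody[Book] in endpoint definitions.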

kciesielski commented Aug 24, 2023

@adamw Can you take a look at this scaffolding idea for the derivation? It's based on https://docs.scala-lang.org/scala3/reference/contextual/derivation.html#

  • It makes the product / sum case split at the inline def level, and I expect these blocks to be filled with macro calls that generate ReadWriter instances based on summoned child picklers and their individual ReadWriters for fields.
  • The part above can be reused in an optimized implementation, where we derive Schema[T] in a new way. Building the new Schema and the ReadWriter (using child schemas and ReadWriters) will then be combined into a single derivation.
  • Ignore the DefaultReadWriterWrapper; it's a leftover from my experiments with overriding uPickle and will probably go away.
  • If we want excellent error messages, then using summonFrom or summonInline may not be the way to go? Expr.summon[] in macros returns an Option and allows catching missing implicits, so maybe we should go this route from the very beginning?

adamw commented Aug 24, 2023

> It makes the product / sum case split at the inline def level, and I expect these blocks to be filled with macro calls that generate ReadWriter instances based on summoned child picklers and their individual ReadWriters for fields.

Yes, that's the idea, hopefully it will work ;)

> The part above can be reused in an optimized implementation, where we derive Schema[T] in a new way. Building the new Schema and the ReadWriter (using child schemas and ReadWriters) will then be combined into a single derivation.

I guess let's first make a working version, then optimize ;) It's tricky esp with recursion

> If we want excellent error messages, then using summonFrom or summonInline may not be the way to go? Expr.summon[] in macros returns an Option and allows catching missing implicits, so maybe we should go this route from the very beginning?

But you still do an inline match on the result, so you know when summoning fails and when it succeeds - then in the non-product & non-coproduct & summon-failed case, you can just do an inline compilation error? https://docs.scala-lang.org/scala3/guides/macros/compiletime.html#reporting
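
For concreteness, a rough sketch of both approaches (not the code in this PR; the helper names are made up):

import scala.compiletime.{summonFrom, error}
import scala.quoted.*

// Inline variant: an inline match on the summoning result, with a constant
// compile-time error message in the failure branch
inline def summonOrFail[T]: Pickler[T] =
  summonFrom {
    case p: Pickler[T] => p
    case _             => error("Cannot derive a Pickler: no given Pickler instance found.")
  }

// Macro variant: Expr.summon returns an Option, so a missing instance can be
// reported with a type-specific message via report.errorAndAbort
def summonOrFailImpl[T: Type](using Quotes): Expr[Pickler[T]] =
  import quotes.reflect.report
  Expr.summon[Pickler[T]].getOrElse(
    report.errorAndAbort(s"Cannot derive a Pickler: no given Pickler[${Type.show[T]}] found.")
  )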

@kciesielski kciesielski linked an issue Sep 4, 2023 that may be closed by this pull request
@kciesielski kciesielski marked this pull request as ready for review September 12, 2023 14:42
@kciesielski kciesielski requested a review from adamw September 12, 2023 14:42
@kciesielski kciesielski changed the title from "[WIP] Pickler derivation" to "Pickler derivation" on Sep 15, 2023
@@ -15,12 +15,12 @@ fields, or all of the implementations of the `enum`/`sealed trait`/`sealed class

Two policies of custom type derivation are available:

* automatic derivation

Member

that's some auto-formatting? :)

Member Author

Oh, yeah, it's probably my IDE that tries to force something here. I'll revert it.

import scala.reflect.ClassTag

/**
* A modification of upickle.implicits.Writers, implemented in order to provide our custom JSON encoding and typeclass derivation logic:

Member

[minor] is this formatted using scalafmt? seems not, as usually long scaladocs are nicely wrapped

Member Author

scalafmt rewrites the points into one single block:

/** A modification of upickle.implicits.Writers, implemented in order to provide our custom JSON encoding and typeclass derivation logic:
  *   1. A CaseClassWriter[T] is built based on writers for child fields passed as an argument, instead of just summoning these writers.
  *      This allows us to operate on Picklers and use Writers extracted from these Picklers. Summoning is now done on Pickler, not Writer
  *      level. 2. Default values can be passed as parameters, which are read from Schema annotations if present. Vanilla uPickle reads
  *      defaults only from case class defaults. 3. Subtype discriminator can be passed as a parameter, allowing specifying custom key for
  *      discriminator field, as well as function for extracting discriminator value 4. Schema is passed as a parameter, so that we can use
  *      its encodedName to transform field keys 5. Configuration can be used for setting discriminator field name or encoding all field
  *      names according to custom function (allowing transformations like snake_case, etc.)
  */

I'm not sure if I like it.

Member

You have to use proper scaladoc formatting: https://docs.scala-lang.org/overviews/scaladoc/for-library-authors.html. The markup for list blocks looks like:
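
(Reproduced from memory rather than copied from the linked page, so treat as approximate: list items go on their own indented lines, with wrapped text aligned under the marker, e.g.)

/** Doc comment with an ordered list block:
  *
  *   1. First point, which can wrap
  *      onto the next line when aligned under the marker
  *   1. Second point
  *   1. Third point
  */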

Member Author

Ah thanks, this works. Should have googled it of course.

case p: Mirror.ProductOf[T] => picklerProduct(p, childPicklers)
case _: Mirror.SumOf[T] =>
  val schema: Schema[T] =
    inline if (isScalaEnum[T])

Member

Is this enough? That is, won't this also catch Scala 3 enums which have parameters, in which case they desugar to a sealed trait hierarchy?

Member Author

This indeed catches enums with fields. This means that:

    // given
    enum RichColorEnum(val code: Int):
      case Cyan extends RichColorEnum(3)
      case Magenta extends RichColorEnum(18)

    case class RichColorResponse(color: RichColorEnum)

    // when
    val picklerResponse = Pickler.derived[RichColorResponse]
    val codec = picklerResponse.toCodec
    val inputObj = RichColorResponse(RichColorEnum.Cyan)
    val encoded = codec.encode(inputObj)

    // then
    encoded shouldBe """{"color":"Cyan"}"""

Would you rather expect the code field to be encoded as well? If someone used an enum to represent this, then I would expect it to be treated as an enumeration with simple values in the schema and in the JSON model.

Member

Here - no, as the enum cases themselves are not parametrised. But you can have

enum Entity:
  case Person(first: String, age: Int)
  case Business(address: String)

which desugars to a sealed trait + case class hierarchy. In essence, Scala 3 enums have two flavors: when all cases are parameterless, you have an enumeration of values, and you can represent it using just the label (also, the compiler won't generate a separate type for such cases). Or, if you have parameters, then this is syntax sugar for a sealed trait + case classes. The two need to be treated differently: the first should be handled using derivedEnumeration, the second as a coproduct (sum type), using inheritance strategies etc.
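
(Roughly, and only for illustration, the parametrised enum above corresponds to a hand-written hierarchy like the following; this is not the exact compiler output.)

sealed trait Entity
object Entity:
  case class Person(first: String, age: Int) extends Entity
  case class Business(address: String) extends Entity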

I think we have this kind of check somewhere in the code, maybe in Schema.derivedEnumeration? For sure it mentions parameterlessness in the docs, not sure if there's an actual check.

Member

Btw, it would be great to have a test case for each of those (at least 3: very simple enums, parameterless enums with a parametrised base type, parametrised enums). The existing schema tests might be messy, so maybe it would make sense to have some separate tests as well.

Member Author

Interesting, thanks for explaining. I don't like this idea of allowing sealed hierarchies to be written as enums; it seems like unnecessary facilitation of an already simple thing, and it makes it unclear whether something is an enum or "not really an enum". There's indeed a nice check you've implemented in Validator.derivedEnumeration[T] which fails compilation for such cases.
I can add this check to our isScalaEnum utility and return false, which would make the logic go through a standard product path and produce

{ "$type": "Business", "address": "221B Baker Street" }

Member

Well, that's how enums work; I think they are designed to provide a more succinct syntax for sealed trait hierarchies in the first place, and act as enumerations of values as an addition. But either way we need to support both :)

Member Author

This should do the trick then: f5879e7

inline if (isScalaEnum[T])
  Schema.derivedEnumeration[T].defaultStringBased
else
  Schema.derived[T]

Member

it's a bit suspicious that products & coproducts are handled differently ;) shouldn't we use subtypeDiscriminator when deriving coproduct schemas as well?

Member Author

This is indeed unclear. It looks like I'm using the subtype discriminator for both products and sums, while it's unnecessary for products. Additionally, this method is called only in one place, with DefaultSubtypeDiscriminator, so it's also an unneeded parameter on this level.
We use discriminator only when it's "non-default", and it goes through another path, not buildNewPickler. This will hopefully become clearer when I remove the discriminator parameter from buildNewPickler. Just pushed.

@adamw adamw merged commit 373815c into master Sep 19, 2023
@mergify mergify bot deleted the feature/pickler-derivation branch September 19, 2023 22:56

adamw commented Sep 19, 2023

Thanks! It was a lot of work but great to have the first step in the pickler implementation done :)

* An in-scope pickler instance is required by [[jsonBody]] (and its variants), but it can also be manually converted to a codec using
* [[Pickler.toCodec]].
*/
@implicitNotFound(msg = """Could not summon a Pickler for type ${T}.

Member Author

TIL that you can refer to type like this 😮
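
(For context, ${T} interpolation in @implicitNotFound messages is a standard Scala feature; a minimal standalone sketch, with JsonSupport as a made-up trait:)

import scala.annotation.implicitNotFound

// ${T} in the message is replaced with the concrete type at the error site,
// e.g. "No JsonSupport available for List[Int]; ..."
@implicitNotFound("No JsonSupport available for ${T}; did you import or derive one?")
trait JsonSupport[T]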

Successfully merging this pull request may close these issues:

Module for JSON codec derivation

2 participants