Skip to content

Commit

Permalink
start upgrade guide, and add pieces from LegolasFlux upgrade (#71)
Browse files Browse the repository at this point in the history
* start upgrade guide, and add pieces from LegolasFlux upgrade

* Update docs/src/upgrade.md

* Update docs/src/upgrade.md

Co-authored-by: Eric Hanson <[email protected]>

* Update docs/src/upgrade.md

Co-authored-by: Jarrett Revels <[email protected]>

* Apply suggestions from code review

Co-authored-by: Jarrett Revels <[email protected]>

* Update docs/src/upgrade.md

Co-authored-by: Jarrett Revels <[email protected]>
  • Loading branch information
ericphanson and jrevels authored Nov 22, 2022
1 parent 68ffeb3 commit 4d49630
Show file tree
Hide file tree
Showing 2 changed files with 24 additions and 1 deletion.
3 changes: 2 additions & 1 deletion docs/make.jl
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,8 @@ makedocs(modules=[Legolas],
pages=["API Documentation" => "index.md",
"Schema-Related Concepts/Conventions" => "schema-concepts.md",
"Arrow-Related Concepts/Conventions" => "arrow-concepts.md",
"FAQ" => "faq.md"])
"FAQ" => "faq.md",
"Upgrading from v0.4 to v0.5" => "upgrade.md"])

deploydocs(repo="github.com/beacon-biosignals/Legolas.jl.git",
push_preview=true,
Expand Down
22 changes: 22 additions & 0 deletions docs/src/upgrade.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Upgrading from Legolas v0.4 to v0.5

This guide is incomplete; please add to it if you encounter items which would help other upgraders along their journey.

See [here](https://github.com/beacon-biosignals/Legolas.jl/pull/54) for a comprehensive log of changes from Legolas v0.4 to Legolas v0.5.

## Some main changes to be aware of

* In Legolas v0.4, every `Legolas.Row` field's type was available as a type parameter of `Legolas.Row`; for example, the type of a field `y` specified as `y::Real` in a `Legolas.@row` declaration would be surfaced like `Legolas.Row{..., NamedTuple{(...,:y,...),Tuple{...,typeof(y),...}}`. In Legolas v0.5, the schema version author controls which fields have their types surfaced as type parameters in Legolas-generated record types via the `field::(<:F)` syntax in [`@version`](@ref).
* Additionally, to include type parameters associated to fields in a parent schema, they must be re-declared in the child schema. For example, the package LegolasFlux declares a `ModelV1` version with a field `weights::(<:Union{Missing,Weights})`. LegolasFlux includes an [example](https://github.com/beacon-biosignals/LegolasFlux.jl/blob/53c677848c6b65e5158ef2d43dd5f7eab174892e/examples/digits.jl#L78-L80) with a schema extension `DigitsRowV1` which extends `ModelV1`. This `@version` call must re-declare the field `weights` to be parametric in order for the `DigitsRowV1` struct to also have a type parameter for this field.
* In Legolas v0.4, `@row`-generated `Legolas.Row` constructors accepted and propagated any non-schema-required fields provided by the caller. In Legolas v0.5, `@version`-generated record type constructors will discard any non-schema-required fields provided by the caller. When upgrading code that formerly "implicitly extended" a given schema version by propagating non-required fields, it is advisable to instead explicitly declare a new extension of the schema version to capture the propagated fields as required fields; or, if it makes more sense for a given use case, one may instead define a new schema version that adds these propagated fields as required fields directly to the schema (likely declared as `::Union{Missing,T}` to allow them to be missing).


## Deserializing old tables with Legolas v0.5

Generally, tables serialized with earlier versions of Legolas can be de-serialized with Legolas v0.5, making it only a "code-breaking" change, rather than a "data-breaking" change. However, it is strongly suggested to have reference tests with checked in (pre-Legolas v0.5) serialized tables which are deserialized and verified during the tests, in order to be sure.

Additionally, serialized Arrow tables containing nested Legolas-v0.4-defined `Legolas.Row` values (i.e. a table that contains a row that has a field that is, itself, a `Legolas.Row` value, or contains such values) require special handling to deserialize under Legolas v0.5, if you wish users to be able to deserialize them with `Legolas.read` using the Legolas-v0.5-ready version of your package. Note that these tables are still deserializable as plain Arrow tables regardless, so it may not be worthwhile to provide a bespoke deprecation/compatibility pathway in the Legolas-v0.5-ready version package unless your use case merits it (i.e. the impact surface would be high for your package's users).

If you would like to provide such a pathway, though:

Recall that under Legolas v0.4, `@row`-generated `Legolas.Row` constructors may accept and propagate arbitrary non-schema-required fields, whereas Legolas v0.5's `@version`-generated record types may only contain schema-required fields. Therefore, one must decide what to do with any non-required fields present in serialized `Legolas.Row` values upon deserialization. A common approach is to implement a deprecation/compatibility pathway within the relevant surrounding `@version` declaration. For example, [this LegolasFlux example](https://github.com/beacon-biosignals/LegolasFlux.jl/blob/53c677848c6b65e5158ef2d43dd5f7eab174892e/examples/digits.jl#L64-L84) uses a function `compat_config` to handle old `Legolas.Row` values, but does not add any handling for non-required fields, which will be discarded if present. If one did not want non-required fields to be discarded, these fields could be handled by throwing an error or warning, or defining a schema version extension that captured them, or defining a new version of the relevant schema to capture them (e.g. adding a field like `extras::Union{Missing, NamedTuple}`).

0 comments on commit 4d49630

Please sign in to comment.