From 4d49630409ed1f1a5851f8f89533553c79115de8 Mon Sep 17 00:00:00 2001
From: Eric Hanson <5846501+ericphanson@users.noreply.github.com>
Date: Tue, 22 Nov 2022 18:35:28 +0100
Subject: [PATCH] start upgrade guide, and add pieces from LegolasFlux upgrade (#71)

* start upgrade guide, and add pieces from LegolasFlux upgrade

* Update docs/src/upgrade.md

* Update docs/src/upgrade.md

Co-authored-by: Eric Hanson <5846501+ericphanson@users.noreply.github.com>

* Update docs/src/upgrade.md

Co-authored-by: Jarrett Revels

* Apply suggestions from code review

Co-authored-by: Jarrett Revels

* Update docs/src/upgrade.md

Co-authored-by: Jarrett Revels
---
 docs/make.jl        |  3 ++-
 docs/src/upgrade.md | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 docs/src/upgrade.md

diff --git a/docs/make.jl b/docs/make.jl
index da57a99..6da8037 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -7,7 +7,8 @@ makedocs(modules=[Legolas],
          pages=["API Documentation" => "index.md",
                 "Schema-Related Concepts/Conventions" => "schema-concepts.md",
                 "Arrow-Related Concepts/Conventions" => "arrow-concepts.md",
-                "FAQ" => "faq.md"])
+                "FAQ" => "faq.md",
+                "Upgrading from v0.4 to v0.5" => "upgrade.md"])
 
 deploydocs(repo="github.com/beacon-biosignals/Legolas.jl.git",
            push_preview=true,
diff --git a/docs/src/upgrade.md b/docs/src/upgrade.md
new file mode 100644
index 0000000..c137b61
--- /dev/null
+++ b/docs/src/upgrade.md
@@ -0,0 +1,22 @@
+# Upgrading from Legolas v0.4 to v0.5
+
+This guide is incomplete; please add to it if you encounter items which would help other upgraders along their journey.
+
+See [here](https://github.com/beacon-biosignals/Legolas.jl/pull/54) for a comprehensive log of changes from Legolas v0.4 to Legolas v0.5.
+
+## Some main changes to be aware of
+
+* In Legolas v0.4, every `Legolas.Row` field's type was available as a type parameter of `Legolas.Row`; for example, the type of a field `y` specified as `y::Real` in a `Legolas.@row` declaration would be surfaced like `Legolas.Row{..., NamedTuple{(...,:y,...),Tuple{...,typeof(y),...}}}`. In Legolas v0.5, the schema version author controls which fields have their types surfaced as type parameters in Legolas-generated record types via the `field::(<:F)` syntax in [`@version`](@ref).
+    * Additionally, to include type parameters associated with fields in a parent schema, those fields must be re-declared in the child schema. For example, the package LegolasFlux declares a `ModelV1` version with a field `weights::(<:Union{Missing,Weights})`. LegolasFlux includes an [example](https://github.com/beacon-biosignals/LegolasFlux.jl/blob/53c677848c6b65e5158ef2d43dd5f7eab174892e/examples/digits.jl#L78-L80) with a schema extension `DigitsRowV1` which extends `ModelV1`. This `@version` call must re-declare the field `weights` to be parametric in order for the `DigitsRowV1` struct to also have a type parameter for this field.
+* In Legolas v0.4, `@row`-generated `Legolas.Row` constructors accepted and propagated any non-schema-required fields provided by the caller. In Legolas v0.5, `@version`-generated record type constructors will discard any non-schema-required fields provided by the caller. When upgrading code that formerly "implicitly extended" a given schema version by propagating non-required fields, it is advisable to instead explicitly declare a new extension of the schema version to capture the propagated fields as required fields; or, if it makes more sense for a given use case, one may instead define a new schema version that adds these propagated fields as required fields directly to the schema (likely declared as `::Union{Missing,T}` to allow them to be missing).
+
+
+## Deserializing old tables with Legolas v0.5
+
+Generally, tables serialized with earlier versions of Legolas can be deserialized with Legolas v0.5, making it only a "code-breaking" change, rather than a "data-breaking" change. However, it is strongly suggested to have reference tests with checked-in (pre-Legolas v0.5) serialized tables that are deserialized and verified during the tests, in order to be sure.
+
+Additionally, serialized Arrow tables containing nested Legolas-v0.4-defined `Legolas.Row` values (i.e. a table that contains a row that has a field that is, itself, a `Legolas.Row` value, or contains such values) require special handling to deserialize under Legolas v0.5, if you want users to be able to deserialize them with `Legolas.read` using the Legolas-v0.5-ready version of your package. Note that these tables are still deserializable as plain Arrow tables regardless, so it may not be worthwhile to provide a bespoke deprecation/compatibility pathway in the Legolas-v0.5-ready version of your package unless your use case merits it (i.e. the impact surface would be high for your package's users).
+
+If you would like to provide such a pathway, though:
+
+Recall that under Legolas v0.4, `@row`-generated `Legolas.Row` constructors may accept and propagate arbitrary non-schema-required fields, whereas Legolas v0.5's `@version`-generated record types may only contain schema-required fields. Therefore, one must decide what to do with any non-required fields present in serialized `Legolas.Row` values upon deserialization. A common approach is to implement a deprecation/compatibility pathway within the relevant surrounding `@version` declaration. For example, [this LegolasFlux example](https://github.com/beacon-biosignals/LegolasFlux.jl/blob/53c677848c6b65e5158ef2d43dd5f7eab174892e/examples/digits.jl#L64-L84) uses a function `compat_config` to handle old `Legolas.Row` values, but does not add any handling for non-required fields, which will be discarded if present. If one did not want non-required fields to be discarded, these fields could be handled by throwing an error or warning, or defining a schema version extension that captures them, or defining a new version of the relevant schema to capture them (e.g. adding a field like `extras::Union{Missing, NamedTuple}`).
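
The options named in the guide's last paragraph can be sketched as follows. This is an illustration only, assuming Legolas v0.5 is installed; the schema names (`example.config`, `example.noted-config`), the field names, and the helper `with_extras` are hypothetical and are not taken from Legolas or LegolasFlux. It first shows a `@version`-generated constructor dropping a non-required field, then two ways to keep that field: a schema version extension, and a new schema version with a catch-all `extras` field.

```julia
using Legolas
using Legolas: @schema, @version

# Hypothetical schema, for illustration only.
@schema "example.config" Config

@version ConfigV1 begin
    seed::Int
    rate::Float64
end

# A row carrying a field (`note`) that `ConfigV1` does not require.
row = (; seed=1, rate=0.5, note="warmup run")

# The v0.5 record constructor keeps only the schema-required fields.
plain = ConfigV1(row)

# Option 1: a schema version extension that makes `note` a required field,
# declared as possibly-missing so rows without it still validate.
@schema "example.noted-config" NotedConfig

@version NotedConfigV1 > ConfigV1 begin
    note::Union{Missing,String}
end

noted = NotedConfigV1(row)

# Option 2: a new version of the same schema with a catch-all `extras` field.
@version ConfigV2 begin
    seed::Int
    rate::Float64
    extras::Union{Missing,NamedTuple}
end

# Hypothetical helper: gather every field that is not required into `extras`.
function with_extras(row::NamedTuple)
    extra_names = filter(n -> n ∉ (:seed, :rate), keys(row))
    extras = isempty(extra_names) ? missing : NamedTuple{extra_names}(row)
    return ConfigV2(; seed=row.seed, rate=row.rate, extras=extras)
end

captured = with_extras(row)
```

Option 1 keeps the propagated field typed and validated like any other required field, at the cost of a new schema per set of extra fields. Option 2 accepts arbitrary extra fields but leaves their contents unchecked, so it is probably better suited to a compatibility pathway than to new code.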