Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
refactor package to remove sharp edges for schema authors/users (particularly `@row` / `Row`) and improve API (#54)
Co-authored-by: Seth Chapman <[email protected]>
Co-authored-by: Alex Arslan <[email protected]>
Co-authored-by: Eric Hanson <[email protected]>
1 parent 3a97820, commit 1561ef4
Showing 16 changed files with 1,581 additions and 809 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,7 +15,7 @@ jobs: | |
matrix: | ||
version: | ||
- '1' | ||
- '1.3' | ||
- '1.6' | ||
os: | ||
- ubuntu-latest | ||
arch: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,21 +1,23 @@ | ||
name = "Legolas" | ||
uuid = "741b9549-f6ed-4911-9fbf-4a1c0c97f0cd" | ||
authors = ["Beacon Biosignals, Inc."] | ||
version = "0.4.0" | ||
version = "0.5.0" | ||
|
||
[deps] | ||
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45" | ||
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" | ||
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" | ||
|
||
[compat] | ||
Arrow = "2" | ||
DataFrames = "1" | ||
Tables = "1.4" | ||
julia = "1.3" | ||
julia = "1.6" | ||
|
||
[extras] | ||
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" | ||
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" | ||
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" | ||
|
||
[targets] | ||
test = ["Test", "DataFrames"] | ||
test = ["Test", "DataFrames", "UUIDs"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Arrow-Related Concepts/Conventions | ||
|
||
!!! note | ||
|
||
If you're a newcomer to Legolas.jl, please familiarize yourself with the [tour](https://github.com/beacon-biosignals/Legolas.jl/blob/main/examples/tour.jl) before diving into this documentation. | ||
|
||
Legolas.jl's target (de)serialization format, [Arrow](https://arrow.apache.org/), already features wide cross-language adoption, enabling Legolas-serialized tables to be seamlessly read into many non-Julia environments. This documentation section contains conventions related to Legolas-serialized Arrow tables that may be observable by generic Legolas-unaware Arrow consumers. | ||
|
||
## Supporting Legolas Schema Discovery In Arrow Tables | ||
|
||
Legolas defines a special field `legolas_schema_qualified` that Legolas-aware Arrow writers may include in an Arrow table's table-level metadata to indicate a particular Legolas schema with which the table complies. | ||
|
||
Arrow tables which include this field are considered to "support Legolas schema discovery" and are referred to as "Legolas-discoverable", since Legolas consumers may employ this field to automatically match the table against available application-layer Legolas schema definitions. | ||
|
||
If present, the `legolas_schema_qualified` field's value must be a [fully qualified schema version identifier](@ref schema_version_identifier_specification). | ||
|
||
## Arrow File Naming Conventions | ||
|
||
When writing a Legolas-discoverable Arrow table to a file, prefer using the file extension `*.<schema name>.arrow`. For example, if the file's table's full Legolas schema version identifier is `baz.supercar@1>bar.automobile@1`, use the file extension `*.baz.supercar.arrow`. |
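
As an illustration of the file naming convention above, here is a minimal sketch in plain Julia. The helper `conventional_arrow_filename` and the file stem `"inventory"` are hypothetical and are not part of Legolas.jl; the sketch only assumes the identifier form described in this documentation, where the first `>`-separated segment names the table's own schema version.

```jl
# Hypothetical helper (not part of Legolas.jl): build a file name that follows
# the `*.<schema name>.arrow` convention from a schema version identifier.
function conventional_arrow_filename(stem::AbstractString, identifier::AbstractString)
    # The first `>`-separated segment identifies the table's own schema version;
    # later segments identify its ancestors.
    own = first(split(identifier, '>'))
    # Drop the `@version` suffix to recover the schema name.
    schema_name = first(split(own, '@'))
    return string(stem, '.', schema_name, ".arrow")
end

conventional_arrow_filename("inventory", "baz.supercar@1>bar.automobile@1")
# returns "inventory.baz.supercar.arrow"
```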
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# Schema-Related Concepts/Conventions | ||
|
||
!!! note | ||
|
||
If you're a newcomer to Legolas.jl, please familiarize yourself with the [tour](https://github.com/beacon-biosignals/Legolas.jl/blob/main/examples/tour.jl) before diving into this documentation. | ||
|
||
## [Schema Version Identifiers](@id schema_version_identifier_specification) | ||
|
||
Legolas defines "schema version identifiers" as strings of the form: | ||
|
||
- `name@version` where: | ||
- `name` is a lowercase alphanumeric string and may include the special characters `.` and `-`. | ||
- `version` is a non-negative integer. | ||
- or, `x>y` where `x` and `y` are valid schema version identifiers and `>` denotes "extends from". | ||
|
||
A schema version identifier is said to be *fully qualified* if it includes the identifiers of all ancestors of the particular schema version that it directly identifies. | ||
|
||
Schema authors should follow the below conventions when choosing the name of a new schema: | ||
|
||
1. Include a namespace. For example, assuming the schema is defined in a package Foo.jl, `foo.automobile` is good, `automobile` is bad. | ||
2. Prefer singular over plural. For example, `foo.automobile` is good, `foo.automobiles` is bad. | ||
3. Don't "overqualify" a schema name with ancestor-derived information that is better captured by the fully qualified identifier of a specific schema version. For example, `bar.automobile` should be preferred over `bar.foo.automobile`, since `bar.automobile@1>foo.automobile@1` is preferable to `bar.foo.automobile@1>foo.automobile@1`. Similarly, `baz.supercar` should be preferred over `baz.automobile.supercar`, since `baz.supercar@1>bar.automobile@1` is preferable to `baz.automobile.supercar@1>bar.automobile@1`. | ||
|
||
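
The identifier grammar above can be sketched as a small validator in plain Julia. The names `SEGMENT_PATTERN`, `is_valid_identifier`, and `parse_identifier` are hypothetical and written only against the prose description; this is not Legolas.jl's own parser. Whether an identifier is *fully qualified* cannot be checked this way, since that requires knowing each schema version's declared ancestors.

```jl
# Hypothetical validator written against the grammar described above.
# One segment is `name@version`: `name` uses lowercase alphanumerics, `.` and `-`;
# `version` is a non-negative integer.
const SEGMENT_PATTERN = r"\A[a-z0-9.\-]+@[0-9]+\z"

# An identifier is one or more segments joined by `>` ("extends from").
is_valid_identifier(s::AbstractString) =
    all(segment -> occursin(SEGMENT_PATTERN, segment), split(s, '>'))

# Split a valid identifier into (name, version) pairs, child first, ancestors after.
function parse_identifier(s::AbstractString)
    is_valid_identifier(s) || throw(ArgumentError("invalid schema version identifier: $s"))
    return map(split(s, '>')) do segment
        name, version = split(segment, '@')
        (name = String(name), version = parse(Int, version))
    end
end

is_valid_identifier("baz.supercar@1>bar.automobile@1")  # true
is_valid_identifier("Foo.Automobile@1")                  # false: uppercase letters
is_valid_identifier("foo.automobile@-1")                 # false: negative version
```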
## Schema Versioning: You Break It, You Bump It | ||
|
||
While it is fairly established practice to [semantically version source code](https://semver.org/), the world of data/artifact versioning is a bit more varied. As presented in the tour, each `Legolas.SchemaVersion` carries a single version integer. The central rule that governs Legolas' schema versioning approach is: | ||
|
||
**Do not introduce a change to an existing schema version that might cause existing compliant data to become non-compliant; instead, incorporate the intended change in a new schema version whose version number is one greater than the previous version number.** | ||
|
||
For example, a schema author must introduce a new schema version for any of the following changes: | ||
|
||
- A new type-restricted required field is added to the schema. | ||
- An existing required field's type restriction is tightened. | ||
- An existing required field is renamed. | ||
|
||
One benefit of Legolas' approach is that multiple schema versions may be defined in the same codebase, e.g. there's nothing that prevents `@version(FooV1, ...)` and `@version(FooV2, ...)` from being defined and utilized simultaneously. The source code that defines any given Legolas schema version and/or consumes/produces Legolas tables is presumably already semantically versioned, such that consumer/producer packages can determine their compatibility with each other in the usual manner via interpreting major/minor/patch increments. | ||
|
||
Note that it is preferable to avoid introducing new versions of an existing schema, if possible, in order to minimize code/data churn for downstream producers/consumers. Thus, authors should prefer conservative field type restrictions from the get-go. Remember: loosening a field type restriction is not a breaking change, but tightening one is. | ||
|
||
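
The loosening-versus-tightening rule can be illustrated with Julia's subtype relation. This is a simplified sketch: `keeps_existing_data_compliant` is a hypothetical helper, and it ignores any value conversion that a record constructor might perform, so it should be read as intuition rather than as Legolas.jl's actual compliance check.

```jl
# Hypothetical helper: changing a field's type restriction from `old` to `new`
# keeps every previously compliant value compliant when `old <: new`.
keeps_existing_data_compliant(old::Type, new::Type) = old <: new

# Loosening: every `Int` is also a `Union{Int,Missing}`.
keeps_existing_data_compliant(Int, Union{Int,Missing})  # true

# Tightening: a `missing` value that used to comply no longer does.
keeps_existing_data_compliant(Union{Int,Missing}, Int)  # false
```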
## Important Expectations Regarding Custom Field Assignments | ||
|
||
Schema authors should ensure that their `@version` declarations meet two important expectations so that generated record types behave as intended: | ||
|
||
1. Custom field assignments should preserve the [idempotency](https://en.wikipedia.org/wiki/Idempotence) of record type constructors. | ||
2. Custom field assignments should not observe mutable non-local state. | ||
|
||
Thus, given a Legolas-generated record type `R`, the following should hold for all valid values of `fields`: | ||
|
||
```jl | ||
R(R(fields)) == R(fields) | ||
R(fields) == R(fields) | ||
``` |
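
To make the two expectations concrete, the sketch below uses plain functions that return named tuples as stand-ins for record type constructors. None of these names (`good_record`, `suffixing_record`, `counting_record`, `satisfies_expectations`) come from Legolas.jl; they are hypothetical and only mimic what a custom field assignment might do.

```jl
# Meets both expectations: lowercasing twice equals lowercasing once,
# and nothing outside `fields` is consulted.
good_record(fields) = (; name = lowercase(fields.name))

# Violates idempotency: each pass through the constructor appends another suffix.
suffixing_record(fields) = (; name = fields.name * "_v1")

# Observes mutable non-local state: the result depends on a global counter.
const COUNTER = Ref(0)
counting_record(fields) = (; name = fields.name, id = (COUNTER[] += 1))

# Check both properties for one particular `fields` value.
satisfies_expectations(R, fields) =
    R(R(fields)) == R(fields) && R(fields) == R(fields)

fields = (; name = "Foo")
satisfies_expectations(good_record, fields)       # true
satisfies_expectations(suffixing_record, fields)  # false
satisfies_expectations(counting_record, fields)   # false
```

A check like this only covers the inputs it is given, so it can reveal a violation but cannot prove that the properties hold for all valid `fields`.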
This file was deleted.
This file was deleted.
@JuliaRegistrator register
Registration pull request created: JuliaRegistries/General/71186
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via: