schema fixups #10

rvagg · 2019-05-02T07:25:06Z

Take 2, or 3 or something. Trying to narrow down the parsing rules and the AST (not technically an AST but let's call it that until we have a better term).

My first take at this had "representation":{"map":{}} showing up for all structs and maps (and other things which it shouldn't have) but as I expanded into the other base types it made less sense. Do you also need "representation":{"list":{}} for a list type, "representation":{"int":{}} for an int? So I went with a rule: where there is a default representation for a type (map for struct, string for enum, list for list, bytes for bytes etc.) as described in the schema spec PR, don't include it if it doesn't add anything useful.

Also, I've seen you talk about field aliases in map representations in a couple of places now but they aren't in schema-schema.ipldsch or the spec PR, are you intending these to be included now or something to expand to later?

warpfork · 2019-05-02T08:09:10Z

So I went with a rule: where there is a default representation for a type (map for struct, string for enum, list for list, bytes for bytes etc.) as described in the schema spec PR, don't include it if it doesn't add anything useful.

👍

An alternative but very similar rule we could apply is "for those that don't yet have more than one representation, we don't store a field for it". The effect would be string and int and such don't get anything, but structs would still be explicit about representing as maps in the JSON form even though that's the implied default in the DSL.

Either way we should be making sure the schema-schema self-describes properly; and the higher level rule we're talking about here is just how we decide which fields on types are actually present in the schema-schema and whether those fields are optional / have default values.

And I guess it's the case that the DSL has some slightly stronger feelings about implied defaults than the schema-schema itself does. This isn't explicitly said anywhere yet, so maybe we should fine a place to document that. Basically, I want the IPLD format matching the schema-schema to stay extremely stable over time (and if anything, use the schema system's own migration mitigation tools over time), and I care a fair bit less about the stability of the human-convenience DSL (which should still be "stable" for our sanity, of course, but having a "2.0" heyday in the DSL that changes some defaults is going to have lot more managable splash radius than doing so in the IPLD format that we'll orient most tools around).

warpfork · 2019-05-02T08:14:38Z

typed/declaration/examples.ipldsch.json

@@ -10,47 +10,43 @@
 			"kind": "struct",
 			"fields": {
 				"fooField": {
-					"valueType": {
-						"type": "map",
+					"type": {


Oh, jeesh. Yep, this whole block of changes is a fair cop.

warpfork · 2019-05-02T08:21:06Z

typed/declaration/schema-schema.ipldsch.json

-			"fields": {},
-			"representation": {
-				"map": {}
-			}


For all these, whether or not we want them here depends on which of the two rules we prefer.

But right now, the final word should match the field defns for TypeStruct -- which have representation not being an optional field, so these should stay in by that logic.

warpfork · 2019-05-02T08:23:00Z

Also, I've seen you talk about field aliases in map representations in a couple of places now but they aren't in schema-schema.ipldsch

Yes they are :)

go-ipld-prime/typed/declaration/schema-schema.ipldsch

Line 397 in f2d5957

alias optional String

warpfork · 2019-05-02T08:39:00Z

So I went with a rule: where there is a default representation for a type (map for struct, string for enum, list for list, bytes for bytes etc.) as described in the schema spec PR, don't include it if it doesn't add anything useful.

An alternative but very similar rule we could apply is "for those that don't yet have more than one representation, we don't store a field for it".

which fields on types are actually present in the schema-schema and whether those fields are optional / have default values.

Continuing this train of thought... I'm not sure which ideas are going to be best future-proofed and consistent here.

If we use the second rule (and "representation" fields in the schema schema are non-optional when present, and otherwise absent entirely)... this might result in some weirdness down the road if we do introduce new representation strategies for kinds that don't currently have any: in order to be backwards compatible with existing IPLD-serialized schemas, the "representation" fields for those would now have to be 'optional' or have a 'default'.

So maybe it would be wise for our future consistency to bite the bullet and try to do everything with optional or default now?

(There's another concern to recurse into if we apply the 'default' feature here -- that there's currently no definition for how default values that aren't basic scalars should look, and we've got keyed unions for representations. But that's a bridge we should cross at some point anyway, I suppose.)

(A totally different way to handle the backwards compatibility in case of such a change in the schema-schema is of course to just start versioning the schema-schema, and the newer one has additional "representation" fields, and they're not optional nor have defaults... and we use the "schema try-stack" idea to probe any IPLD-serialized schema we're parsing; if it's missing one of the "representation" fields that's required in the newest version of the schema, the parser goes "welp" and falls back to the older schema version, and then we have code to do the migration semantically. Tada; this all works fine. It's a higher-firepower path though.)

rvagg · 2019-05-03T01:43:24Z

So maybe it would be wise for our future consistency to bite the bullet and try to do everything with optional or default now?

You're saying that maybe everything should have "representation":{...}, right? I'm fine with that for consistency sake, and to ensure the above point about encoded hash consistency. But it does blow out the size (in bytes) of the JSON form. And surely there's cases where that matters? It also introduces more human readability problems and it'd be nice if the JSON/IPLD/AST form was almost as easy to grok as the DSL.

Regarding future-proofing and making default representations not appear in the JSON: I'm not sure if that's a real problem as long as we retain the concept of a "default" representation into the future. Maybe we add like type StringyFloat float representation string, but that doesn't have to change the fact that the default representation for float is "float", you'd just be adding "representation":{"string":{}} to your StringyFloat and leaving normal float's without.

I think the two clear choices here are:

Don't include a "representation" block where the representation being used is the only for that type, or the default for that type and doesn't include additional options (like field customisations)
Include a "representation" for everything.

The in-between one of: only have a "representation" block for types that have more than one way of representing them has future proofing and consistency problems.

I'm fine with either 1 or 2, but for different reasons. 1 for efficiency and lack of verbosity (i.e. easier for humans to read) and 2 for consistency and ease of explaining it in a spec. I think you're going to have to generate a strong feeling either way to help this out.

warpfork · 2019-06-25T13:08:23Z

Okay, housecleaning day! :D I'm gonna land this by cherry-picking.

There's a couple parts of this diff that are definitely fixes of flat out typos; and I'm gonna cherry-pick and merge those.

There's a couple other parts that are probably good things to give some more consideration to and suss out; but I'd like to do that in another PR if that's alright with you (and ideally, after flipping these files over to the specs repo -- it might've been convenient to incubate them in this repo, but it's getting somewhat silly to let them reside here any further).

warpfork · 2019-06-25T13:13:37Z

(I do think you've probably got a point about the need to further clarify where representation fields should be present or not... but the schema for schemas should make that declaration, and be itself formatted correspondingly at the same time, and that's giving me enough of a headache that it's really really hard to deal with it as anything other than a completely standalone diff.)

Cherry-picked by eric with several changes. - The fixups of missing bits of schema-schema are a fair cop. - Several critical typos in the example.ipldsch.json file are fixed. - I tweaked the syntax for aliases in the example file; we've had the idea since this was originally posted of putting representational concerns that are clearly associated with a field inline with the fields for convenience, but in parens to distinguish that they're indeed representation concerns rather than type/cardinality concerns, so this is the expression of that idea. - I actually walked back the removal of representation blocks in the json. There's probably valid considerations to be made on that, but it deserves separate handling. (Disconcertingly, I actually added two *new* representation blocks that seem to previously have been missing due to typos.) Any new errors are presumably mine. Original: #10

warpfork · 2019-06-25T13:29:33Z

Cherry-picked in the above automatically referenced commit: 54ac969

Of course, this means this PR is now in "conflict" (since I took half of it!), so... Will now close this :)

Thanks for catching all these details! ❤️

Cherry-picked by eric with several changes. - The fixups of missing bits of schema-schema are a fair cop. - Several critical typos in the example.ipldsch.json file are fixed. - I tweaked the syntax for aliases in the example file; we've had the idea since this was originally posted of putting representational concerns that are clearly associated with a field inline with the fields for convenience, but in parens to distinguish that they're indeed representation concerns rather than type/cardinality concerns, so this is the expression of that idea. - I actually walked back the removal of representation blocks in the json. There's probably valid considerations to be made on that, but it deserves separate handling. (Disconcertingly, I actually added two *new* representation blocks that seem to previously have been missing due to typos.) Any new errors are presumably mine. Original: ipld/go-ipld-prime#10

schema fixups

348ebcc

warpfork reviewed May 2, 2019

View reviewed changes

warpfork mentioned this pull request May 2, 2019

Schema <> Block interaction + "advanced layouts logic" ipld/specs#118

Closed

warpfork closed this Jun 25, 2019

rvagg deleted the rvagg/schema-fixups branch June 26, 2019 05:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

schema fixups #10

schema fixups #10

rvagg commented May 2, 2019

warpfork commented May 2, 2019 •

edited

Loading

warpfork May 2, 2019

warpfork May 2, 2019

warpfork commented May 2, 2019

warpfork commented May 2, 2019

rvagg commented May 3, 2019

warpfork commented Jun 25, 2019

warpfork commented Jun 25, 2019

warpfork commented Jun 25, 2019

schema fixups #10

schema fixups #10

Conversation

rvagg commented May 2, 2019

warpfork commented May 2, 2019 • edited Loading

warpfork May 2, 2019

Choose a reason for hiding this comment

warpfork May 2, 2019

Choose a reason for hiding this comment

warpfork commented May 2, 2019

warpfork commented May 2, 2019

rvagg commented May 3, 2019

warpfork commented Jun 25, 2019

warpfork commented Jun 25, 2019

warpfork commented Jun 25, 2019

warpfork commented May 2, 2019 •

edited

Loading