Add glossary entries for dialect and vocabulary. #484
learn/glossary.md
Outdated
@@ -14,6 +14,15 @@ If you encounter a term you wish were defined here, please feel free to [file an

The entries on this page can be linked to via anchor links (e.g. `https://json-schema.org/learn/glossary.html#vocabulary`) when sharing a definition with others.

### dialect

A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
I wonder if this definition is too tightly coupled to the Vocabulary System. First of all, the vocabulary system is likely to change in the future, possibly making this definition no longer make sense. Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies. You can consider draft-07 as a single required vocabulary, but it's not defined that way.
I'd probably define a dialect as the set of keywords that are understood in a schema. Those keywords being defined by vocabularies is an artifact of the Vocabulary System and not necessarily a defining property of a dialect.
Thanks! I took this definition basically with only minor change from the spec:

> A dialect is defined as a set of vocabularies and their required support identified in a meta-schema.

Obviously we can change this definition if/when we change it elsewhere, but are you still suggesting we do so beforehand? (Or have we already changed the definition elsewhere and I missed it?)

> Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies.

Interesting -- do you have a source for this? I'm not doubting you, I've just not heard it before but possibly just wasn't paying attention well enough -- are you saying we should define dialects this way or that others already do? I guess it also occurred to me that this may be the case because we call `$schema` now a "dialect identifier" even though draft 7 did not have one, so retroactively we must call draft 7 a dialect, but equally well we could retroactively say the dialect is defined by a single vocabulary, no?
> it also occurred to me that this may be the case because we call `$schema` now a "dialect identifier" even though draft 7 did not have one, so retroactively we must call draft 7 a dialect.

Correct.

> but equally well we could retroactively say the dialect is defined by a single vocabulary, no?

I don't know what this means.
> I don't know what this means.

The spec defines dialects in the way I quoted. Draft 7 has a dialect identifier and is a dialect and yet has no defined vocabularies. What is the resolution to the apparent contradiction?
@Julian so does "retroactively say the dialect is defined by a single vocabulary" refer to considering draft-07 to be a single vocabulary? Austin started talking about multiple vocabularies all the way back in draft-wright-*-00 (a.k.a. not-draft-05) in 2016:

> Other specifications [besides JSON Schema Core] define the vocabularies that perform assertions about validation, linking, annotation, navigation, and interaction.

The concept of JSON Schema vocabularies had its own section in the spec in draft-07.

We just didn't settle on the exact number and granularity of them and formalize vocabulary identification and selection until 2019-09. But since 2016 we have always spoken of there being at least two vocabularies (core and validation) in the standard dialect, with hyper-schema as a third vocabulary.
Got it, thanks, so you disagree (well, the spec you linked disagrees) with what Jason said then, i.e.

> Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies.

that's not true, they indeed are defined by vocabularies, in the form that existed at the time?

> @Julian so does "retroactively say the dialect is defined by a single vocabulary" refer to considering draft-07 to be a single vocabulary?

(Yes -- I wasn't aware and didn't check that the section you linked was present in draft 7, I took Jason's word for it).
What about draft 4, which doesn't mention the word vocabulary, though it's otherwise fairly similar structurally to draft 7, is it a dialect, and/or is https://json-schema.org/draft-04/schema its dialect URI?
All I'm trying to do is interpret definitions (and claims here) that are written down into a summary of the definition.
@Julian I think you're overthinking this a bit. We weren't exactly trying to harmonize the terms across all past drafts. Whether or not Jason's statement is true depends on whether he meant "vocabulary" in a general sense or "vocabulary" in a "thing you can put in `$vocabulary`" sense, which feels like splitting hairs to me.
Meta-schema URIs are (retroactively) dialect URIs. If you go back far enough (draft-04) a lot of things won't make sense, and we intentionally don't try to make it all make perfect sense. We don't even want people using draft-04 anyway.
What is the definition of a dialect? Is it different from what's here?
I didn't think we had used the word "vocabulary" before 2019-09, but I take @handrews's word for it. The thing I'm trying to avoid is confusion about the general concept of a vocabulary vs a vocabulary that's part of the vocabulary system. Readers should understand that this refers to the concept of vocabulary, not just the specific mechanism in the Vocabulary System.
@Julian the term "dialect" emerged primarily over the course of a long series of discussions with OpenAPI, IIRC. I don't remember if we had started using it during 2019-09, but it really became a thing between 2019-09 and 2020-12. Of course, by the time 2020-12 went out I'd stepped away so I never got around to writing a formal definition into the spec.
I think I agree with @jdesrosiers that the concept of a vocabulary is more important than the mechanics, which are likely to change to some degree. I haven't really thought about that in terms of how we want to define these words. 🤔
learn/glossary.md
Outdated
A collection of related [keywords](keyword), grouped to facilitate re-use.

A vocabulary typically includes both a [meta-schema](#meta-schema) which formally defines the keywords it contains, as well as a prose document or specification which explains the semantics of its keywords in a way suitable for implementers and users of the vocabulary.
This is not quite true. The formal definition of what keywords a vocabulary contains is its specification, whatever form that takes. We will eventually have a machine-readable format for this. The meta-schema describes the vocabulary's intended syntax, but you can have different meta-schemas for the same vocabulary. You usually wouldn't: if you want to tweak usage by, for example, forbidding the use of the keyword `not`, you'd probably do that in the dialect meta-schema.
But the important thing here is that vocabulary/keyword semantics are defined by the vocabulary specification. Meta-schemas, either vocabulary or dialect-level, describe the valid syntax. These two things are intentionally separated and able to vary somewhat independently.
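
As a rough illustration of the point about forbidding `not` at the dialect level, a dialect meta-schema could look something like the Python dict below. This is only a sketch: the `$id` under `example.com` and the helper `forbidden_keywords` are hypothetical, and the helper merely reads back which keywords the meta-schema rules out syntactically. Enforcing the restriction would still require a real validator, and the meaning of every other keyword would still come from the vocabulary specifications.

```python
# Hypothetical dialect meta-schema that extends the 2020-12 meta-schema
# and forbids the `not` keyword purely at the syntax level.
NO_NOT_META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/meta/no-not",  # hypothetical identifier
    "$dynamicAnchor": "meta",
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://json-schema.org/draft/2020-12/vocab/applicator": True,
        "https://json-schema.org/draft/2020-12/vocab/validation": True,
    },
    "$ref": "https://json-schema.org/draft/2020-12/schema",
    # A `false` subschema means no value is allowed for this keyword.
    "properties": {"not": False},
}


def forbidden_keywords(meta_schema):
    """List keywords whose subschema in `properties` is the boolean false."""
    properties = meta_schema.get("properties", {})
    return sorted(name for name, sub in properties.items() if sub is False)


print(forbidden_keywords(NO_NOT_META_SCHEMA))  # ['not']
```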
Got it, will tweak.
Given the reviews and discussion points towards wanting to reframe how these words are defined, without the context of "the vocabulary system" as it's likely to change a bit, maybe we should close this PR in favour of (creating a new issue to cover these definitions and then) opening a new PR? |
Happy to close it -- it's odd to me to not document terms we already use because we want to redefine them, but if y'all feel that's the right way to move forward sure. |
@Relequestual I have to say I agree with @Julian that it would make more sense to document these as-is. We don't know how long it will be before the next release is out, and given the lack of consensus around where these terms are going, we don't know if that release will redefine anything or just label the existing features as not-yet-stable. None of which helps people looking for definitions now. Glancing back over the discussion I thought it was going in a productive direction. |
I don't think this discussion was about redefining what "dialect" means, it was about presenting it in an accurate and accessible way regardless of what words are used in the spec. In the definition provided in the spec, the word "vocabulary" can be interpreted in two ways. One way is arguably inaccurate and the other is inaccessible due to the mental gymnastics necessary to draw the appropriate conclusions. Either way, the glossary definition needs to be clear about how "vocabulary" should be interpreted or use a different word entirely.

If you interpret "vocabulary" as a Vocabulary System vocabulary, the definition is not inclusive of versions of JSON Schema that pre-date the Vocabulary System. For years we have been discussing any distinct flavor of JSON Schema as a dialect including past drafts and third-party flavors such as those used by OpenAPI 3.0 and MongoDB. Therefore, the definition in the spec doesn't reflect common usage and there should at least be wording in addition to the spec definition that covers this usage.

However, if you interpret "vocabulary" as the general concept of a vocabulary independent of the Vocabulary System, the spec definition isn't inaccurate, but you have to mentally jump through some hoops to recognize older drafts and third-party flavors as implied single vocabulary dialects. A glossary definition shouldn't require that much effort.

I'm definitely in favor of having something now, but I think the way it's currently written up is in terms of the Vocabulary System and I think that's an incomplete depiction of how the term "dialect" is commonly used.
OK sounds like both of you are indeed happy with writing something down, though I don't feel I was able to get any specific suggestions about what definition we actually use today in conversation -- I'm happy to reopen and give it another shot myself I suppose. |
I can try to remember to take a look at it this/next week sometime. I had a lot going on in Sept and Oct and I think I just forgot. |
Words change over time. We started talking about each section of the spec in the spec as a "vocabulary" before While the definition of "vocabulary" need not mention |
I don't know what you're trying to say here. If I, as an expert, found this ambiguous, then it's definitely going to be ambiguous to casual users who are the target audience of this glossary. I think we agreed that these terms should be defined in a generic way that isn't coupled to the vocabulary system. Currently, the wording used here sounds very much like a description of the vocabulary system. I'm only advocating that the wording be tweaked to reduce ambiguity.
@jdesrosiers the goal of these definitions should be to help people understand what these things are now, and how to use them now. Not to future-proof against possible later changes that we've not even agreed to do yet, much less agreed on their form. So yes, |
That's not the way I'm looking at it. It's not a matter of future-proofing anything, it's a matter of scope. I think the glossary should describe the concept, and UJS should describe the mechanism. As you say, the concept of a vocabulary pre-dates the vocabulary system and the introduction of the vocabulary system didn't change that concept. The concept of a dialect has existed prior to the vocabulary system as well although we didn't have a name for the concept until later.
Mentioning The concepts of vocabularies and dialects are far more important as they exist in 2019-09 and later. They don't have much useful meaning prior to that, and trying to span all of those usages just muddies both the concept and the usage. |
@Julian, Here are the specific suggestions you requested.
learn/glossary.md
Outdated
@@ -14,6 +14,15 @@ If you encounter a term you wish were defined here, please feel free to [file an

The entries on this page can be linked to via anchor links (e.g. `https://json-schema.org/learn/glossary.html#vocabulary`) when sharing a definition with others.

### dialect

A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
I think the part about vocabularies potentially being optional is a Vocabulary System concept. I'd remove it.
- A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
+ A collection of [vocabularies](#vocabulary) that identify the set of keywords an implementation needs to understand to process a schema.
What definition of "Vocabulary System" are you using that results in you excluding optionality? Have you built consensus around it? The optionality of vocabularies is hardly an implementation detail. And again, it is part of what is real now.
What about taking @jdesrosiers's wording but using "should" instead of "needs to"? We're not writing in spec-ese here. I think "should" can suggest optionality without explicitly saying it.
I think the original wording was clear as plain language. Getting into "should" being a way to "suggest" something feels more spec-ese to me.
learn/glossary.md
Outdated
Dialects are identified by a URI, which [schemas](#schema) may then reference in their `$schema` [keyword](#keyword).
Doing so identifies the schema as being written in the dialect, and thereby indicates which keywords are usable within it, along with their intended meaning.

The JSON Schema specification [defines](https://json-schema.org/specification.html#general-purpose-meta-schema) a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case.
I'm not sure this is the best reference. That section has three meta-schemas where only two describe dialects. I'd also hesitate to point to meta-schemas as dialects. The spec defines the dialect, not the meta-schema. It would, however, make sense to reference a dialect URI because that identifies dialect.
I think I was definitely unhappy with pointing there -- my intention definitely wasn't to indicate `meta schemas == dialects` nor `these are 3 dialects`, but it's a bit of a reflection on how "widely" we use the dialect term that nowhere really on that page do we prominently say "this is the 2020 dialect page".
Changing this to just point to the whole page (without the fragment) though, I guess that's better?
@Julian we actually define the validation dialect in section 5 of the validation spec.
We haven't published a Hyper-Schema spec in a while so we don't have a current formal publication of that dialect. However, there is a section in the OpenAPI spec on dialects which could be referenced as an example.
In the spec we say the metaschemas describe the dialects. We could use the same language here...
- The JSON Schema specification [defines](https://json-schema.org/specification.html#general-purpose-meta-schema) a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case.
+ The JSON Schema specification defines a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case. These are [described](https://json-schema.org/specification.html#general-purpose-meta-schema) in meta-schemas.
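
For readers following this thread, the idea that a schema's `$schema` keyword names its dialect by URI can be sketched roughly as below. The lookup table and the helper `dialect_of` are hypothetical, and the short labels are just names chosen for this example. The three URIs are the published meta-schema identifiers for those releases (draft-07's is conventionally written with a trailing `#`, which the sketch strips before the lookup).

```python
# Hypothetical lookup from dialect URI to a human-readable label.
KNOWN_DIALECTS = {
    "https://json-schema.org/draft/2020-12/schema": "2020-12",
    "https://json-schema.org/draft/2019-09/schema": "2019-09",
    "http://json-schema.org/draft-07/schema": "draft-07",
}


def dialect_of(schema):
    """Return a label for the dialect a schema declares, or None if unknown."""
    if not isinstance(schema, dict):
        return None  # boolean schemas cannot carry a $schema keyword
    uri = schema.get("$schema")
    if not isinstance(uri, str):
        return None  # no dialect declared
    # Tolerate the empty fragment that older identifiers end with.
    return KNOWN_DIALECTS.get(uri.rstrip("#"))


example = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "string",
}
print(dialect_of(example))  # 2020-12
```

A schema without `$schema` yields `None` here. What an implementation should assume in that case is a separate question that this sketch does not try to answer.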
learn/glossary.md
Outdated
Anyone can create and publish a vocabulary, and implementations generally will include facilities for extending themselves with support for additional vocabularies and their keywords.
The JSON Schema specification includes a number of vocabularies which cover each of the keywords it defines.

Vocabularies are identified by a URI which may be referenced via the `$vocabulary` keyword in order to enable the vocabulary within a [dialect](#dialect).
I'd drop this line as Vocabulary System specific.
And I would not. Listing the `$vocabulary` keyword makes things more concrete and makes it easier for people to find out more. While it is part of the "vocabulary system", that is what is currently in use, and is more important than the very vague usage of these terms pre-2019-09.
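
To make the `$vocabulary` mechanics under discussion concrete, here is a minimal sketch assuming a 2019-09 or later style meta-schema, where `$vocabulary` maps each vocabulary URI to a boolean saying whether support for it is required. The dialect `$id`, the `example.com` vocabulary, and the helper `missing_required_vocabularies` are hypothetical. The two `json-schema.org` URIs are real 2020-12 vocabulary identifiers.

```python
# Hypothetical dialect meta-schema: two required vocabularies, one optional.
DIALECT_META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/dialects/my-dialect",  # hypothetical
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://json-schema.org/draft/2020-12/vocab/validation": True,
        "https://example.com/vocab/units": False,  # hypothetical, optional
    },
}

# Vocabularies that some imagined implementation understands.
SUPPORTED = {
    "https://json-schema.org/draft/2020-12/vocab/core",
    "https://json-schema.org/draft/2020-12/vocab/validation",
}


def missing_required_vocabularies(meta_schema, supported):
    """Return the required vocabulary URIs that are not in `supported`."""
    declared = meta_schema.get("$vocabulary", {})
    return sorted(
        uri for uri, required in declared.items()
        if required and uri not in supported
    )


# An empty result means schemas in this dialect can be processed, even
# though the optional vocabulary is not understood.
print(missing_required_vocabularies(DIALECT_META_SCHEMA, SUPPORTED))  # []
```

Under this reading, the optional vocabulary never blocks processing, which is the optionality the comments above are debating whether the glossary should mention.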
I don't feel strongly enough about this to debate it any further. @Julian, I've given you my thoughts and I support whatever direction you want to go with this PR.
All good sorry was working on some other things today so didn't chime in yet but very much appreciate the suggestions will see what you've got and I'm sure get something workable! |
I have made one suggestion for change, but I am otherwise happy.
I think it's useful to remember this is to document how things CURRENTLY are. How they might be later generally should have no bearing.
I've given one last shot at trying to make everyone happy here -- I think folks mostly were on board here at least with "disagree but ok"'ing this change -- given I've just tweaked it again I'll give a bit of time (days) to see if anyone pipes up again, and if not will rely on the previous OK's (and always fix something later if we care to). |
learn/glossary.md
Outdated
### vocabulary

A tightly related collection of related [keywords](keyword), grouped to facilitate re-use.
- A tightly related collection of related [keywords](keyword), grouped to facilitate re-use.
+ A tightly related collection of [keywords](keyword), grouped to facilitate re-use.

or

- A tightly related collection of related [keywords](keyword), grouped to facilitate re-use.
+ A collection of tightly related [keywords](keyword), grouped to facilitate re-use.
Maybe reduce redundancy a little?
Thanks, fixed!
I'm happy with this. Just the one grammatical suggestion.
Even though this isn't fully consistent with the current spec definitions, it matches colloquial uses, and perhaps plans for future changes to these two concepts, without sacrificing accuracy today.
a0611aa
to
2f0a2f8
Compare
lgtm
Thanks both! |
Brings the glossary back in sync.
Refs: json-schema-org/community#199