This repository has been archived by the owner on Nov 2, 2023. It is now read-only.

Add glossary entries for dialect and vocabulary. #484

Merged
merged 3 commits into from
Apr 26, 2023

Member

@Julian Julian commented Oct 11, 2022

Refs: json-schema-org/community#199

netlify bot commented Oct 11, 2022

Deploy Preview for condescending-hopper-c3ed30 ready!

Name Link
🔨 Latest commit 2f0a2f8
🔍 Latest deploy log https://app.netlify.com/sites/condescending-hopper-c3ed30/deploys/64499286c407ef0007f2646b
😎 Deploy Preview https://deploy-preview-484--condescending-hopper-c3ed30.netlify.app/learn/glossary

@@ -14,6 +14,15 @@ If you encounter a term you wish were defined here, please feel free to [file an

The entries on this page can be linked to via anchor links (e.g. `https://json-schema.org/learn/glossary.html#vocabulary`) when sharing a definition with others.

### dialect

A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
Member

I wonder if this definition is too tightly coupled to the Vocabulary System. First of all, the vocabulary system is likely to change in the future, possibly making this definition no longer make sense. Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies. You can consider draft-07 as a single required vocabulary, but it's not defined that way.

I'd probably define a dialect as the set of keywords that are understood in a schema. Those keywords being defined by vocabularies is an artifact of the Vocabulary System and not necessarily a defining property of a dialect.

Member Author

Thanks! I took this definition basically with only minor change from the spec:

A dialect is defined as a set of vocabularies and their required support identified in a meta-schema.

Obviously we can change this definition if/when we change it elsewhere, but are you still suggesting we do so beforehand? (Or have we already changed the definition elsewhere and I missed it?)

Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies.

Interesting -- do you have a source for this? I'm not doubting you, I've just not heard it before but possibly just wasn't paying attention well enough -- are you saying we should define dialects this way or that others already do? I guess it also occurred to me that this may be the case because we call $schema now a "dialect identifier" even though draft 7 did not have one, so retroactively we must call draft 7 a dialect, but equally well we could retroactively say the dialect is defined by a single vocabulary, no?

Contributor

it also occurred to me that this may be the case because we call $schema now a "dialect identifier" even though draft 7 did not have one, so retroactively we must call draft 7 a dialect.

Correct.

but equally well we could retroactively say the dialect is defined by a single vocabulary, no?

I don't know what this means.

Member Author

I don't know what this means.

The spec defines dialects in the way I quoted. Draft 7 has a dialect identifier and is a dialect and yet has no defined vocabularies. What is the resolution to the apparent contradiction?

Contributor

@Julian so does "retroactively say the dialect is defined by a single vocabulary" refer to considering draft-07 to be a single vocabulary? Austin started talking about multiple vocabularies all the way back in draft-wright-*-00 (a.k.a. not-draft-05) in 2016:

Other specifications [besides JSON Schema Core] define the vocabularies that perform assertions about validation, linking, annotation, navigation, and interaction.

The concept of JSON Schema vocabularies had its own section in the spec in draft-07.

We just didn't settle on the exact number and granularity of them and formalize vocabulary identification and selection until 2019-09. But since 2016 we have always spoken of there being at least two vocabularies (core and validation) in the standard dialect, with hyper-schema as a third vocabulary.

Member Author

@Julian Julian Oct 12, 2022

Got it, thanks, so you disagree (well, the spec you linked disagrees) with what Jason said then, i.e.

Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies.

that's not true, they indeed are defined by vocabularies, in the form that existed at the time?

@Julian so does "retroactively say the dialect is defined by a single vocabulary" refer to considering draft-07 to be a single vocabulary?

(Yes -- I wasn't aware and didn't check that the section you linked was present in draft 7, I took Jason's word for it).

What about draft 4, which doesn't mention the word vocabulary, though it's otherwise fairly similar structurally to draft 7, is it a dialect, and/or is https://json-schema.org/draft-04/schema its dialect URI?

All I'm trying to do is interpret definitions (and claims here) that are written down into a summary of the definition.

Contributor

@handrews handrews Oct 12, 2022

@Julian I think you're overthinking this a bit. We weren't exactly trying to harmonize the terms across all past drafts. Whether or not Jason's statement is true depends on whether he meant "vocabulary" in a general sense or "vocabulary" in a "thing you can put in $vocabulary" sense, which feels like splitting hairs to me.

Meta-schema URIs are (retroactively) dialect URIs. If you go back far enough (draft-04) a lot of things won't make sense, and we intentionally don't try to make it all make perfect sense. We don't even want people using draft-04 anyway.
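
To make the "meta-schema URIs are dialect URIs" point concrete, here is a minimal Python sketch. The lookup table and the helper name `dialect_of` are illustrative assumptions, not part of any JSON Schema implementation. It only shows that the value of `$schema` is enough to pick out a dialect, including drafts that pre-date 2019-09.

```python
# Illustrative only: a small lookup from `$schema` values to dialect names.
# The URIs are the published meta-schema identifiers; the short names on the
# right are labels chosen for this sketch.
KNOWN_DIALECTS = {
    "http://json-schema.org/draft-04/schema#": "draft-04",
    "http://json-schema.org/draft-07/schema#": "draft-07",
    "https://json-schema.org/draft/2019-09/schema": "2019-09",
    "https://json-schema.org/draft/2020-12/schema": "2020-12",
}


def dialect_of(schema, default=None):
    """Return the dialect a schema declares via `$schema`, if it is known."""
    if not isinstance(schema, dict):
        # Boolean schemas (true/false) cannot carry a `$schema` keyword.
        return default
    return KNOWN_DIALECTS.get(schema.get("$schema"), default)


draft7_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "string",
}
print(dialect_of(draft7_schema))
```

A schema without `$schema` yields the `default`, which mirrors the fact that the dialect then has to be determined by some other means.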

Member Author

What is the definition of a dialect? Is it different from what's here?

Member

@jdesrosiers jdesrosiers Oct 12, 2022

I didn't think we had used the word "vocabulary" before 2019-09, but I take @handrews's word for it. The thing I'm trying to avoid is confusion about the general concept of a vocabulary vs a vocabulary that's part of the vocabulary system. Readers should understand that this refers to the concept of vocabulary, not just the specific mechanism in the Vocabulary System.

Contributor

@Julian the term "dialect" emerged primarily over the course of a long series of discussions with OpenAPI, IIRC. I don't remember if we had started using it during 2019-09, but it really became a thing between 2019-09 and 2020-12. Of course, by the time 2020-12 went out I'd stepped away so I never got around to writing a formal definition into the spec.

I think I agree with @jdesrosiers that the concept of a vocabulary is more important than the mechanics, which are likely to change to some degree. I haven't really thought about that in terms of how we want to define these words. 🤔


A collection of related [keywords](keyword), grouped to facilitate re-use.

A vocabulary typically includes both a [meta-schema](#meta-schema) which formally defines the keywords it contains, as well as a prose document or specification which explains the semantics of its keywords in a way suitable for implementers and users of the vocabulary.
Contributor

This is not quite true. The formal definition of what keywords a vocabulary contains is its specification, whatever form that takes. We will eventually have a machine-readable format for this. The meta-schema describes the vocabulary's intended syntax, but you can have different meta-schemas for the same vocabulary. You usually wouldn't; if you want to tweak usage by, for example, forbidding the use of the keyword not, you'd probably do that in the dialect meta-schema.

But the important thing here is that vocabulary/keyword semantics are defined by the vocabulary specification. Meta-schemas, either vocabulary or dialect-level, describe the valid syntax. These two things are intentionally separated and able to vary somewhat independently.
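
As a rough illustration of that split, the sketch below writes a dialect meta-schema (as a Python dict) that keeps the 2020-12 vocabularies, and so their semantics, while narrowing the allowed syntax by forbidding `not`. The `$id` is hypothetical, and extending the standard meta-schema via `$ref` plus `$dynamicAnchor` is an assumption about how one might do this, not something this thread prescribes.

```python
import json

# Hypothetical dialect meta-schema; the `$id` below is invented for this sketch.
NO_NOT_META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/dialects/no-not/schema",
    "$dynamicAnchor": "meta",
    # Semantics: the same vocabularies the standard 2020-12 dialect declares.
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://json-schema.org/draft/2020-12/vocab/applicator": True,
        "https://json-schema.org/draft/2020-12/vocab/unevaluated": True,
        "https://json-schema.org/draft/2020-12/vocab/validation": True,
        "https://json-schema.org/draft/2020-12/vocab/format-annotation": True,
        "https://json-schema.org/draft/2020-12/vocab/content": True,
        "https://json-schema.org/draft/2020-12/vocab/meta-data": True,
    },
    # Syntax: start from the standard meta-schema's rules ...
    "$ref": "https://json-schema.org/draft/2020-12/schema",
    # ... then narrow them so that any use of `not` fails meta-validation.
    "properties": {"not": False},
}

# Serialise to JSON, the form in which a meta-schema would be published.
print(json.dumps(NO_NOT_META_SCHEMA, indent=2))
```

Nothing in the `$vocabulary` object changes here; only the syntactic constraints do, which is the independence between the two described above.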

Member Author

Got it, will tweak.

@Relequestual
Member

Given that the reviews and discussion point towards wanting to reframe how these words are defined, without the context of "the vocabulary system" (as it's likely to change a bit), maybe we should close this PR in favour of creating a new issue to cover these definitions and then opening a new PR?

@Julian
Member Author

Julian commented Nov 3, 2022

Happy to close it -- it's odd to me to not document terms we already use because we want to redefine them, but if y'all feel that's the right way to move forward sure.

@Julian Julian closed this Nov 3, 2022
@handrews
Contributor

handrews commented Nov 3, 2022

@Relequestual I have to say I agree with @Julian that it would make more sense to document these as-is. We don't know how long it will be before the next release is out, and given the lack of consensus around where these terms are going, we don't know if that release will redefine anything or just label the existing features as not-yet-stable. None of which helps people looking for definitions now. Glancing back over the discussion I thought it was going in a productive direction.

@jdesrosiers
Member

I don't think this discussion was about redefining what "dialect" means, it was about presenting it in an accurate and accessible way regardless of what words are used in the spec. In the definition provided in the spec, the word "vocabulary" can be interpreted in two ways. One way is arguably inaccurate and the other is inaccessible due to the mental gymnastics necessary to draw the appropriate conclusions. Either way, the glossary definition needs to be clear about how "vocabulary" should be interpreted or use a different word entirely.

If you interpret "vocabulary" as a Vocabulary System vocabulary, the definition is not inclusive of versions of JSON Schema that pre-date the Vocabulary System. For years we have been discussing any distinct flavor of JSON Schema as a dialect including past drafts and third-party flavors such as those used by OpenAPI 3.0 and MongoDB. Therefore, the definition in the spec doesn't reflect common usage and there should at least be wording in addition to the spec definition that covers this usage.

However, if you interpret "vocabulary" as the general concept of a vocabulary independent of the Vocabulary System, the spec definition isn't inaccurate, but you have to mentally jump through some hoops to recognize older drafts and third-party flavors as implied single vocabulary dialects. A glossary definition shouldn't require that much effort.

I'm definitely in favor of having something now, but I think the way it's currently written up is in terms of the Vocabulary System and I think that's an incomplete depiction of how the term "dialect" is commonly used.

@Julian
Member Author

Julian commented Nov 3, 2022

OK sounds like both of you are indeed happy with writing something down, though I don't feel I was able to get any specific suggestions about what definition we actually use today in conversation -- I'm happy to reopen and give it another shot myself I suppose.

@Julian Julian reopened this Nov 3, 2022
@handrews
Contributor

handrews commented Nov 3, 2022

I don't feel I was able to get any specific suggestions about what definition we actually use today in conversation -- I'm happy to reopen and give it another shot myself I suppose.

I can try to remember to take a look at it this/next week sometime. I had a lot going on in Sept and Oct and I think I just forgot.

@handrews
Contributor

handrews commented Nov 3, 2022

If you interpret "vocabulary" as a Vocabulary System vocabulary, the definition is not inclusive of versions of JSON Schema that pre-date the Vocabulary System.

Words change over time. We started talking about each section of the spec as a "vocabulary" before $vocabulary was around. We do not need to accommodate every way that anyone has ever used the word (or the word "dialect").

While the definition of "vocabulary" need not mention $vocabulary, it does need to be compatible with the current usage of the spec.

@jdesrosiers
Member

We do not need to accommodate every way that anyone has ever used the word (or the word "dialect").

I don't know what you're trying to say here. If I, as an expert, found this ambiguous, then it's definitely going to be ambiguous to casual users who are the target audience of this glossary. I think we agreed that these terms should be defined in a generic way that isn't coupled to the vocabulary system. Currently, the wording used here sounds very much like a description of the vocabulary system. I'm only advocating that the wording be tweaked to reduce ambiguity.

@handrews
Contributor

handrews commented Nov 7, 2022

@jdesrosiers the goal of these definitions should be to help people understand what these things are now, and how to use them now. Not to future-proof against possible later changes that we've not even agreed to do yet, much less agreed on their form. So yes, $vocabulary should be mentioned because that's how things work now. If there needs to be language about it being specific to 2019-09 and later, that's fine. But reading the glossary should make it clear what to look at in the spec or other documentation.

@jdesrosiers
Member

That's not the way I'm looking at it. It's not a matter of future-proofing anything, it's a matter of scope. I think the glossary should describe the concept, and UJS should describe the mechanism. As you say, the concept of a vocabulary pre-dates the vocabulary system and the introduction of the vocabulary system didn't change that concept. The concept of a dialect has existed prior to the vocabulary system as well, although we didn't have a name for the concept until later.

@handrews
Contributor

handrews commented Nov 7, 2022

Mentioning $vocabulary is hardly getting stuck in the weeds. Not everyone reads UJS. I do not agree with your push to remove information from other sources in favor of UJS.

The concepts of vocabularies and dialects are far more important as they exist in 2019-09 and later. They don't have much useful meaning prior to that, and trying to span all of those usages just muddies both the concept and the usage.

Member

@jdesrosiers jdesrosiers left a comment

@Julian, Here are the specific suggestions you requested.

@@ -14,6 +14,15 @@ If you encounter a term you wish were defined here, please feel free to [file an

The entries on this page can be linked to via anchor links (e.g. `https://json-schema.org/learn/glossary.html#vocabulary`) when sharing a definition with others.

### dialect

A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
Member

I think the part about vocabularies potentially being optional is a Vocabulary System concept. I'd remove it.

Suggested change
A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
A collection of [vocabularies](#vocabulary) that identify the set of keywords an implementation needs to understand to process a schema.

Contributor

What definition of "Vocabulary System" are you using that results in you excluding optionality? Have you built consensus around it? The optionality of vocabularies is hardly an implementation detail. And again, it is part of what is real now.

Member

@gregsdennis gregsdennis Nov 7, 2022

What about taking @jdesrosiers's wording but using "should" instead of "needs to"? We're not writing in spec-ese here. I think "should" can suggest optionality without explicitly saying it.

Contributor

I think the original wording was clear as plain language. Getting into "should" being a way to "suggest" something feels more spec-ese to me.

Dialects are identified by a URI, which [schemas](#schema) may then reference in their `$schema` [keyword](#keyword).
Doing so identifies the schema as being written in the dialect, and thereby indicates which keywords are usable within it, along with their intended meaning.

The JSON Schema specification [defines](https://json-schema.org/specification.html#general-purpose-meta-schema) a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case.
Member

I'm not sure this is the best reference. That section has three meta-schemas where only two describe dialects. I'd also hesitate to point to meta-schemas as dialects. The spec defines the dialect, not the meta-schema. It would, however, make sense to reference a dialect URI because that identifies the dialect.

Member Author

I think I was definitely unhappy with pointing there -- my intention definitely wasn't to indicate meta schemas == dialects nor these are 3 dialects, but it's a bit of a reflection on how "widely" we use the dialect term that nowhere really on that page do we prominently say "this is the 2020 dialect page".

Changing this to just point to the whole page (without the fragment) though, I guess that's better?

Contributor

@Julian we actually define the validation dialect in section 5 of the validation spec.

We haven't published a Hyper-Schema spec in a while so we don't have a current formal publication of that dialect. However, there is a section in the OpenAPI spec on dialects which could be referenced as an example.

Member

In the spec we say the metaschemas describe the dialects. We could use the same language here...

Suggested change
The JSON Schema specification [defines](https://json-schema.org/specification.html#general-purpose-meta-schema) a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case.
The JSON Schema specification defines a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case. These are [described](https://json-schema.org/specification.html#general-purpose-meta-schema) in meta-schemas.

Anyone can create and publish a vocabulary, and implementations generally will include facilities for extending themselves with support for additional vocabularies and their keywords.
The JSON Schema specification includes a number of vocabularies which cover each of the keywords it defines.

Vocabularies are identified by a URI which may be referenced via the `$vocabulary` keyword in order to enable the vocabulary within a [dialect](#dialect).
Member

I'd drop this line as Vocabulary System specific.

Contributor

And I would not. Listing the $vocabulary keyword makes things more concrete and makes it easier for people to find out more. While it is part of the "vocabulary system", that is what is currently in use, and is more important than the very vague usage of these terms pre-2019-09.
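
For readers who want the concrete picture, a minimal Python sketch of how `$vocabulary` ties vocabularies to a dialect follows. The meta-schema is a trimmed-down stand-in with a hypothetical `$id` and one hypothetical optional vocabulary, and the helper names `split_vocabularies` and `can_process` are invented for illustration. A `true` value marks a vocabulary an implementation must understand to process schemas in the dialect; `false` marks one it may not recognise and still proceed.

```python
# Trimmed-down stand-in for a dialect meta-schema (illustrative only).
META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/my-dialect/schema",  # hypothetical
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://json-schema.org/draft/2020-12/vocab/applicator": True,
        "https://json-schema.org/draft/2020-12/vocab/validation": True,
        "https://example.com/vocab/extras": False,  # hypothetical, optional
    },
}


def split_vocabularies(meta_schema):
    """Split the `$vocabulary` URIs into (required, optional) sets."""
    vocab = meta_schema.get("$vocabulary", {})
    required = {uri for uri, needed in vocab.items() if needed}
    optional = {uri for uri, needed in vocab.items() if not needed}
    return required, optional


def can_process(meta_schema, supported):
    """True if every required vocabulary is among the supported URIs."""
    required, _ = split_vocabularies(meta_schema)
    return required <= set(supported)


SUPPORTED = {
    "https://json-schema.org/draft/2020-12/vocab/core",
    "https://json-schema.org/draft/2020-12/vocab/applicator",
    "https://json-schema.org/draft/2020-12/vocab/validation",
}
print(can_process(META_SCHEMA, SUPPORTED))
```

The hypothetical `extras` vocabulary is absent from `SUPPORTED`, yet processing may continue because the dialect marks it as optional.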

@jdesrosiers
Member

I don't feel strongly enough about this to debate it any further. @Julian, I've given you my thoughts and I support whatever direction you want to go with this PR.

@Julian
Member Author

Julian commented Nov 7, 2022

All good, sorry, I was working on some other things today so didn't chime in yet, but I very much appreciate the suggestions. I'll see what you've got and I'm sure we'll get something workable!

Member

@Relequestual Relequestual left a comment

I have made one suggestion for change, but I am otherwise happy.

I think it's useful to remember this is to document how things CURRENTLY are. How they might be later generally should have no bearing.

@Julian
Member Author

Julian commented Apr 26, 2023

I've given one last shot at trying to make everyone happy here -- I think folks mostly were on board here at least with "disagree but ok"'ing this change -- given I've just tweaked it again I'll give a bit of time (days) to see if anyone pipes up again, and if not will rely on the previous OK's (and always fix something later if we care to).


### vocabulary

A tightly related collection of related [keywords](keyword), grouped to facilitate re-use.
Member

Suggested change
A tightly related collection of related [keywords](keyword), grouped to facilitate re-use.
A tightly related collection of [keywords](keyword), grouped to facilitate re-use.

or

Suggested change
A tightly related collection of related [keywords](keyword), grouped to facilitate re-use.
A collection of tightly related [keywords](keyword), grouped to facilitate re-use.

Maybe reduce redundancy a little?

Member Author

Thanks, fixed!

Member

@gregsdennis gregsdennis left a comment

I'm happy with this. Just the one grammatical suggestion.

Even though this isn't fully consistent with the current spec definitions, it
matches colloquial uses, and perhaps plans for future changes to these two
concepts, without sacrificing accuracy today.
@Julian Julian force-pushed the dialect-vocabulary branch from a0611aa to 2f0a2f8 Compare April 26, 2023 21:07
@Julian Julian linked an issue Apr 26, 2023 that may be closed by this pull request
@Julian Julian linked an issue Apr 26, 2023 that may be closed by this pull request
Member

@jdesrosiers jdesrosiers left a comment

lgtm

@Julian
Member Author

Julian commented Apr 26, 2023

Thanks both!

@Julian Julian merged commit 9b0683d into main Apr 26, 2023
@Julian Julian deleted the dialect-vocabulary branch April 26, 2023 23:40
Julian added a commit to json-schema-org/website that referenced this pull request Apr 27, 2023
Brings the glossary back in sync.
Development

Successfully merging this pull request may close these issues.

Add a glossary entry for vocabulary
Should the glossary rename "draft" to "dialect"?
5 participants