This repository has been archived by the owner on Nov 2, 2023. It is now read-only.

Add glossary entries for dialect and vocabulary. #484

Merged
merged 3 commits into from
Apr 26, 2023
25 changes: 25 additions & 0 deletions learn/glossary.md
@@ -14,6 +14,15 @@ If you encounter a term you wish were defined here, please feel free to [file an

The entries on this page can be linked to via anchor links (e.g. `https://json-schema.org/learn/glossary.html#vocabulary`) when sharing a definition with others.

### dialect

A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
Member

I wonder if this definition is too tightly coupled to the Vocabulary System. First of all, the Vocabulary System is likely to change in the future, possibly making this definition no longer make sense. Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies. You can consider draft-07 as a single required vocabulary, but it's not defined that way.

I'd probably define a dialect as the set of keywords that are understood in a schema. Those keywords being defined by vocabularies is an artifact of the Vocabulary System and not necessarily a defining property of a dialect.

Member Author

Thanks! I took this definition, with only minor changes, from the spec:

> A dialect is defined as a set of vocabularies and their required support identified in a meta-schema.

Obviously we can change this definition if/when we change it elsewhere, but are you still suggesting we do so beforehand? (Or have we already changed the definition elsewhere and I missed it?)

> Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies.

Interesting -- do you have a source for this? I'm not doubting you, I've just not heard it before but possibly just wasn't paying attention well enough -- are you saying we should define dialects this way or that others already do? I guess it also occurred to me that this may be the case because we call $schema now a "dialect identifier" even though draft 7 did not have one, so retroactively we must call draft 7 a dialect, but equally well we could retroactively say the dialect is defined by a single vocabulary, no?

Contributor

> it also occurred to me that this may be the case because we call $schema now a "dialect identifier" even though draft 7 did not have one, so retroactively we must call draft 7 a dialect.

Correct.

> but equally well we could retroactively say the dialect is defined by a single vocabulary, no?

I don't know what this means.

Member Author

> I don't know what this means.

The spec defines dialects in the way I quoted. Draft 7 has a dialect identifier and is a dialect and yet has no defined vocabularies. What is the resolution to the apparent contradiction?

Contributor

@Julian so does "retroactively say the dialect is defined by a single vocabulary" refer to considering draft-07 to be a single vocabulary? Austin started talking about multiple vocabularies all the way back in draft-wright-*-00 (a.k.a. not-draft-05) in 2016:

> Other specifications [besides JSON Schema Core] define the vocabularies that perform assertions about validation, linking, annotation, navigation, and interaction.

The concept of JSON Schema vocabularies had its own section in the spec in draft-07.

We just didn't settle on the exact number and granularity of them and formalize vocabulary identification and selection until 2019-09. But since 2016 we have always spoken of there being at least two vocabularies (core and validation) in the standard dialect, with hyper-schema as a third vocabulary.

Member Author

@Julian Julian Oct 12, 2022

Got it, thanks, so you disagree (well, the spec you linked disagrees) with what Jason said then, i.e.

> Second, releases such as draft-07 that pre-date the Vocabulary System are considered dialects as well, but aren't defined by vocabularies.

that's not true, they indeed are defined by vocabularies, in the form that existed at the time?

> @Julian so does "retroactively say the dialect is defined by a single vocabulary" refer to considering draft-07 to be a single vocabulary?

(Yes -- I wasn't aware and didn't check that the section you linked was present in draft 7, I took Jason's word for it).

What about draft 4, which doesn't mention the word vocabulary, though it's otherwise fairly similar structurally to draft 7, is it a dialect, and/or is https://json-schema.org/draft-04/schema its dialect URI?

All I'm trying to do is interpret definitions (and claims here) that are written down into a summary of the definition.

Contributor

@handrews handrews Oct 12, 2022

@Julian I think you're overthinking this a bit. We weren't exactly trying to harmonize the terms across all past drafts. Whether or not Jason's statement is true depends on whether he meant "vocabulary" in a general sense or "vocabulary" in a "thing you can put in $vocabulary" sense, which feels like splitting hairs to me.

Meta-schema URIs are (retroactively) dialect URIs. If you go back far enough (draft-04) a lot of things won't make sense, and we intentionally don't try to make it all make perfect sense. We don't even want people using draft-04 anyway.

Member Author

What is the definition of a dialect? Is it different from what's here?

Member

@jdesrosiers jdesrosiers Oct 12, 2022

I didn't think we had used the word "vocabulary" before 2019-09, but I take @handrews's word for it. The thing I'm trying to avoid is confusion about the general concept of a vocabulary vs a vocabulary that's part of the vocabulary system. Readers should understand that this refers to the concept of vocabulary, not just the specific mechanism in the Vocabulary System.

Contributor

@Julian the term "dialect" emerged primarily over the course of a long series of discussions with OpenAPI, IIRC. I don't remember if we had started using it during 2019-09, but it really became a thing between 2019-09 and 2020-12. Of course, by the time 2020-12 went out I'd stepped away so I never got around to writing a formal definition into the spec.

I think I agree with @jdesrosiers that the concept of a vocabulary is more important than the mechanics, which are likely to change to some degree. I haven't really thought about that in terms of how we want to define these words. 🤔

Member

I think the part about vocabularies potentially being optional is a Vocabulary System concept. I'd remove it.

Suggested change
A collection of [vocabularies](#vocabulary), along with an indication of whether supporting each vocabulary is required to process schemas written in the dialect.
A collection of [vocabularies](#vocabulary) that identify the set of keywords an implementation needs to understand to process a schema.

Contributor

What definition of "Vocabulary System" are you using that results in you excluding optionality? Have you built consensus around it? The optionality of vocabularies is hardly an implementation detail. And again, it is part of what is real now.

Member

@gregsdennis gregsdennis Nov 7, 2022

What about taking @jdesrosiers's wording but using "should" instead of "needs to"? We're not writing in spec-ese here. I think "should" can suggest optionality without explicitly saying it.

Contributor

I think the original wording was clear as plain language. Getting into "should" being a way to "suggest" something feels more spec-ese to me.


Dialects are identified by a URI, which [schemas](#schema) may then reference in their `$schema` [keyword](#keyword).
Doing so identifies the schema as being written in the dialect, and thereby indicates which keywords are usable within it, along with their intended meaning.
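
As a rough illustration of the paragraph above, the following Python sketch reads the dialect URI that a schema declares through `$schema`. The helper name `dialect_uri` and its `default` parameter are hypothetical, and the sketch only inspects the top-level keyword; a real implementation would also need to decide what to do when `$schema` is absent or appears in an embedded schema resource.

```python
def dialect_uri(schema, default=None):
    """Return the dialect URI a schema declares via "$schema", else `default`.

    `dialect_uri` is a hypothetical helper for illustration, not part of any
    particular JSON Schema library.
    """
    if isinstance(schema, dict):
        return schema.get("$schema", default)
    # Boolean schemas (true / false) cannot carry a "$schema" keyword.
    return default


# Two schemas with the same keywords, written in different dialects.
schema_2020_12 = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "string",
}
schema_draft_07 = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "string",
}

print(dialect_uri(schema_2020_12))
print(dialect_uri(schema_draft_07))
# A boolean schema falls back to whatever default the caller chooses.
print(dialect_uri(True, default="https://json-schema.org/draft/2020-12/schema"))
```

An implementation would typically use the returned URI to choose which keywords it recognizes and how it interprets them.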

The JSON Schema specification [defines](https://json-schema.org/specification.html#general-purpose-meta-schema) a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case.
Member

I'm not sure this is the best reference. That section has three meta-schemas where only two describe dialects. I'd also hesitate to point to meta-schemas as dialects. The spec defines the dialect, not the meta-schema. It would, however, make sense to reference a dialect URI because that identifies the dialect.

Member Author

I think I was definitely unhappy with pointing there -- my intention definitely wasn't to indicate meta schemas == dialects nor these are 3 dialects, but it's a bit of a reflection on how "widely" we use the dialect term that nowhere really on that page do we prominently say "this is the 2020 dialect page".

Changing this to just point to the whole page (without the fragment) though, I guess that's better?

Contributor

@Julian we actually define the validation dialect in section 5 of the validation spec.

We haven't published a Hyper-Schema spec in a while so we don't have a current formal publication of that dialect. However, there is a section in the OpenAPI spec on dialects which could be referenced as an example.

Member

In the spec we say the metaschemas describe the dialects. We could use the same language here...

Suggested change
The JSON Schema specification [defines](https://json-schema.org/specification.html#general-purpose-meta-schema) a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case.
The JSON Schema specification defines a number of dialects, each of which enable vocabularies suitable for the dialect's specific use case. These are [described](https://json-schema.org/specification.html#general-purpose-meta-schema) in meta-schemas.


### draft

An individual release of the JSON Schema specification.
@@ -70,3 +79,19 @@ The rules constituting which schemas are conformant, as well as the rules govern
Strictly speaking, according to the specification, schemas are themselves JSON documents, though it is somewhat common for them to be authored or maintained in other languages which are easily translated to JSON, such as YAML.

In recent [drafts](#draft) of the specification, a schema is either a JSON object or a JSON boolean value.

### vocabulary

A collection of related [keywords](#keyword), grouped to facilitate re-use.

A vocabulary is specified by a prose document or specification which explains the semantics of its keywords in a way suitable for implementers and users of the vocabulary.
It often also includes a [meta-schema](#meta-schema) (or multiple metaschemas) which define the syntax of its keywords.

Anyone can create and publish a vocabulary, and implementations generally will include facilities for extending themselves with support for additional vocabularies and their keywords.
The JSON Schema specification includes a number of vocabularies which cover each of the keywords it defines.

Vocabularies are identified by a URI which may be referenced via the `$vocabulary` keyword in order to enable the vocabulary within a [dialect](#dialect).
Member

I'd drop this line as Vocabulary System specific.

Contributor

And I would not. Listing the $vocabulary keyword makes things more concrete and makes it easier for people to find out more. While it is part of the "vocabulary system", that is what is currently in use, and is more important than the very vague usage of these terms pre-2019-09.
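
To make the `$vocabulary` mechanism under discussion concrete, here is a minimal Python sketch of how a meta-schema might declare the vocabularies of a dialect, and how an implementation could tell which required ones it lacks. The `$vocabulary` keyword and the `json-schema.org` vocabulary URIs come from the 2020-12 specification; the `example.com` URIs, the `units` vocabulary, and the function `missing_required_vocabularies` are assumptions made up for illustration.

```python
# Hypothetical meta-schema for a custom dialect. Each "$vocabulary" entry maps
# a vocabulary URI to a boolean: True means support is required, False means
# the vocabulary is optional. The example.com URIs are invented.
CUSTOM_META_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/meta/custom-dialect",
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://json-schema.org/draft/2020-12/vocab/applicator": True,
        "https://json-schema.org/draft/2020-12/vocab/validation": True,
        "https://example.com/vocab/units": False,
    },
}


def missing_required_vocabularies(meta_schema, supported):
    """Return the required vocabulary URIs that `supported` does not contain."""
    vocabularies = meta_schema.get("$vocabulary", {})
    return sorted(
        uri
        for uri, required in vocabularies.items()
        if required and uri not in supported
    )


# An implementation that knows only the core and applicator vocabularies.
SUPPORTED = {
    "https://json-schema.org/draft/2020-12/vocab/core",
    "https://json-schema.org/draft/2020-12/vocab/applicator",
}

missing = missing_required_vocabularies(CUSTOM_META_SCHEMA, SUPPORTED)
print(missing)
```

In this sketch the unsupported optional `units` vocabulary is not reported, while the unsupported required validation vocabulary is, which is the distinction the "required" wording in the dialect entry is meant to capture.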


#### See also

* [`json-schema-vocabularies`](https://github.com/json-schema-org/json-schema-vocabularies), a repository which collects known third-party JSON Schema vocabularies