"$id" as an indicator of embedded documents #719
Comments
@handrews thanks for writing this up! This is a bit different than what I had in mind, but the important bits are there. There's one problem however. {
"type": "object",
"properties": {
"aaa": { "$ref": "/common#/definitions/twentieth-century-year" }
}
} {
"type": "object",
"properties": {
"aaa": {
"$id": "/common#/definitions/twentieth-century-year",
"definitions": {
"year": { "type": "string", "pattern": "^\\d{4}$" },
"twentieth-century-year": { "$ref": "#/definitions/year", "pattern": "^19" }
}
}
}
} In the second example, the document ( |
Sort of. Having {
"type": "object",
"properties": {
"name": {
"$id": "/person-name",
"$anchor": "name",
"type": "string"
},
"age": {
"$anchor": "age",
"type": "number"
}
}
}
#/properties/name == /person-name == /person-name#name
#/properties/name != #name
#/properties/age == #age |
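The equivalences listed above can be checked with a small sketch. It assumes a naive walk that treats every nested object as a possible subschema, which a real implementation would not do (a property literally named "$id" would confuse it). The function name `index_identifiers` is hypothetical and is not taken from any implementation mentioned in this thread.

```python
from urllib.parse import urljoin


def index_identifiers(schema, base="", pointer="", index=None):
    """Map each document URI and '<document URI>#<anchor>' to its JSON Pointer
    location in the outermost document. Illustrative only."""
    if index is None:
        index = {}
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            # "$id" starts an embedded document; later anchors belong to it.
            base = urljoin(base, schema["$id"])
            index[base] = pointer
        if isinstance(schema.get("$anchor"), str):
            index[base + "#" + schema["$anchor"]] = pointer
        for key, value in schema.items():
            # Escape the key as a JSON Pointer reference token (RFC 6901).
            token = key.replace("~", "~0").replace("/", "~1")
            index_identifiers(value, base, pointer + "/" + token, index)
    elif isinstance(schema, list):
        for position, value in enumerate(schema):
            index_identifiers(value, base, f"{pointer}/{position}", index)
    return index


schema = {
    "type": "object",
    "properties": {
        "name": {"$id": "/person-name", "$anchor": "name", "type": "string"},
        "age": {"$anchor": "age", "type": "number"},
    },
}
index = index_identifiers(schema)
```

Here "name" is registered only under /person-name, so "#name" relative to the outer document finds nothing, while "age" stays an anchor of the outer document.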
@jdesrosiers thanks! I'll start with the I agree with you that there's not much point to the use case. I brought it up for two reasons:
The relevant behavior can be demonstrated with: {
"$id": "/person-name",
"$anchor": "name",
"type": "string",
"additionalProperties": {
"$ref": "#name"
}
} which is identified by all of: The use of A As for why someone might do that... I can't construct a case where I would, but it doesn't hurt anything. More importantly, it's almost certainly harder to carve out an exception for Lots of things in JSON Schema are not useful (e.g. So I don't see any good reason for preventing it. It's straightforward behavior. |
That I agree that there is no reason to make this illegal. It causes no conflicts or contradictions. It's just not useful. It's something for a linter to call your attention to. Nothing more. |
OK taking a closer look at this example: {
"type": "object",
"properties": {
"aaa": {
"$id": "/common#/definitions/twentieth-century-year",
"definitions": {
"year": { "type": "string", "pattern": "^\\d{4}$" },
"twentieth-century-year": { "$ref": "#/definitions/year", "pattern": "^19" }
}
}
}
} It looks like rather than the Is that correct? If so, while I understand your desire for I'll stop here to wait for a reply in case I am way off base. |
Yes, that's correct. And, yes, it's not going to be immediately clear to a lay person what's going on here. But, all you have to understand is the concepts of document and document-value and you understand how everything works. From the schema to However, I'm not too worried about how human readable this is because embedding documents should be the domain of tooling. Schema authors wouldn't write something like that by hand. It's easy for programs to understand and that's the important thing for this feature. Furthermore, in the use-cases where schema authors would want to use |
@jdesrosiers OK so I see where you're coming from with this. However, I don't see this approach as directly viable with JSON Schema. But we may be able to end up with compatible behavior. Regarding:
Well... it's syntactically backwards compatible. And it's technically semantically backwards compatible because it relies on a syntax with explicitly undefined semantics in recent drafts. But, technically, terminating the process, printing a zen koan to stdout, or playing Beethoven's 9th over the speakers are all equally technically compatible 😛 I do believe that it violates the spirit of the semantics of And it absolutely violates the way most implementations would treat this, which is to ignore the fragment, or maybe use it to calculate JSON Pointer fragments for subschemas, which is one possible interpretation although not directly supported by either RFC 3986 or RFC 6901. I'm not aware of anyone that interprets a JSON Pointer fragment in Put this together with the fact that JSON Schema already deviates from a generic So, if (in JSON Schema), we completely forbid fragments in
Note that this only goes in one direction. It is not the case that every AFAICT, this means that the behavior that would be allowed in JSON Schema is essentially identical to what your system supports. But there would be substantial behavior in your system that is not allowed in JSON Schema. Does this make sense? Does it seem like a reasonable way forward? |
To summarize what I see as the benefits:
@epoberezkin you put a frowny face on this. Do you have any clear feedback on the idea? Otherwise, I am disregarding vague unspecified disapproval. @Julian @gregsdennis @johandorland @erosb @KayEss input from implementors would be most welcome on this, given that Based on my past experience implementing |
It sounds positive to me.
This bit I'm not so sure I understand. The proposal isn't that |
There's no contradiction. There are two concepts: document and value. The fragment part of a URI does not identify a document.
How a fragment is interpreted is defined by the media type and we define the media type, so no problem there. HTML does the same, so we'd be in good company. This isn't even new behavior for JSON Schema. This behavior is already defined for
Huh? I wouldn't expect anyone to have implemented this change before it was proposed.
That's too bad. Embedding a document is far more useful than extracting an embedded document.
I think that's accurate. However, I see no benefit of the no fragment constraint. It makes things more complicated and less powerful. (This section is a bit of a tangent. If we want to discuss this further, I suggest we take it to slack)
|
@KayEss Thanks for your feedback.
Good point! I hadn't thought of that.
No,
Very little of how it works would change. It's more a change in the way we think about it. Any part of the JSON Schema with an
Yes. I see no reason to change this. |
I realized that we keep talking about my proposal vs the slightly modified proposal @handrews has written up here. So, for those who missed it on Slack, here is how I introduced it.
I've been working on a generic browser concept based on JSON Reference. It's still very early stages, but I think the model I came up with is a candidate for a solution to the issues JSON Schema has with
I've found this model easy and efficient to implement. It has strong parallels to existing web constructs. It simplifies the concepts without losing anything of value.
One of the goals of this model is to fully decouple JSON Pointer, JSON Reference, and JSON Schema. Each can be implemented independently of one another. I wrote a JSON Schema-ish validation proof of concept that builds on JSON Reference (rather than JSON). This implementation has full support for
JSON Reference for JSON Schema Implementors
The features of JSON Reference are very similar to the features of
Documents vs Values
All JSON Reference documents have a "value". The fragment part of the document's
If the fragment is empty, the value of the document is the whole document. If the fragment starts with a
If the fragment is not a JSON Pointer, then it's an anchor fragment. The
The value of a document whose URI fragment does not point to a valid part of the
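The three fragment cases described above (empty, JSON Pointer, anchor) could be told apart roughly like this. It is only a sketch: the assumption that a JSON Pointer fragment is recognised by a leading "/" is mine, and the name `fragment_kind` is hypothetical.

```python
from urllib.parse import unquote, urldefrag


def fragment_kind(uri):
    """Classify the fragment of a URI reference under the model described above."""
    # urldefrag splits off the fragment; unquote undoes any percent-encoding.
    fragment = unquote(urldefrag(uri).fragment)
    if fragment == "":
        return "whole-document"
    if fragment.startswith("/"):  # assumed marker of a JSON Pointer fragment
        return "json-pointer"
    return "anchor"
```

For example, `fragment_kind("/common#/definitions/year")` is classified as a JSON Pointer, while `fragment_kind("#name")` is classified as an anchor.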
|
@KayEss my intention was that The exact syntax is specified in section 8.2.3 of the core spec. The meta-schema for {
"type": "string",
"pattern": "^[A-Z][A-Za-z0-9_.:-]*$"
} Which is not I18N friendly and we should perhaps address that, but that's what we would get from the currently defined syntax. |
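To see what that pattern accepts, here is a quick check using the regular expression exactly as quoted in the comment above. The helper name `is_valid_anchor` is hypothetical, and `re.search` is used on the assumption that it approximates how the "pattern" keyword is applied (unanchored unless the expression anchors itself).

```python
import re

# Pattern copied verbatim from the meta-schema fragment quoted above.
ANCHOR_PATTERN = re.compile(r"^[A-Z][A-Za-z0-9_.:-]*$")


def is_valid_anchor(name):
    """Return True if the name matches the quoted "$anchor" pattern."""
    return ANCHOR_PATTERN.search(name) is not None
```

A name with non-ASCII letters such as "Nämé" is rejected, which is the I18N limitation mentioned above.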
Right. You were, I thought, claiming compatibility with the current
This does not hold up.
That is not my goal, and I do not think it is a good goal for JSON Schema. I think it's a great separate project, hence my effort to retain compatibility with it. But it makes schemas much more difficult to reason about. My goal is to simplify
I strongly feel that it is the exact opposite. @KayEss noting that fragments are a special case in their implementation would seem to support that. I don't want this to be another big back and forth between these two proposals. This project has seen enough of that. But I also want people to be able to find this more easily than by digging through Slack history. @jdesrosiers I suggest that we close this, and I'll re-file mine, and you can re-file yours. Each can be discussed on its own merits, and if either garners sufficient support, we'll add it to the spec (or create a separate spec, I suppose). |
Firstly, it would be good to screen this idea against the problem we are trying to solve here, which I couldn’t work out. Once we understand the problem, we can see whether it can be solved with the existing vocabulary or whether an extension is needed. Secondly, I am not sure why $anchor is needed if you already can use $id as an anchor, according to the current spec. |
@epoberezkin the problem is that people constantly complain about how complicated If you don't agree with that problem, that's fine, I'm not interested in convincing you. |
Right. I thought this proposal is about actually making $id more complex... Maybe the fact that there are two proposals is confusing. |
@handrews I wrote that I don’t understand the problem. I may agree or disagree with the solution, but if anybody has a problem it is a fact. Whether the problem should be solved is another matter entirely. There is a beautiful post by @kellan on screening new tech - it definitely applies to how all solution ideas should be screened to avoid feature bloat: |
Yeah, that's why I figure we should close this and re-file separately. To give a less snarky answer to your prior question, I'm pretty sure @jdesrosiers and I are attempting to solve different problems. So the solutions don't compare well anyway. If you (and others) don't think the problem I see is a real problem, then obviously the solution is not compelling and we won't do it. I'm not really interested in selling anyone on the problem- you either see it or you don't, and that part is more interesting to me than convincing people how to look at it. |
@handrews honestly, all I am saying is that it would be good to understand the problem. All problems are real, it’s not for me to judge. If and how they should be solved is another question. I cannot reason about the solution if I don’t understand the problem. You may have discussed it on Slack but I do not see it in this ticket. |
@epoberezkin I am (perhaps surprisingly) not trying to be difficult here. To me, the problem is blatantly obvious. If it is not blatantly obvious to anyone else, that is interesting. If pretty much everyone who comes across this is like "why bother?" then that's all I need to know, really. |
I am also just burnt out. I saw a solution to It's too demoralizing to fight this. |
I've had my fair share of struggles with I don't fully grasp all changes conceptually just by reading this issue in a few minutes, but as I currently understand them in my own words the proposal is to:
I like the removal of shadowing. I don't think many people use it in practice anyway, but having it makes parsing harder. I'm not particularly fond of adding My current problem with
Lastly I don't think any of these proposals will simplify |
@johandorland I've heard the "just I think I'm going to close this and re-open one just to forbid base URI shadowing. I don't even know why we thought that was necessary, TBH. I suspect we had reason to believe someone might be using it, and just wanted to clarify it in examples. We did not add that feature, we were just trying to clarify what we thought was already there based on prior unclear wording at least back to draft-04. @jdesrosiers if you would like to re-file your whole proposal (basically dump #719 (comment) into a new issue), I would highly encourage that. I may or may not re-file the no-fragment+ |
Ahh, I see the confusion now. I was claiming backwards compatibility. You can do everything you used to be able to do and more.
The biggest problem is that I've been using the words "my proposal", but I never intended to propose anything. My intention was to share what I was doing and let you all decide which bits (if any) you want to incorporate into JSON Schema. All of my comments in this issue have been in the spirit of clarifying what my implementation does. I'll create a new issue describing the model my implementation uses and I'll refrain from calling it a "proposal" 😉. @handrews almost entirely understands what I'm doing. There are a few things we disagree on, but there is also still something I'm not communicating well enough. Everyone who thinks this model is more complicated than what we have now, you're missing something. I encourage you to follow the issue I will be creating shortly. I'm going to take another stab at explaining it better. |
@jdesrosiers thank you
That is true, there are various solutions to that.
You will not solve this by changing $id - $ref cannot be inlined when schemas are recursive |
oh and apparently some tool (not the one in the repo I’m linking- something else mentioned there) generates things like why do people want fragments in (ノ°Д°)ノ︵ ┻━┻ |
Err... Because they can? The most common use case I’ve seen in many schemas is to insert “$id”: “#name” in a definition to then use “$ref”: “#name” instead of “#/definitions/name”. The case you’ve shown is probably to simplify visual navigation in large schemas so you can see where you are, which is problematic otherwise. But the tools could simply put it in “$comment” or any other custom keyword - it doesn’t change any addressing, so doesn’t have to be $id. |
Without reading [m]any of the comments, I think this is quite sensible at first glance. Ideally, I would have a property like
It's like having
Any case where you get to attach a name to things, you have to make an index of all the named things. (Until quantum computing becomes a thing, at least.)
iirc it's something I more-or-less invented after surveying draft-4 implementations. (There's a little more to it than that, but I'd have to dig up notes to be sure.) |
Idea from @jdesrosiers on slack (with minor tweaks from me, and probably a misinterpretation or two, but this is at least good enough to record the general concept):
Instead of discussing $id as primarily assigning URIs to schema objects, shift the focus to schema documents. For reasons that will be apparent later, also say that $id URI references MUST NOT contain a fragment.
The key idea is that schema documents can be embedded in other schema documents. $id is used to indicate an embedded document, and the schema object containing that $id is considered to be the root schema of the embedded document. Whether it is standalone or embedded, a schema document's base URI is the value of $id in its root schema.
An embedded document's $id can be a relative URI reference, in which case it is resolved against the base URI of the containing schema document.
The contents of embedded documents cannot be referenced with a JSON Pointer fragment attached to the containing document's base URI:
In this example, the schema that is the value of "items" can be referenced as https://example.com/inner#/items. It cannot be referenced as https://example.com/outer#/items, which is a change from the current behavior.
The reason for this may be more intuitive when considering this functionally identical schema:
This is essentially the same schema, but with "inner" included by reference rather than directly embedded. In this example, it is clear that https://example.com/outer#/items is meaningless. There is no such location. Embedding the schema does not make that URI meaningful; essentially, JSON Pointer fragment evaluation cannot cross into an embedded document.
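The rule that JSON Pointer evaluation cannot cross into an embedded document can be sketched as below. The schema is a hypothetical stand-in (its shape, including the use of "additionalProperties", is an assumption and not the issue's original example), and `resolve_pointer` is an illustrative name. The sketch also assumes that a pointer may land on the root of an embedded document but may not descend past it.

```python
def resolve_pointer(document, pointer):
    """Resolve a JSON Pointer inside one schema document, refusing to
    descend into any embedded document (an object with its own "$id")."""
    node = document
    tokens = pointer.split("/")[1:] if pointer else []
    for token in tokens:
        if node is not document and isinstance(node, dict) and "$id" in node:
            raise LookupError("pointer crosses into embedded document " + node["$id"])
        # Undo JSON Pointer escaping (RFC 6901): "~1" is "/", "~0" is "~".
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical outer document with an embedded "inner" document.
outer = {
    "$id": "https://example.com/outer",
    "type": "object",
    "additionalProperties": {
        "$id": "https://example.com/inner",
        "type": "array",
        "items": {"type": "string"},
    },
}
inner = outer["additionalProperties"]
```

With these definitions, `resolve_pointer(inner, "/items")` finds the "items" schema, while `resolve_pointer(outer, "/additionalProperties/items")` raises `LookupError` because the path would pass through the embedded document's root.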
In this approach, $id is always indicating the base URI of a document. As fragments are stripped from base URIs, it does not make sense to allow fragments in $id when used this way. In fact, RFC 3986 section 6 states:
Therefore, the only time fragments make sense in $id is in the plain name fragment definition form: {"$id": "#foo"}.
This form does not work when considering that $id indicates an embedded document. Because fragments are removed from a URI before it is used as a base, the base URI of such an embedded document would be identical to that of its containing schema document. This is obviously an incorrect usage of URIs, and should not be allowed.
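The collision described above is easy to reproduce with Python's standard URI handling. This is a sketch only; `base_uri_after` is a hypothetical helper name, and the example URIs follow the ones used in this issue.

```python
from urllib.parse import urldefrag, urljoin


def base_uri_after(containing_base, id_value):
    """Base URI established by an "$id": resolve the reference against the
    containing base, then strip the fragment before using it as a base."""
    resolved = urljoin(containing_base, id_value)
    return urldefrag(resolved).url


same = base_uri_after("https://example.com/outer", "#foo")
different = base_uri_after("https://example.com/outer", "inner")
```

Here `same` comes back equal to the containing document's base URI, which is the problem with treating {"$id": "#foo"} as an embedded document, while `different` is a distinct base.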
While all of the behavior of $id as specified in draft-07 is simply a result of applying RFC 3986 rules to the hierarchical schema structure, most users seem to view the fragment definition form and the base URI change form as separate features. Since this form is not compatible with $id as an embedded document identifier, and many users view it as a different feature anyway, let's drop this form.
In its place, the "$anchor" keyword defines plain name fragments. Note that its value is simply the plain name, without the # fragment: {"$anchor": "foo"} is the equivalent of the former {"$id": "#foo"}.
.This also allows for
{"$id": "https://example.com/foo", "$anchor": "bar"}
to replace
{"$id": "https://example.com/foo#bar"}
which, as far as I can tell, is currently a valid use of $id, although apparently I put a CREF in draft-07 wondering whether or how it should actually work. But if we split that function off into an $anchor keyword and outright forbid fragments in $id, this is no longer a weird corner case. Each keyword functions separately and unambiguously.
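As a rough illustration of the split, the following sketch rewrites a schema object that uses the old fragment forms of $id into the $id plus $anchor form. The function name `split_id` is hypothetical, and leaving JSON Pointer fragments untouched is my assumption, since the text above only covers plain name fragments.

```python
from urllib.parse import urldefrag


def split_id(schema):
    """Return a shallow copy with any plain name fragment in "$id" moved to "$anchor"."""
    result = dict(schema)
    identifier = result.get("$id")
    if not isinstance(identifier, str):
        return result
    url, fragment = urldefrag(identifier)
    if fragment == "" or fragment.startswith("/"):
        # No fragment, or a JSON Pointer fragment: not handled by this sketch.
        return result
    if url:
        result["$id"] = url
    else:
        del result["$id"]  # the old value was only a fragment, e.g. "#foo"
    result["$anchor"] = fragment
    return result
```

Under these assumptions {"$id": "#foo"} becomes {"$anchor": "foo"}, and {"$id": "https://example.com/foo#bar"} becomes {"$id": "https://example.com/foo", "$anchor": "bar"}.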