Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code generation #4

Closed
wants to merge 8 commits into from

Conversation

chriskapp
Copy link
Collaborator

A first initial draft of the code generation specification

@handrews
Copy link
Contributor

handrews commented Aug 27, 2017

Thanks for getting this started! [EDIT: Many points in this comment and later ones are now out-of-date due to the draft-07 and draft-08 work having shifted how I think about vocabularies. I've made notes such as this one for key changes.]

While we're still waiting for some other team members to weigh in, let me provide some initial acceptance criteria:

  • Please remove all duplicate information from JSON Schema Core, Validation, and/or Hyper-Schema. Those should be normative references, so you may use terms and keywords from there simply by noting that the keywords are defined in whichever of the three specs is appropriate.
    • We definitely do not want to re-state validation rules, as they will just get out of sync.
    • Keep the focus on restrictions. That is what this spec is adding to the ecosystem.
  • Make sure to motivate your requirements
    • For instance, with Each object type MUST have also a "title" keyword. The title may be used by a generator to determine a class name for a schema., if implementations aren't required to use the title, why are schemas required to provide one? There may be a great answer but it's not obvious to me right now :-)
  • I'm not entirely sure what to do with some of the OpenAPI stuff. Their "deprecated" is likely to clash with ours [EDIT: I now think we should go with their version, and if we want something different, we can use a different name], and I seriously doubt "nullable" will ever fly here.
    • It's probably best to leave "deprecated" out and maybe join the conversation over in the main spec repo before trying to reference it here.
    • If you want to file an issue in this repo for "nullable", feel free. [EDIT: I filed nullable vs type arrays (JSON Schema compatibility) OAI/OpenAPI-Specification#1389 for discussion on the nullable concern. But I think it needs debate before going into even the first pass document. From our perspective, things like "type": ["integer", "null"] cover this use case just fine. If you want to restrict the way multiple type values are used, that will probably go over better.
  • It would be better to file the formats over in the spec repo. I've been collecting "format" issues to try to really get to a good set of formats in draft-07. Then you'll get them here automatically.

I think that's plenty for now. Removing the duplication and emphasizing the differences will make this a lot shorter and more focused.

@handrews
Copy link
Contributor

BTW I am not overly concerned with the language being exactly right or typo-free. Let's just get the initial scope and concepts agreed upon, and we can tidy up the language as we go.

@chriskapp
Copy link
Collaborator Author

Ok, thanks for the feedback! I definitive agree with you that we dont need to redefine the keywords in this spec but the problem which I see is: While this spec restricts JSON Schema keywords it is not only about the restriction. I think it is important to state that this spec follows a different mindset, this means that a user must provide specific keywords i.e. a type and you can not mixup constraints (string/integer) etc. So Iam currently not sure howto easily describes this idea but I will take your input and try to improve the spec. The best way is probably to reference the keyword and then describe the "behaviour" from the point of view of this spec.

Regarding nullable yes the spec does not allow array types i.e. "type": ["integer", "null"] since it is complicated to deal with them especially for static typed languages. Because of that nullable is a great option to allow properties to have i.e. a string or null value. Also the spec does not use the null type. If someone really needs different types for a single property it is also possible to use oneOf.

@handrews
Copy link
Contributor

@k42b3 I think you can do all of that without re-stating the entire validation behavior.

Within the restricted set, all validation keywords MUST behave the same as they do in the validation vocabulary. Otherwise it's just not a valid JSON Schema.

I know that OpenAPI made a number of decisions that they felt were the easiest for them, but in order to make a real JSON Schema vocabulary for this, it needs to work with what already exists.

For instance {"type": ["number", "null"]} is equivalent to OpenAPI's nullable, and can be supported while still forbidding things like {"type": ["number", "string"]}

@handrews
Copy link
Contributor

handrews commented Aug 29, 2017

@k42b3 Thanks for reworking the validation keywords. This looks like a great starting point for that aspect- I have some other questions and feedback, but they are better handled as issues filed after we make a decision on this initial pass.

For nullable and description [EDIT: I meant discriminator], I would like to manage them separately as issues rather than include them in the initial document. Both overlap and/or conflict with either current JSON Schema behavior, or features being added for draft-07 (which will hopefully be out by the end of October).

Managing these through issues will let us have a better discussion of where they live (it might be better to modify or extend the validation vocabulary) and how they fit with any other work to align with OpenAPI concepts and/or syntax.

@philsturgeon this may be relevant to your interests.

@chriskapp
Copy link
Collaborator Author

@handrews I could also remove the nullable and deprecated (I think you mean deprecated instead of description) keywords in this initial draft. In general I think this spec and the validation spec are separated so I would be in favor that each spec can define separate keywords (the validation spec defines keywords with the goal to validate json data and this spec with the goal to model data). This spec would then define nullable and deprecated. But of course I can understand that we dont want to have duplicate keywords. If we merge this branch I could create an issues at this repo regarding the nullable and deprecated keyword. Or do you want to handle those at the json-schema-spec repo?

@handrews
Copy link
Contributor

Well I definitely did not mean description WTF :-P
I was actually thinking of discriminator, but actually all three (nullable, discriminator, and deprecated) should be taken out of the draft. There is already an issue for deprecated in the spec repo (Hyper-Schema needs it), so no need to re-file that.

But do please file nullable and discriminator here. We can start the discussion, and move them over if needed.

@handrews
Copy link
Contributor

handrews commented Sep 6, 2017

@k42b3 thanks for the update! There's a lot of activity on the draft-07 spec right now that will probably keep us busy for the next week, but we should be able to give this more attention once the scope is set and we move into the final review phase.

@chriskapp
Copy link
Collaborator Author

@handrews no problem, this gives me some more time to refine the spec.

@chriskapp
Copy link
Collaborator Author

@handrews, I came to the conclusion that we probably need a new keyword to simplify extending an existing object schema. So basically we need a keyword to merge existing schemas together, please take a look at the following spec:

{
	"definitions": {
		"Person": {
			"type": "object",
			"properties": {
				"name": {
					"type": "string"
				}
			}
			"required": ["name"]
		},
		"Student": {
			"$extends": {
				"$ref": "#/definitions/Person"
			},
			"type": "object",
			"properties": {
				"room": {
					"type": "string"
				}
			},
			"required": ["room"]
		}
	}
}

this would result internally in the following schema:

{
	"Student": {
		"type": "object",
		"properties": {
			"name": {
				"type": "string"
			}
			"room": {
				"type": "string"
			}
		},
		"required": ["name", "room"]
	}
}

Sure we would need some rules how the keywords are merged but I think this would heavily simplify the generation of hierarchical schemas. Also this approach would work with "additionalProperties": false since the processor would simply merge those schemas together which is not possible with allOf. So the schemas are not interpreted independently but as one.

My view is biased to the code generation use case but Iam sure this is maybe also useful for the main spec. In the main repo I saw some issues regarding $merge and $extends but Iam not sure whether this is the same functionality. Is there currently an issue regarding such a keyword?

@handrews
Copy link
Contributor

handrews commented Sep 17, 2017

@k42b3

I came to the conclusion that we probably need a new keyword to simplify extending an existing object schema.

This is the single most contentious topic in the entire history of the JSON Schema project, and is the reason the project collapsed after draft-04 and remained stagnant for years. It is the entire focus of draft-08, once draft-07 is out the door next month. There have been, many, many, many, many (x9000) proposals on this, none of which have ever gotten clear traction. It is a MUCH harder problem than you nearly everyone thinks it is (although it may be solvable in the code generation context, more on that further down). [EDIT: The draft-08 issue filed to consolidate all previous issues for this topic has exceeded 220 comments and is not yet resolved, to give an idea of the level of controversy. Although it's getting much closer.]

I'm not trying to be a jerk here, I just want to set very clear expectations: This is not something that will be decided easily or quickly, and it absolutely cannot be part of the initial spec. It is, as best I can tell, the primary reason that almost everyone involved with draft-04 and earlier is gone.

Here's a link to the most recent batch of proposals: https://github.com/json-schema-org/json-schema-spec/issues?utf8=%E2%9C%93&q=is%3Aissue%20label%3A%22re-use%20%2F%20addlProps%22%20 [EDIT: I don't recommend reading any of these, and I removed some more ranting related to these now-outdated issues from this comment.]

However, I mentioned that this may be solvable for code generation. One of the reasons to split code/ui/doc generation out of validation is that the theoretical underpinnings are not compatible. Validation is a constraint system: you add more constraints as you add keywords or combine schemas. It is not possible to remove constraints. In particular, for understandable reasons, data definition users want to set "additionalProperties": false and then later add more properties.

This fundamentally does not work.

What I think might work is that, after restricting the set of validation functionality available in the code generation vocabulary, we may be able to come up with something that works within that restricted set, even though it would not work in the full validation context. That is what I am hoping will happen.

FWIW, I think your proposal is probably closest to the $spread proposal.

@chriskapp
Copy link
Collaborator Author

@handrews thanks for clearing this up, I was not aware that this was/is such an controversial topic.

I could imagine that these proposals are coming from devs with an OOP background. Especially those working with JsonSchema inside OpenAPI. I found this OpenAPI issue which basically proposes the same functionality.

I think the point that JSON Schema is not an object-oriented system is probably the reason for this confusion since many devs want to use it that way. So I think the best solution is like you described to provide a solution within the code generation spec. Maybe this could help to bridge those two worlds.

If this branch gets merged I plan to create an issure regarding the nullable and discriminator keyword. Maybe I will also add an issue regarding this. But I will also try to go through the proposals you have linked, but this may take some time ;)

@handrews
Copy link
Contributor

@k42b3 once the draft-07 work is mostly done I'm going to consolidate all of those into two or three summary issues. I wouldn't bother with them unless you're really into reading people arguing endlessly on the internet. And even then, there are more entertaining examples out there :-P

@handrews
Copy link
Contributor

handrews commented Sep 18, 2017

[EDIT: Sometimes I'm petty and annoying, like in this comment :-) I'm leaving it up as a reminder to think more before complaining about other people's work.] I admit, I still don't understand anyone's obsession with nullable, though. OpenAPI included. It's a solved problem in JSON Schema, adding a 2nd way to solve it is guaranteed dead on arrival.

@chriskapp
Copy link
Collaborator Author

I think the problem is here that the solution requires the usage of the array type i.e. "type": ["string", "null"] and allowing multiple types for a single property is only possible in dynamically typed languages like i.e. javascript but in languages like java it is difficult to represent.

@handrews
Copy link
Contributor

I think the problem is here that the solution requires the usage of the array type i.e. "type": ["string", "null"] and allowing multiple types for a single property is only possible in dynamically typed languages like i.e. javascript but in languages like java it is difficult to represent.

I don't really follow. The idea would be that

{
  "type": ["string", "null"]
}

would be handled exactly like

{
  "type": "string",
  "nullable": true
}

is now. They do exactly the same thing, except the first is valid JSON Schema.

The restricted vocabulary can restrict the use of the array form of type by saying something like: "MUST be a string or a two-element array. If an array, the second element MUST be "null"

@handrews
Copy link
Contributor

@k42b3 while it may seem like we've forgotten about this PR, it's actually been informing a lot of my thought around where to go with draft-08 (and probably draft-09, as I doubt we'll get it all done at once). It also affected some things in draft-07.

draft-07 changes

  • readOnly and writeOnly are now in the Validation spec's annotation (formerly metadata) section (deprecated got bogged down, but we'll revisit it soon)
  • Keywords are classified into Applicators, Assertions, and/or Annotations (the categories are not mutually exclusive). Some of this terminology also made it into the core spec.
  • Annotation keywords now have well-defined behavior for combining multiple values, and we can define additional behaviors as needed

draft-08 discussions

As noted in my last big comment, we are looking at re-usability / extensibility. I consolidated the old issues into a new one- that already went a bit off the rails due to the extremely divisive nature of the topic, so we're taking a step back and regrouping.

However, I have referred to draft-08 as a room that we're all locked in until that gets decided. I have emailed the members of the OAI technical steering council who have notably commented on JSON Schema (here or in the OAI repo) to get their input if they are interested in providing it.

Issues in the OAI repository

  • I've filed nullable vs type arrays (JSON Schema compatibility) OAI/OpenAPI-Specification#1389 regarding "type": ["...", "null"] vs nullable, and it has so far attracted positive comments. No word from the TSC yet.

  • I need to spend more time with discriminator to understand how it fits with draft-07's if/then/else

  • I'm working through writing an OAI 3.0 document for an existing API, which will help me understand the task here a bit better. That will probably produce more issues in one repo or the other.

@chriskapp
Copy link
Collaborator Author

@handrews sounds great!

I think the ["(type)", "null"] type is a good idea since this restrict invalid types i.e. ["string", "integer"]. Of course this array notation could lead a user to misuse the array but I think it is a good compromise. So let us wait and see what the OAI guys are thinking about this.

So if the nullable case gets covered by the core spec we only need to handle the deprecated and discriminator keyword. As you said the deprecated keyword could be simply defined as annotation keyword. Regarding the discriminator, it is basically needed to pick the correct type for an oneOf array. I.e. taken from the OAI spec:

MyResponseType:
  oneOf:
  - $ref: '#/components/schemas/Cat'
  - $ref: '#/components/schemas/Dog'
  - $ref: '#/components/schemas/Lizard'
  discriminator:
    propertyName: pet_type

if the pet_type contains Cat we use the Cat schema. Through this way we can clearly select a specific schema even if the properties are not distinct.

I have just re-read the proposal and I still think that it could be very useful for some users. So please let me know if you want to start digging into more details i.e. improve the formulations etc.

And by the way thanks for all the effort that you put into the JsonSchema spec.

@handrews
Copy link
Contributor

@k42b3 my difficulty with discriminator is that it goes way outside the current processing model of JSON Schema, in particular by making the names of schemas under components significant.

In JSON Schema, this is how that example would be done:

MyResponseType:
  anyOf:
  - if:
      properties:
        pet_type: {const: cat}
    then:
      $ref: '#/components/schemas/Cat'
  - if:
      properties:
        pet_type: {const: dog}
    then:
      $ref: '#/components/schemas/Dog'
  - if:
      properties:
        pet_type: {const: lizard}
    then:
      $ref: '#/components/schemas/Lizard'

While it's a bit more verbose in this simple case, it is also much more flexible. Despite the anyOf, the conditionals ensure that at most one option will be matched. More importantly, it does not start interpreting arbitrary JSON keys as value matchers.

@handrews
Copy link
Contributor

Basically, I'd like to give if/then/else some time in the wild to see if it will be suitable on its own.

@handrews
Copy link
Contributor

handrews commented Oct 9, 2019

@chriskapp well, here we are nearly two years after my last reply, and more than two years after you filed this.

"draft-08" (ultimately named 2019-09) has been published, with a new vocabulary concept included. As seen in Appendix E of the Core spec, the thought on how to achieve use cases such as code gen has moved towards annotation keywords to disambiguate things rather than restricted keyword semantics.

For that reason I'm going to close this. It was excellent work that drove a lot of thought on the developmen of draft-08 (both in terms of what problems we needed to solve, and in terms of what difficulties existed in the ideas that were prevalent when you wrote this). But I think it is now best for everyone to take a step back, digest the new draft, and start afresh.

Thanks again for the work you put in on this, and my apologies for the extremely long timeline here. Please let me know if you have any concerns about me closing this PR.

@handrews handrews closed this Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants