Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: OpenAPI Backend #16042

Closed
jyggen opened this issue Jul 3, 2022 · 10 comments
Closed

RFC: OpenAPI Backend #16042

jyggen opened this issue Jul 3, 2022 · 10 comments
Assignees

Comments

@jyggen
Copy link
Member

jyggen commented Jul 3, 2022

OpenAPI Backend

I've been thinking about adding a backend for OpenAPI definitions to Pants, but before I get too carried away with coding it I wanted to "throw it out there" and gather some feedback and do some bike-shedding regarding target names and so on.

The main beneficiary of this would be people who take a design first approach when building their APIs, people doing code first usually don't have their generated documents in-repo anyway - but if they do they would probably find some use for this as well. The idea was born while I was tinkering with a Spectral plugin at work and realized that most of the code needed wasn't specific for Spectral but for OpenAPI files, and thus could be used to much easier integrate other tools from the rather big OpenAPI ecosystem.

So, without further ado, here are my thoughts and ideas how the OpenAPI backend would look. I've got the basics done already, so expect at least a draft MR in the near future.

Targets

The plan is to introduce two new targets; openapi_definition and openapi_source. openapi_definition will be used to specify the main OpenAPI document and openapi_source for any JSON/YAML file referenced in the openapi_definition (or another openapi_source) using $ref and the main OpenAPI document itself.

The reason why two targets are introduced is because most (if not all) tooling around OpenAPI expect one file as input (the main OpenAPI document) and will do reference resolving themselves and include other files as needed - so having a clear openapi_definition makes it easier for goals to work as expected out of the box instead of having to add skip_foobar_lint to all targets that aren't the main file.

❓ Is openapi_source a good name for "non-spec" files? openapi_library was my first option, since the file is basically a library of schemas you reference in your API definition, but openapi_source felt more in line with the rest of Pants.
openapi_document instead of openapi_definition? Naming things is hard.

Dependency Inference

Dependency inference between openapi_definition and openapi_source targets should be achievable by simply resolving at all $ref keys that have relative file paths.

$ref can also be external URLs or absolute file paths, but I'm not sure if we can/want to resolve these in a good way? It does lead to cases though were these files could change and Pants won't notice. I have a hunch that absolute file paths are quite rare though.

Goals

tailor

Support for tailor should be relatively straight forward by finding openapi.json or openapi.yaml files and add them as a openapi_definition. Once dependency inference is working, the same $ref resolver can also be used to add openapi_source automatically based on the openapi_definition's dependencies (and recursively add even more openapi_source based on dependencies' dependencies).

lint

lint will validate the OpenAPI definition and make sure it's valid, most likely using OpenAPI Spec validator. I also plan to add support for Spectral in a separate backend.

❓ The plan is to ship the basic validator as part of the main OpenAPI backend. Any arguments for having it as its own?

check

Not sure if it goes under check or lint, but currently at @snowfall-travel we use OpenAPI Diff outside of Pants to detect breaking changes in our OpenAPI definitions in CI. This would be a nice addition as well that I might tackle after everything else is done (as a separate backend).

fmt

OpenAPI documents are simple JSON and YAML files, so one way to do formatting would be to load the file with json or pyyaml (these will be required to parse the files for dependency inference anyway) and then dump the contents back using whatever "pretty print" way is available in the libraries.

❓ Will this actually be useful? It would for me!
❓ If it's useful, the same question as the one lint applies, separate backend or not?

package

Not something I've planned though since the idea hit me just now while writing this RFC, but maybe! The tooling in the OpenAPI ecosystem is not always great at handling specifications split over multiple files, which is why there are tools around to "dereference" an OpenAPI document (aka resolving all $ref and inlining them instead). So maybe package could do that at some point? Because while it's usually easier and "DRYer" to split the specification into multiple files, it's also a lot nicer to be able to share one file with people rather than a whole bunch of them.

export-codegen

While I don't plan on adding it personally (at least not in the foreseeable future), something like OpenAPI Generator could be used to generate client and server code based on the OpenAPI document and then hook it into other Pants stuff (similar to the protobuf codegen).

@jyggen jyggen self-assigned this Jul 3, 2022
@alonsodomin
Copy link
Contributor

I very much welcome an OpenAPI backend. So much that I myself did a PoC on this topic although I was a bit more focused around the code generation (and in fact using the same tool you mentioned)

Regarding the target types, i kind of find your openapi_source target a bit redundant. Mostly saying this because it would basically be a glorified file (or resource) target. In the end, the OpenAPI spec doesn't really care what kind of schema your supporting files follow as long as the external $refs in the OpenAPI document point to a valid path inside the given file.

So I would consider changing the openapi_source target to just be a file or resource and then renaming the openapi_definition to openapi_sources for consistency with other targets and backends. Not sure if I'm oversimplifying it.

Your tailor implementation based on convention for the filename sounds reasonable to me.

@jyggen
Copy link
Member Author

jyggen commented Jul 20, 2022

Thanks for the feedback! That's fair. I did think about just using resource and file but decided against it for whatever reason - but since they're just JSON and YAML files we should get away with only having a openapi_source for the main file.

@alonsodomin
Copy link
Contributor

alonsodomin commented Jul 20, 2022 via email

@jyggen
Copy link
Member Author

jyggen commented Jul 26, 2022

The tricky part with using file and resource is dependency inference. The first level is fine, aka. when an openapi_source depends on a file or resource, but as soon as that file or resource depends on another file or resource we're suddenly dealing with OpenAPI specific dependency inference on a generic target type. Not sure how to handle that.

@alonsodomin
Copy link
Contributor

Not fully sure what is meant by OpenAPI specific in the context of dependencies... wouldn't obtaining the TransitiveTargets be enough to traverse the dependency graph?

@jyggen
Copy link
Member Author

jyggen commented Jul 27, 2022

It could very well be me not fully understanding dependency inference just yet, but here's how I see it:

Let's say we have three files: a.json, b.json and c.json with the dependency chain a -> b -> c (through $refs).

  • a.json is our openapi_source and we can easily find its dependency to b.json through its $ref with the dependency inference logic for openapi_source.
  • b.json and c.json are both file, so in order for b -> c to be inferred we need to either:
    • Add dependency inference for file and resource for JSON and YAML files (using the same $ref resolve as openapi_source).
      • This is what I called "OpenAPI specific" (though it would also work for JSONSchema etc.) since that dependency inference won't work on all file and resource.
    • Have the dependency inference for openapi_source go deeper so a.json ends up with direct dependencies on both b.json and c.json.
      • This will make the dependency graph wrong, causing ./pants dependencies and ./pants dependees to be unreliable (and, more importantly, --changed-dependees=transitive).
    • Some method I'm unaware of that allows Pants to build the entire chain from openapi_source alone.

@alonsodomin
Copy link
Contributor

I think I get it now, using files or resources the end user will always have to explicitly provide the dependencies for the b -> c part of the graph.

I was tempted to mention that wouldn't be such a common case but using those targets either forces the users to glob all potential OpenAPI sources into the same files target or, as said before, to be explicit about them. And none of them sound nice.

So yeah, it seems that to have an implementation of dependency inference that feels complete, it seems that two targets are what is needed. Sorry for making you waste time exploring the possibility of using more generic target types.

@cognifloyd
Copy link
Member

This sounds like it would support connexion projects nicely.
https://github.com/spec-first/connexion

Lint with openapi spec validator to begin with sounds good.

And if pants could ease the pain of getting OpenAPI generator setup, then I might be more likely to use it (I avoid java as a rule).

I'm not a fan of openapi_definition because "definitions" is a part of an OpenAPI/swagger 2 spec. https://github.com/OAI/OpenAPI-Specification/blob/main/versions/2.0.md#definitionsObject

I think I would use openapi_document or openapi_doc for the primary file. This plays off the how that is defined in the spec:
https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.1.0.md#openapi-document
And the 2.0 spec seems amenable to the term "document" as well.
https://github.com/OAI/OpenAPI-Specification/blob/main/versions/2.0.md#swagger-object

Another possibility would be openapi_spec since they are often referred to as specs.

For the supporting files, openapi_source makes sense.

@jyggen
Copy link
Member Author

jyggen commented Jul 27, 2022

Sorry for making you waste time exploring the possibility of using more generic target types.

No worries! I was a bit conflicted about it already, so it was worth giving it another thought :)

I'm not a fan of openapi_definition because "definitions" is a part of an OpenAPI/swagger 2 spec.

Thanks for the feedback! Yeah, I agree that openapi_document is probably better than openapi_definition. I'll update the PR.

stuhood pushed a commit that referenced this issue Sep 8, 2022
This PR aims to add an initial backend for OpenAPI documents as outlined in #16042, most thoughts behind the PR is there but here's a quick rundown:
- `openapi_definition` and `openapi_source` targets (and `openapi_definitions` and `openapi_sources`)
  - `openapi_definition` is for the main OpenAPI document and `openapi_source` is also for that file as well as for any JSON/YAML file referenced by it (and files referenced by those files and so on)
  - this was done since most tooling only cares about the main document and will resolve references themselves, and then it's easier for rules to simply target `openapi_definition` instead of having to go through all `openapi_source` and guess the entry points
- dependency inference for `openapi_definition` is simply checking if there is a `openapi_source` target on the same file
- dependency inference for `openapi_source` is done by following all `$ref` in the file that points to local files
- tailor will first find all `openapi.json` or `openapi.yaml` files not owned and create `openapi_definitions` in those dirs, then do dependency inference on those files and create `openapi_sources` for unowned files (as well as for the `openapi.json` or `openapi.yaml` files)

There's probably some tweaking in naming and behaviour that needs to be done based on discussions (either here or in #16042). The difficult part is that OpenAPI documents are simply JSON and YAML files so, unlike most other backends, you can't assume that all files with a certain extension is relevant, which is especially tricky with `tailor`.

[ci skip-rust]
[ci skip-build-wheels]
@stuhood
Copy link
Member

stuhood commented Oct 12, 2022

Thanks a lot @jyggen!

@stuhood stuhood closed this as completed Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants