Skip to content
This repository has been archived by the owner on Jun 4, 2024. It is now read-only.

teleport plugins release process #440

Closed
wants to merge 8 commits into from

Conversation

Joerger
Copy link
Contributor

@Joerger Joerger commented Mar 4, 2022

  • Create an RFD to track the teleport-plugins release process and add steps which are currently missed.
  • Add a new update-api-version make target to update the Teleport API dependency
  • Update update-version make target to call update-api-version

@Joerger Joerger requested a review from zmb3 March 4, 2022 01:47
Add update-api-version and release-version make targets.
@Joerger Joerger force-pushed the joerger/teleport-plugins-release-strategy branch from 7c7dff0 to d9c9c74 Compare March 4, 2022 01:50
Makefile Outdated Show resolved Hide resolved
rfd/0002-teleport-plugins-versioning.md Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@Joerger Joerger requested a review from r0mant March 8, 2022 18:13
Copy link
Contributor

@r0mant r0mant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm to me in general but let's get a 2nd pair of eyes from e.g. @zmb3

Makefile Outdated Show resolved Hide resolved
@Joerger Joerger marked this pull request as ready for review March 8, 2022 19:21
Comment on lines +47 to +49
To avoid delaying releases, a PR should be made as soon as `[email protected]`
is released to update the Plugins version to `vX.Y.Z-beta.1`. Any issues can then
be resolved well before `[email protected]` is released, and updated afterwards.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, a new Plugin's release brings a lot of changes directly into master
Those changes were not tested in the the Plugin's context
So, when we add them, things can (and do) break

As an example, there's a terraform test for the SAML connector
The teleport's validation was changed and now there's a required field
(It's not my goal to discuss the change itself, but the fact that we now require a certain field)
When we upgrade the teleport version used in tests (currently 9.0.4) this test will break

The fact that we had a breaking change means we:

  • need manual work when running the release script
  • need to have context on the failing change, in order to fix it

We can solve this in a more holistic way, by requiring PRs in teleport to have the Plugin's state into consideration - using a monorepo
Continuing the example, the PR above would only be accepted if the author changes the terraform test as well
The downside is that this will add extra burden when developing a new Teleport's feature
As a side effect, we would only have a single release point in time, and backports would be "free"

I'm fine with your suggested approach, because I understand the negative impacts the monorepo approach will have in the long run.
Nonetheless, have we considered merging the teleport-plugins repo into the teleport one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, a new Plugin's release brings a lot of changes directly into master Those changes were not tested in the the Plugin's context So, when we add them, things can (and do) break

On the flipside, we've had things break from not updating the API module, terraform schema, etc. See #431 for example, which is what led me to start this RFD.

The fact that we had a breaking change means we:

  • need manual work when running the release script
  • need to have context on the failing change, in order to fix it

We can solve this in a more holistic way, by requiring PRs in teleport to have the Plugin's state into consideration - using a monorepo Continuing the example, the PR above would only be accepted if the author changes the terraform test as well The downside is that this will add extra burden when developing a new Teleport's feature As a side effect, we would only have a single release point in time, and backports would be "free"

I'm fine with your suggested approach, because I understand the negative impacts the monorepo approach will have in the long run. Nonetheless, have we considered merging the teleport-plugins repo into the teleport one?

I think that expanding the Teleport repo to include more submodules similar to teleport/api could be useful and has not yet been explored. I agree that this would be a much more holistic and effective approach for us at Teleport. However it diminishes part of the idea behind the plugins, which is to make it an "independent" project which serves as an example to customers who want to develop their own plugins. But if we value the teleport-plugins as a product more than the api at large, then this may be a worthwhile approach.

I think another way to approach this issue holistically would be to ask how we can make it easier for third-party developers to use the Teleport API more easily, including ourselves. Adding tests to the Teleport Repo which checks for compatibility with teleport-plugins would solve the problems listed above, but would not solve all of the possible issues for users attempting to create and maintain their own custom Teleport plugin. Instead we could make the API more durable to breaking changes in Teleport, and add tests to verify this durability.

These proposed durability changes could include things like:

Refactoring our gRPC service / protobuf files.

Right now our proto files are hard to import directly and are reliant on an abandoned tool gogo-proto to compile. This leads to difficulties with external tools like our terraform provider which rely directly on the protobuf schema.

We may be able to use Buf to alleviate this issue, though I don't know much about it ( cc @codingllama if you have anything to add, or correct, about Buf). Buf would make it easy for users to directly import our protobuf schema, check for breaking changes when updating, etc. We could also use this to expand the API to other languages and develop an SDK.

Reorganizing the API module into separate modules

Currently we export types, constants, gRPC types, other functions, etc., but the only thing which we protect with our versioning guidelines is the gRPC service itself, and the client methods in teleport/api/client.go. As such, it would make more sense to move the client and gRPC service into their own module(s) (teleport/client), and move the types, constants, defaults, credential logic, etc. in to a separate module (teleport/api or teleport/external?) which the new client module would depend on. This way, when a user upgrades teleport/client they won't have to worry about any changes to types etc that may have occurred (ex: types.RoleV4 -> types.RoleV5).

Deprecation warnings

We could provide the user with descriptive deprecation warnings so that they aren't surprised when something is no longer working. For example, rather than making that sso field a required field without any explanation, we could have done it with a deprecation step for one major version saying Starting in Teleport v11.0.0, this field will be required, please add it in your resource spec.

Improving test coverage

Right now we don't have any automatic tests in place to guarantee that the API follows the version compatibility guidelines. For example, our recent v9.3.4 release actually did not have an httpfallback in place for GetAccessRequests which underwent some changes. We could set up tests to ensure that every API request performs properly with the previous, current, and next major version of Teleport. In this case we could have a test for v8.0.0, v9.0.0, and v10.0.0. Once we have this setup I imagine it'd be very easy to add additional tests for new/updated endpoints. We also just need this for out Teleport automatic tests in general.

Conclusion

In the short term I think we should at least go forward with formalizing this release process, which is essentially the current one which often requires manual work. This will at least put the onus of breaking changes on us at Teleport rather than users of teleport-plugins.

In the long term, I think we should invest more resources into developing an API which is widely usable, easy to upgrade, and extensible to more use cases.

@zmb3 Any thoughts on this?

Copy link

@codingllama codingllama Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me preface this by saying that I don't have experience with this repo, but you pinged me about protos and I can't resist that. :)

Before I get started, in the subject of optional-made-required field, I'm very much in the camp of "never break forward compatibility". I lack the knowledge to suggest an alternative here, but I'm rather certain that there is one.

<begin_proto_refactors>

As for protos, from the top of my head, these are some of the changes I'd like to see:

  1. +1 to no gogo. Google's prod is a monster and their core library teams are excellent, they will catch up and eventually surpass alternatives. I'm very wary of libraries that market themselves as the "better protobuf".
  2. Use a consistent proto import path and organize our files accordingly. "github.com/gravitational/teleport/api/types/types.proto" is not what the import for "package types" should be. Google protos are a good reference here.
  3. Move proto files to a single root, possibly free of .go files. This makes importing more intuitive and allows folks to cleanly vendor/submodule/etc and easily generate clients other than Go.
  4. Better package names. types and protos makes me sad. In this subject, it's often a good idea to version APIs via package name - for example, teleport.auth.v1 is a world better than protos and we have leeway to break API compat (although I think this is very much a "Go 2" situation, as in it's better to never be).
  5. Version protos outside of Go Modules. (See item above.)
  6. No attached Go methods to protos. Protos are transport, validation is the realm of API implementations.
  7. Smaller files
  8. Standard proto formatting, consistent style and a style linter.

There's probably more, but it escapes me now.

Buf does help IMO, we desperately need a consistent style/linter, plus it brings other good properties (such as reproducible builds). In terms of proto pain, though, I think 2. and 3. are the toughest villains of our repo.

Changing import paths is a breaking change, albeit a small one, but if we are willing to take it we could make a much more palatable Buf setup. 4., 5. and 6. will break things. 7. should be transparent (but much easier with 2. and 3., surprisingly difficult without). 8. would be easy if we could split our files without too much trouble, as we could apply stricter requirements to new files (and avoid additions to old ones).

That's it for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding teleport-plugins as source of inspiration for the community
I think it's a valid point and it's worth to keep this separate from the main repo

I do have a concrete suggestion:
Move Terraform Provider into the main repo
Access Plugins and Event Handler remain here.

It seems most of the issues that hit us in the past were mostly with the Terraform Provider (sorry if I'm making a wrong assumption)

When we were evaluating whether the K8S Operator should be here or in the main repo, we concluded that it makes sense for it to be in the main repo
It's too coupled to the Teleport API: when it changes, we must also change
And the same goes for the Terraform Provider, it's a living project that must keep up with all the API changes we do in Teleport

This is not true for the Access Plugins: they can exist for a long time without being affected with Teleport developments

(The event-handler should be quite resilient in that area as well, however I'm not entirely sure about it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you're right that the major issues we're encountering are only with the Terraform provider, and after looking at the K8s Operator, I do agree that we could/should move the Terraform provider into the main repo. Thanks for taking the time to read my ramblings and return it to a concrete solution 👍

@Joerger Joerger requested review from r0mant and marcoandredinis June 9, 2022 19:58
@Joerger
Copy link
Contributor Author

Joerger commented Jul 15, 2022

Closing since this is stale

@Joerger Joerger closed this Jul 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants