Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Vtctld Service #7058

Open
ajm188 opened this issue Nov 19, 2020 · 6 comments
Open

RFC: Vtctld Service #7058

ajm188 opened this issue Nov 19, 2020 · 6 comments
Assignees

Comments

@ajm188
Copy link
Contributor

ajm188 commented Nov 19, 2020

(I'm working on a sharable google doc form, for folks that would prefer to comment inline)

Abstract

This document presents a design to define a well-typed gRPC interface to the vtctld component of Vitess.

Background

The current vtctl API is a gRPC interface defining a single method:

service Vtctl {
    rpc ExecuteVtctlCommand(ExecuteVtctlCommandRequest) returns (stream ExecuteVtctlCommandResponse) {};
}

message ExecuteVtctlCommandRequest {
    repeated string args = 1;
    int64 action_timeout = 2;
}

message ExecuteVtctlCommandResponse {
    Event event = 1;
}

// Extracted from logutil.proto
message Event {
    // other fields omitted
    string value = 5;
}

Note: this is a modified example to fit in one snippet.

This interface gave us the flexibility to add, modify, and remove commands quickly, without needing to worry about adherence to a strict protobuf API definition.

It now poses several problems.

First, from the client perspective, this is effectively an untyped interface. Unlike other gRPC interfaces, the first element of ExecuteVtctlCommandRequest.args determines the underlying method call and the shape of the response data. We lose the benefit of protobufs defining the structure on both sides of the network boundary, and must parse arbitrary, sometimes-but-not-always JSON-formatted bytes into data structures before we can do anything with them. The list of available methods and their arguments, as well as the shape of their responses may change at any time, without any feedback or warning from the Go compiler. Further, some of the current vtctl methods use custom MarshalJSON implementations, without providing the inverse UnmarshalJSON, forcing the client to define nearly-identical types to provide this behavior. These types must be kept in sync by the client, causing additional maintenance burden.

Second, on 07/24/2020, VEP3: Vitess Upgrade Order removed the upgrading ordering requirements, beginning with version 8.0. VEP3 specified a new requirement to facilitate this change: “a new version of a server must work with an old version as well as the current version.” Due to other problems mentioned in the above paragraph, this is difficult, if not impossible, to guarantee at compile time for the vtctl API.

Goals

  1. Provide a well-typed gRPC API for writing client programs against a vtctld.
  2. Define a path to deprecate the former API.

Non-Goals

  1. Modifications to the existing API, or to its consumers.

Proposal

We propose to add a new service to the existing vtctldservice.proto file, called Vtctld that will live alongside the Vtctl service.

Design

The service will provide one RPC definition for each supported vtctl command (GetKeyspaces, GetVSchema, Backup, VReplicationExec etc). Any additional message definitions will be defined in vtctldata.proto, unless a more appropriate proto file is identified.

When adding an RPC to VtctldServer, the corresponding legacy implementation should be updated to invoke the new implementation under the hood, and then invoke the usual printJSON(wr.Logger(), resp). For a proof-of-concept of this, see branch am_vtctld_proto.

A few things to point out in that example:

  • A VtctldServer implementation does not take a Wrangler, instead it has its own TopoServer and TabletManagerClient fields. We discuss this further in Other Considerations.
  • A Wrangler will eventually include a *VtctldServer to reduce allocations.

package vtctlclient

Package vtctlclient uses a factory pattern to return an implementation of VtctlClient. Clearly, this does not work with the new interface defined by the Vtctld protobuf.

Instead, we introduce a new interface defined as:

type VtctldClient interface {
    vtctlservicepb.VtctldClient
    Close() error
}

From there, we add a new factory and constructor:

var vtctldClientProtocol = flag.String("vtctld_client_protocol", "grpc", "the protocol to use to talk to the vtctld server")

type VtctldFactory func(addr string) (VtctldClient, error)

var vtctldFactories = make(map[string]VtctldFactory)

func RegisterVtctldFactory(name string, factory VtctldFactory) {
        if _, ok := factories[name]; ok {
                log.Fatalf("RegisterVtctldFactory: %s already exists", name)
        }
        factories[name] = factory
}

func NewVtctldClient(addr string) (VtctldClient, error) {
        factory, ok := factories[*vtctldClientProtocol]
        if !ok {
                return nil, fmt.Errorf("unknown vtctld client protocol: %v", *vtctldClientProtocol)
        }
        return factory(addr)
}

Identical with the current layout, fakevtctlclient and grpcvtctlclient will provide implementations and register factories for those implementations. vtctlclienttest will adapt the current test suite to the new interface.

Compatibility Considerations

  • The current vtctl server is used by the command line and (next bullet point, indirectly).
  • The current vtctl client is used by vtworker (legacy), “automation”, and vttest.

We will need to update vtworker, automation, and vttest to use the new vtctldclient. An outstanding question for me is if we can ignore vtworker, or if that needs to be updated as well.

Timeline and Rollout

A codebase with two competing ways of accomplishing a task can be frustrating to work in at best, and risky to work in at worst. We want to minimize the amount of time in this state. To accomplish this, we will declare the VtctldServer interface to be unstable until it has met parity with the VtctlServer. We can consider the two at parity when all legacy vtctl commands defer their logic to the corresponding VtctldServer methods.

What “unstable” means in this context may be unorthodox, so:

  • Direct external clients of VtctldServer do so at their own risk; we reserve the ability to change the shape of the API at any point during the unstable period.
  • When changing the shape of the VtctldServer API, we will perform the necessary changes to ensure indirect clients do not break.

The distinction here is that an engineer writing an arbitrary external program against the VtctldServer should not trust the stability of the interface during the transition, but when changing any particular VtctldServer method, we will update any vtctl commands that call that method to ensure correct behavior from a blackbox perspective.

We aim for this period to take less than two months.

After reaching parity, the VtctldServer API will be declared stable and become subject to all the standard release cadence and compatibility standards of the Vitess project. At the same time, we will declare the VtctlServer API to be deprecated.

Other Considerations

  1. We choose to make VtctldServer take its own TopoServer and TabletManagerClient for interacting with the topo, and to have Wrangler take a VtctldServer to have calls flow in that direction. Alternatively, we could force VtctldServer use a Wrangler to access the topo. This would look like:

    type VtctldServer struct {
        wr *wrangler.Wrangler
    }
    
    func (vtctld *VtctldServer) GetKeyspace(ctx context.Context, req *vtctldatapb.GetKeyspaceRequest) (*vtctldatapb.Keyspace, error) {
        keyspace, err := s.wr.TopoServer().GetKeyspace(ctx, req.Keyspace)
        if err != nil {
            return nil, err
        }
    
        return &vtctldatapb.Keyspace{
            Name: req.Keyspace,
            Keyspace: keyspace.Keyspace,
        }, nil
    }

    I have no strong opinion. At first glance, giving VtctldServer direct access to the topo server feels simpler than forcing everything through the wrangler, but this is a very minor point with respect to the overall proposal. If going through wrangler is indeed better, then we will switch to that.

  2. We chose to create an entire second interface rather than progressively add typed methods to the existing interface. We spent an afternoon exploring this alternative and found that the number of types and mocks that would need updating each time a method was added wasn’t worth the trouble of keeping the vtctl name. The upside here is getting to move faster; the downside here is that we have to pick a new name, hence vtctld.

@ajm188 ajm188 added the Type: RFC Request For Comment label Nov 19, 2020
@derekperkins
Copy link
Member

I'm very excited for this. We've avoided some automation simply because the old "api" was so fickle to deal with. It has also held back innovation on the Vitess dashboard & UI.

@deepthi
Copy link
Member

deepthi commented Dec 3, 2020

This is very detailed and thorough.
What happens to the existing vtctlclient binary? Do we deprecate that along with VtctlServer?

@ajm188
Copy link
Contributor Author

ajm188 commented Dec 3, 2020

Yeah, we definitely want to deprecate that. Correct me if I'm wrong, but I think to maintain compatibility guarantees the ordering would have to be:

  • Release 1: deprecate client
  • Release 2: remove client, deprecate server
  • Release 3: remove server

@deepthi
Copy link
Member

deepthi commented Dec 4, 2020

VEP-3 specifically deals with components of a running vitess system. vtctlclient is not exactly part of a running cluster. We would be ok to think of it as being part of vtctld and deprecating it along with the server. If anyone objects we can re-consider.

@rohit-nayak-ps
Copy link
Contributor

Design and implementation plan look great. I also really like the fact that this uses cobra: allows us to use bash completion, auto-suggest and sub-commands etc etc. I had tried to retrofit cobra but we decided it wasn't worth the effort at that time. Also a useful effect is that a lot of deprecated and obsolete commands will get dropped.

@deepthi deepthi assigned ajm188 and unassigned deepthi Dec 4, 2020
@ajm188
Copy link
Contributor Author

ajm188 commented Dec 5, 2020

I had tried to retrofit cobra but we decided it wasn't worth the effort at that time.

Yeah, revisiting flags more holistically is uhhhh another thing I would be interested in doing at some point 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants