- 2024-02-14: Initial draft
Currently, Rust binding code for CometBFT proto files, which makes up most of
the cometbft-proto crate, is generated with a custom tool found in tools/proto-compiler,
working on a Git checkout of the cometbft repository where the proto files are maintained.
Instances of protobuf well-known types (WKTs) are represented with custom types
defined in the crate, due to historical problems with prost-types.
Additional trait implementations are derived for some of the generated
message structures, mainly to provide JSON serialization for RPC and possibly
other non-Protobuf uses. This is done with custom attributes added through code
generator options.
While separating the unrelated concerns, e.g. by implementing the traits directly on
the domain types in the cometbft crate, would be cleaner architecturally, deriving them
on the protoc-generated types is convenient because the protobuf and RPC
implementations in the CometBFT Go codebase reuse the same data structures and,
therefore, the same serialization schema. The different versions of protobuf specifications
across past CometBFT releases are only fully exposed in cometbft-proto,
and some library users might prefer to work at this level and eschew our domain types,
so it seems prudent to maintain version-accurate serialization as a secondary
feature of the protobuf bindings.
Rust projects which reuse protocol types from CometBFT and generate code from
proto files that include tendermint.* or cometbft.* proto packages as a
dependency currently use the same approach, with some local arrangements about
mapping the WKTs and partial reuse of the serde implementations.
This results in problems in dependency management that
are easy to overlook, and duplication of types representing the protobuf WKTs.
Switch to buf for generating cometbft-proto Rust
bindings for the CometBFT protos that are published to the Buf Schema Registry
under the cometbft/cometbft module.
Attempt to make use of the following plugins:
- protoc-gen-prost to generate the core data structures.
- protoc-gen-prost-serde (or our fork targeting informalsystems-pbjson) to generate the bulk of the serde implementations for the same data structures.
- protoc-gen-tonic for gRPC services.
If the plugins are found lacking necessary features, try an alternative approach:
run buf build --as-file-descriptor-set and feed the output to tonic-build. The serde
implementations would need to be custom-derived, as in the current approach.
The prost and tonic plugins invoke prost-build and tonic-build, respectively,
so the code they produce should not differ in principle from the output of the
current proto-compiler tool.
The prost-serde plugin produces code depending on pbjson. This is
an implementation detail not affecting the public API of the generated code,
which exposes the regular serde trait implementations. The serialization schema
produced by pbjson-build is compliant with the JSON mapping for Protobuf,
to which the marshalling used by encoding/json in Go is mostly equivalent;
notably, though, the JSON serialization currently used in CometBFT is implemented
differently.
Our attempt to use the protoc-gen-prost-serde plugin will aim to provide serialization
for most of the types generated from proto files; exceptions will be handled by
passing custom attributes to prost-build via the buf plugin, much like in the
current approach. A current limitation is that the exclude option of
pbjson-build, needed to enable such overrides, is not yet supported
by the buf plugin.
Another obstacle is that the pbjson crate does not currently support no_std.
Support for no_std is needed for ibc-proto bindings if they are to reuse types from cometbft-proto.
The developers of ibc-rs have already forked pbjson, so
a short-term fix would be to fork the buf plugin as well to make the generated code
use the forked informalsystems-pbjson crate (plus, we get to support the exclude
option at our own pace this way).
In the long term, we should try to upstream the no_std changes to pbjson
and eliminate the forks.
Where the CometBFT proto files make use of the well-known message types from
the google.protobuf package, equivalent Rust types need to be substituted
in the code generated for cometbft-proto. Developers of downstream crates
may be interested in converging on the same types to avoid duplication.
The CometBFT protobufs only use the Duration and Timestamp WKTs; protobufs
in IBC, Cosmos-SDK, Interchain Security, and NFT transfer also use Any.
Three alternative WKT mapping approaches have been used across the protocol stack:
- prost-types. This is the most common mapping used by code generated with prost-build. The crate does not provide serde implementations for the WKTs, so any message structs including fields of WKTs either need overrides for their serde implementation or must omit serde altogether.
- pbjson-types. The serde implementations provided by this WKT mapping crate are compliant with the standard protobuf JSON mapping. However, the JSON schema used by CometBFT does not use the same format for Duration, so the crate cannot be substituted for all uses. Note that the decision on whether to map WKTs to pbjson-types is orthogonal to the use of the protoc-gen-prost-serde buf plugin for the protocol messages; the build script can opt to use either or both.
- Roll-our-own types defined in cometbft-proto (the current approach). These types provide serialization impls interoperable with CometBFT RPC out of the box. The custom type mapping creates obstacles for reuse across crates, even though such reuse is the entire purpose of the protobuf WKTs. This can be bridged by providing From conversions from and to the workalikes in the other WKT mapping crates.
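
The From bridging mentioned in the last item could look roughly like the sketch below. Both modules are hypothetical stand-ins so that the snippet compiles on its own: custom represents a type defined in cometbft-proto, and workalike represents a type from another WKT mapping crate such as prost-types, whose Timestamp has the same two fields. In a real crate the impls would have to live next to the custom type, most likely behind an optional dependency, because of Rust's orphan rule.

```rust
// Hypothetical stand-in for a WKT type defined in cometbft-proto.
pub mod custom {
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct Timestamp {
        pub seconds: i64,
        pub nanos: i32,
    }
}

// Hypothetical stand-in for the equivalent type in another WKT mapping
// crate (e.g. prost-types), declared locally to keep the sketch self-contained.
pub mod workalike {
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct Timestamp {
        pub seconds: i64,
        pub nanos: i32,
    }
}

// Conversion from the foreign workalike into the custom type.
impl From<workalike::Timestamp> for custom::Timestamp {
    fn from(t: workalike::Timestamp) -> Self {
        Self { seconds: t.seconds, nanos: t.nanos }
    }
}

// Conversion from the custom type back into the foreign workalike.
impl From<custom::Timestamp> for workalike::Timestamp {
    fn from(t: custom::Timestamp) -> Self {
        Self { seconds: t.seconds, nanos: t.nanos }
    }
}
```

With both directions in place, downstream code can call .into() at the boundary between crates that picked different WKT mappings, and the field-by-field copy loses no information.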
As long as RPC-compatible serialization is seen as an important feature, the third approach
appears to be the least problematic option. Protobuf generation scripts for crates up the
Cosmos stack can choose to map google.protobuf.Timestamp and google.protobuf.Duration
to these public types as well.
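
To make the Duration format issue more concrete, here is a minimal sketch of the string form that the standard protobuf JSON mapping prescribes for google.protobuf.Duration: decimal seconds with 0, 3, 6, or 9 fractional digits and an "s" suffix. The Duration struct and the to_standard_json function are hypothetical names used only for illustration; they are not the API of pbjson-types or cometbft-proto, and the format used by CometBFT RPC is not shown here.

```rust
/// Hypothetical stand-in for a Rust mapping of google.protobuf.Duration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Duration {
    pub seconds: i64,
    pub nanos: i32,
}

/// Renders the duration as the string used by the standard protobuf JSON
/// mapping, e.g. "3s" or "1.500s". Assumes `seconds` and `nanos` carry the
/// same sign, as the protobuf definition of Duration requires.
pub fn to_standard_json(d: Duration) -> String {
    let sign = if d.seconds < 0 || d.nanos < 0 { "-" } else { "" };
    let secs = d.seconds.unsigned_abs();
    let nanos = d.nanos.unsigned_abs();

    // Use the shortest of 0, 3, 6, or 9 fractional digits that is exact.
    if nanos == 0 {
        format!("{}{}s", sign, secs)
    } else if nanos % 1_000_000 == 0 {
        format!("{}{}.{:03}s", sign, secs, nanos / 1_000_000)
    } else if nanos % 1_000 == 0 {
        format!("{}{}.{:06}s", sign, secs, nanos / 1_000)
    } else {
        format!("{}{}.{:09}s", sign, secs, nanos)
    }
}
```

A serde implementation that has to interoperate with a different JSON schema cannot reuse this rendering, which is why the choice of WKT mapping matters for RPC compatibility.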
Proposed
By switching to Buf for generating code in cometbft-rs and downstream, we will use a consistent tool set and share the benefits of its ecosystem, such as lints, plugins and best practices of dependency management.
Going forward, more developers working with protobuf are expected to be familiar with the Buf tool and its associated schema registry.
The changes to the representation of the well-known types will break the established API.
- cometbft-rs#3 (tracking issue with some discussion).