changes based on feedback
* Add detailed signature procedure for PUT /v1/providers
* Add error codes
* Remove pagination and limit field
* Many other small changes based on feedback
guseggert committed Oct 19, 2022
1 parent 1d9ec9c commit a94bfb2
Showing 2 changed files with 89 additions and 48 deletions.
42 changes: 29 additions & 13 deletions IPIP/0000-delegated-routing-http-api.md
@@ -20,60 +20,76 @@ but it has resulted in a very unidiomatic HTTP API which is difficult to impleme
The cost of a proper redesign, implementation, and maintenance of Reframe and its implementation is too high relative to the urgency of having a delegated routing HTTP API.

Note that this does not supplant nor deprecate Reframe. Ideally in the future, Reframe and its implementation would receive the resources needed to map the IDL to idiomatic HTTP,
and this spec could then be rewritten in the IDL, maintaining backwards compatibility.
and implementations of this spec could then be rewritten in the IDL, maintaining backwards compatibility.

## Detailed design

See the [API design](../routing/DELEGATED_ROUTING_HTTP.md) included with this IPIP.
See the [Delegated Routing HTTP API design](../routing/DELEGATED_ROUTING_HTTP.md) included with this IPIP.

## Design rationale
To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about:

- Reframe methods are encoded inside messages
- This prevents URL-based pattern matching on methods
- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) are encoded inside messages
- This prevents URL-based pattern matching on methods, which makes it hard and expensive to do basic HTTP scaling and optimizations:
- Configuring different caching strategies for different methods
- Configuring reverse proxies on a per-method basis
- Routing methods to specific backends
- Method-specific reverse proxy config such as timeouts
- Developer UX is poor as a result, e.g. for CDN caching you must encode the entire request message and pass it as a query parameter
- This was initially done by URL-escaping the raw bytes
- Not possible to consume correctly using standard JavaScript
- Not possible to consume correctly using standard JavaScript (see [edelweiss#61](https://github.com/ipld/edelweiss/issues/61))
- Shipped in Kubo 0.16
- Packing a CID into a struct, encoding it with DAG-CBOR, multibase-encoding that, percent-encoding that, and then passing it in a URL, rather than merely passing the CID in the URL, is needlessly complex from a user's perspective
- Added complexity of "Cacheable" methods supporting both POSTs and GETs
- The required streaming support and message groups add a lot of implementation complexity but aren't very useful
- The required streaming support and message groups add a lot of implementation complexity, but streaming does not work for cacheable methods sent over HTTP
- For example, for FindProviders, the response is buffered anyway for ETag calculation
- There are no limits on response sizes nor ways to impose limits and paginate
- This is useful for routers that have highly variable resolution time, to send results as soon as possible, but this is not a use case we are focusing on right now and we can add it later
- The Identify method is not implemented because it is not currently useful
- This is because Reframe's ambition is to be a generic catch-all bag of methods across protocols, while the delegated routing use case only requires a subset of its methods.
- Client and server implementations are difficult to write correctly, because of the non-standard wire formats and conventions
- The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51), and is currently maintained by IPFS Stewards who are already over-committed with other priorities
- Example: [bug reported by implementer](https://github.com/ipld/edelweiss/issues/62), and [another one](https://github.com/ipld/edelweiss/issues/61)
- The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51-L100), and is currently maintained by IPFS Stewards who are already over-committed with other priorities
- Only the HTTP transport has been designed and implemented, so it's unclear if the existing design will work for other transports, and what their use cases and requirements are
- This means Reframe can't be trusted to be transport-agnostic until at least a second transport is implemented (e.g. as a reframe-over-libp2p protocol).

So this API proposal makes the following changes:

- The API is defined in HTTP directly
- "Methods" and cache-relevant parameters are pushed into the URL path
- Streaming support is removed, and optional pagination is added, which limits the response size and provides a scalable mechanism for iterating over arbitrarily-large collections
- The Delegated Routing API is defined using HTTP semantics, and can be implemented without introducing Reframe concepts
- "Method names" and cache-relevant parameters are pushed into the URL path
- Streaming support is removed, and default response size limits are added along with an optional `limit` parameter for clients to specify response sizes
- We might add streaming support with chunked-encoded responses in the future, but it's currently not an important feature for the use cases that an HTTP API will be used for
- Pagination could be added to this in the future, if needed
- Bodies are encoded using standard JSON or CBOR, instead of using IPLD codecs
- JSON uses human-friendly string encodings of common data types
- CIDs are encoded as CIDv1 strings with a multibase prefix (e.g. base32), for consistency with CLIs, browsers, and [gateway URLs](https://docs.ipfs.io/how-to/address-ipfs-on-web/)
- Multiaddrs use the [human-readable format](https://github.com/multiformats/multiaddr#specification) that is used in existing tools and Kubo CLI commands such as `ipfs id` or `ipfs swarm peers`
- Byte array values, such as signatures, are multibase-encoded strings (with an `m` prefix indicating Base64)
- The "Identify" method and "message groups" are removed
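
As an illustration of what pushing "method names" into the URL path enables, here is a minimal sketch of URL-based pattern matching over the paths defined in the accompanying API design. The `ROUTES` table and `match_route` helper are hypothetical names introduced only for this example, and `Ping` is a label chosen here since that endpoint has no Reframe equivalent.

```python
import re

# Hypothetical route table: (HTTP verb, path pattern, method label).
# Labels follow the "Reframe equivalent" names from the API design where
# they exist; "Ping" is an illustrative label only.
ROUTES = [
    ("GET", re.compile(r"^/v1/providers/hashed/(?P<multihash>[^/]+)$"), "FindProviders"),
    ("GET", re.compile(r"^/v1/providers/(?P<cid>[^/]+)$"), "FindProviders"),
    ("PUT", re.compile(r"^/v1/providers$"), "Provide"),
    ("GET", re.compile(r"^/v1/ipns/(?P<id>[^/]+)$"), "GetIPNS"),
    ("POST", re.compile(r"^/v1/ipns$"), "PutIPNS"),
    ("GET", re.compile(r"^/v1/ping$"), "Ping"),
]


def match_route(verb, path):
    """Return (method label, path parameters) for a request, or None."""
    for route_verb, pattern, label in ROUTES:
        if verb != route_verb:
            continue
        found = pattern.match(path)
        if found:
            return label, found.groupdict()
    return None
```

Because the method is visible in the verb and path alone, a reverse proxy or CDN could apply per-method caching, timeouts, or backend routing with the same kind of rules, without decoding any request body.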

### User benefit

The cost of building and operating content routing services will be much lower, as developers will be able to reuse existing industry-standard tooling.
They no longer need to learn Reframe-specific concepts to consume or expose the API.
This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS network
and increasing data availability.

### Compatibility

#### Backwards Compatibility
IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version.
Because the existing Reframe spec can't be safely used in JavaScript, the experimental support for Reframe in Kubo will be removed in the next release,
and delegated routing will subsequently use this HTTP API. We may decide to re-add Reframe support in the future once these issues have been resolved.
Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it,
the experimental support for Reframe in Kubo will be removed in the next release and delegated routing will subsequently use this HTTP API.
We may decide to re-add Reframe support in the future once these issues have been resolved.

#### Forwards Compatibility
Standard HTTP mechanisms for forward compatibility are used--the API is versioned using a version number in the path. The `Accept` and `Content-Type` headers are used for content type negotiation. New methods will result in new paths, and parameters can be added using either new query parameters or new fields in the request/response body. Certain parts of bodies are labeled as "opaque bytes", which are passed through by the implementation, with no schema enforcement.
Standard HTTP mechanisms for forward compatibility are used:
- The API is versioned using a version number in the path
- The `Accept` and `Content-Type` headers are used for content type negotiation
- New methods will result in new paths
- Parameters can be added using either new query parameters or new fields in the request/response body.

Certain parts of bodies are labeled as "{ ... }", which are opaque JSON values passed through by the implementation, with no schema enforcement.
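
The content type negotiation mentioned above could look roughly like the following sketch. It assumes a server that supports only `application/json` and `application/cbor` and falls back to JSON when no `Accept` header is sent. The `negotiate` function is a hypothetical helper, and returning `None` for an unsatisfiable header (which a server might map to a 406 response) is an assumption that this spec does not state.

```python
SUPPORTED = ("application/json", "application/cbor")
DEFAULT = "application/json"


def negotiate(accept_header):
    """Pick a response content type from an Accept header value.

    Returns None when nothing acceptable is supported (assumed behavior).
    """
    if not accept_header:
        return DEFAULT
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by original order in the header.
        candidates.append((-quality, position, media_type))
    for negative_quality, _, media_type in sorted(candidates):
        if negative_quality == 0:
            continue  # q=0 marks a type as not acceptable
        if media_type in SUPPORTED:
            return media_type
        if media_type in ("*/*", "application/*"):
            return DEFAULT
    return None
```

For instance, `negotiate("application/cbor;q=0.5, application/json")` selects JSON because it carries the higher quality value.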

### Security

95 changes: 60 additions & 35 deletions routing/DELEGATED_ROUTING_HTTP.md
@@ -24,78 +24,103 @@
- [Implementations](#implementations)

# API Specification
By default, the Delegated Routing HTTP API uses the `application/json` content type. Clients and servers may optionally negotiate other content types such as `application/cbor`, `application/vnd.ipfs.rpc+dag-json`, etc. using the standard `Accept` and `Content-Type` headers.
The Delegated Routing HTTP API uses the `application/json` content type by default. Clients and servers *should* support `application/cbor`, which can be negotiated using the standard `Accept` and `Content-Type` headers.

## Common Data Types:

- CIDs are always encoded using a [multibase](https://github.com/multiformats/multibase)-encoded [CIDv1](https://github.com/multiformats/cid#cidv1).
- Multiaddrs are encoded according to the [human-readable multiaddr specification](https://github.com/multiformats/multiaddr#specification)
- Peer IDs are encoded according to the [PeerID string representation specification](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation)
- Multibase bytes are encoded according to [the Multibase spec](https://github.com/multiformats/multibase), and *should* use Base64.
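
A minimal sketch of the "multibase bytes" encoding follows, assuming the `m` prefix, which in the Multibase table denotes RFC 4648 Base64 without padding. The helper names are hypothetical, and a complete implementation would need to accept the other Multibase prefixes as well.

```python
import base64


def multibase_b64_encode(data: bytes) -> str:
    """Encode bytes as a multibase string with the 'm' (base64, no padding) prefix."""
    return "m" + base64.b64encode(data).decode("ascii").rstrip("=")


def multibase_b64_decode(text: str) -> bytes:
    """Decode a multibase string; only the 'm' prefix is handled in this sketch."""
    if not isinstance(text, str) or not text.startswith("m"):
        raise ValueError("only the 'm' (base64, no padding) prefix is handled here")
    body = text[1:]
    body += "=" * (-len(body) % 4)  # restore the padding that multibase omits
    return base64.b64decode(body)
```

For example, `multibase_b64_encode(b"hello")` yields `"maGVsbG8"`, and decoding that string returns the original bytes.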

## API
- `GET /v1/providers/{CID}`
- Reframe equivalent: FindProviders
- Response

```json
{
"Providers": [
{
"PeerID": "...",
"Multiaddrs": ["...", "..."]
"Protocols": [
{
"Codec": 2320,
"Payload": <opaque data>
}
]
}
]
"NextPageToken": "<token>"
"Providers": [
{
"PeerID": "12D3K...",
"Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."],
"Protocols": [
{
"Codec": 2320,
"Payload": { ... }
}
]
}
]
}
```

- Default limit: 100 providers
- Optional query parameters
- `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of multicodec IDs such as `2304,2320`,
- `transport` only return providers whose published multiaddrs explicitly support the passed transport protocols, such as `/quic` or `/tls/ws`.
- `GET /v1/providers/hash/{multihash}`
- This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a multihash
- `transfer` only return providers who support the passed transfer protocols, expressed as a comma-separated list of [multicodec codes](https://github.com/multiformats/multicodec/blob/master/table.csv) in decimal form such as `2304,2320`
- `transport` only return providers whose published multiaddrs explicitly support the passed transport protocols, such as `460,478` (`/quic` and `/tls/ws`)
- Servers should treat the multicodec codes used in the `transfer` and `transport` parameters as opaque, and not validate them, for forwards compatibility
- `GET /v1/providers/hashed/{multihash}`
- This is the same as `GET /v1/providers/{CID}`, but takes a hashed CID encoded as a [multihash](https://github.com/multiformats/multihash/)
- `GET /v1/ipns/{ID}`
- Reframe equivalent: GetIPNS
- `ID`: multibase-encoded bytes
- Response
- record bytes
- `POST /v1/ipns/{ID}`
- `POST /v1/ipns`
- Reframe equivalent: PutIPNS
- Body
- record bytes
- No need for idempotency
- `PUT /v1/providers/{CID}`
```json
{
"Records": [
{
"ID": "multibase bytes",
"Record": "multibase bytes"
}
]
}
```
- Not idempotent (this doesn't really make sense for IPNS)
- Default limit of 100 records per request
- `PUT /v1/providers`
- Reframe equivalent: Provide
- Body

```json
{
"Signature": "multibase bytes",
"Payload": "multibase bytes"
}
```
- `Payload` is a multibase-encoded JSON object with the following schema:
```json
{
"Keys": ["cid1", "cid2"],
"Timestamp": 1234,
"AdvisoryTTL": 1234,
"Signature": "multibase bytes",
"Provider": {
"Peer": {
"ID": "peerID",
"Addrs": ["multiaddr1", "multiaddr2"]
},
"Protocols": [
"PeerID": "12D3K...",
"Multiaddrs": ["/ip4/.../tcp/.../p2p/...", "/ip4/..."],
"Protocols": [
{
"Codec": 1234,
"Payload": <opaque data>
"Payload": { ... }
}
]
}
}
```

- `Signature` is a multibase-encoded signature of the bytes of the `Payload` field, signed using the private key of the Peer ID specified in the `Payload`. See the [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#keys) specification for the encoding of Peer IDs. Servers must verify the payload by deserializing it, extracting the public key from the Peer ID, and verifying the signature against the `Payload` bytes. If the verification fails, the server must return a 403 status code.
- Idempotent
- Default limit of 100 keys per request
- `GET /v1/ping`
- This is absent from Reframe but is necessary for supporting e.g. the accelerated DHT client which can take many minutes to bootstrap
- This is absent from Reframe but is necessary for supporting e.g. the accelerated DHT client which can take many minutes to bootstrap, and light clients who want to probe multiple HTTP endpoints and use the fastest one
- Returns 200 once the server is ready to accept requests
- An alternative approach is an orchestration dance in which the server does not listen on the socket until its dependencies are ready, but a ping endpoint makes this “dance” easier to implement
- Pagination
- Limits
- Responses with collections of results must have a default limit on the number of results that will be returned in a single response
- Servers may optionally implement pagination by responding with an opaque page token which, when provided as a subsequent query parameter, will fetch the next page of results.
- Clients may continue paginating until no `NextPageToken` is returned.
- Clients making calls that return collections may limit the number of per-page results returned with the `limit` query parameter, i.e. `GET /v1/providers/{CID}?limit=10`
- Additional filtering/sorting operations may be defined on a per-path basis, as needed
- Pagination and/or dynamic limit configuration may be added to this spec in the future, once there is a concrete requirement
- Error Codes
- A 404 must be returned if a resource was not found
- A 501 must be returned if a method is not supported
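
The signature procedure for `PUT /v1/providers` can be sketched as below, under several stated assumptions. The signed bytes are taken to be the serialized JSON payload before multibase encoding. The Peer ID is read from `Provider.PeerID`, following the added lines of the payload schema. A malformed body maps to a 400, which is a choice made for this sketch (the spec only mandates 403 on verification failure). The `sign` and `verify` callables are hypothetical stand-ins for real Peer ID key operations, which need a libp2p-compatible crypto library and are out of scope here.

```python
import base64
import json


def mb_encode(data: bytes) -> str:
    """Multibase-encode bytes with the 'm' (base64, no padding) prefix."""
    return "m" + base64.b64encode(data).decode("ascii").rstrip("=")


def mb_decode(text) -> bytes:
    """Decode an 'm'-prefixed multibase string (other prefixes not handled)."""
    if not isinstance(text, str) or not text.startswith("m"):
        raise ValueError("expected an 'm'-prefixed multibase string")
    body = text[1:]
    return base64.b64decode(body + "=" * (-len(body) % 4))


def build_provide_request(record: dict, sign) -> dict:
    """Client side: serialize the record, sign the bytes, wrap both in the envelope.

    `sign` is a hypothetical callable: bytes -> signature bytes.
    """
    payload_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")
    return {
        "Signature": mb_encode(sign(payload_bytes)),
        "Payload": mb_encode(payload_bytes),
    }


def handle_provide(body: dict, verify) -> int:
    """Server side: return an HTTP status code for a PUT /v1/providers body.

    `verify` is a hypothetical callable: (peer_id, payload_bytes, signature) -> bool.
    It is expected to derive the public key from the Peer ID.
    """
    try:
        payload_bytes = mb_decode(body["Payload"])
        signature = mb_decode(body["Signature"])
        record = json.loads(payload_bytes)
        peer_id = record["Provider"]["PeerID"]
    except (KeyError, TypeError, ValueError):
        return 400  # assumption: the spec does not name a code for malformed bodies
    if not verify(peer_id, payload_bytes, signature):
        return 403  # required by the spec when verification fails
    return 200
```

The server verifies the signature against the exact payload bytes it received rather than a re-serialized copy, since JSON re-encoding is not guaranteed to reproduce the bytes that were signed.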
