Flesh out the IPFS MSC, add pinning support, add alternatives
DarkKirb committed May 30, 2023
1 parent 5131f0f commit d8516e5
Showing 1 changed file with 43 additions and 11 deletions.
54 changes: 43 additions & 11 deletions proposals/2706-IPFS.md
@@ -23,20 +23,22 @@ mostly focuses on introducing IPFS in a backwards compatible manner.

**TODO: Decide if not using `ipfs://` is a mistake.**

IPFS uses "content IDs" (or "cid") to reference media which are compatible with Matrix's media IDs (**TODO: CONFIRM**),
making the process even easier to migrate. To support backwards compatability with older clients
IPFS uses "content IDs" (or "cid") to reference media, the URL-safe[^1] representations being compatible with MXC URIs.
This makes the process even easier to migrate. To support backwards compatibility with older clients
and servers, the media ID is proposed to be formatted as `ipfs:<cid>` for IPFS-hosted media. This
will allow legacy servers and clients to contact their homeserver and resolve it to an IPFS gateway
to be served while indicating to supporting implementations that they do not need to contact the
origin server and can instead use IPFS directly to retrieve the media.

[^1]: URL-safe characters being characters you can use in the path component of a URI without escaping, c.f. Section 3.3 of RFC 2396. In particular, the identity, base64, base64pad, and base256emoji representations of CIDv1s are not supported.

For completeness, an example IPFS-styled MXC URI would be `mxc://example.org/ipfs:cidgoeshere`.
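
As an illustration of the `ipfs:<cid>` media ID format, the following is a minimal sketch of building and parsing such a URI. The function names are hypothetical and not part of the proposal. The rejected multibase prefixes are an assumption taken from the multibase table for the four encodings the footnote lists as unsupported; no CID validation beyond that prefix check is attempted.

```python
from typing import Optional, Tuple

# Assumed multibase prefixes for the encodings the footnote excludes:
# identity (NUL), base64 ("m"), base64pad ("M"), base256emoji (rocket emoji).
UNSUPPORTED_MULTIBASE_PREFIXES = ("\x00", "m", "M", "\U0001F680")


def build_ipfs_mxc(origin: str, cid: str) -> str:
    """Build an IPFS-styled MXC URI (hypothetical helper)."""
    if not cid or cid.startswith(UNSUPPORTED_MULTIBASE_PREFIXES):
        raise ValueError("CID is empty or uses a non-URL-safe representation")
    return f"mxc://{origin}/ipfs:{cid}"


def parse_ipfs_mxc(uri: str) -> Optional[Tuple[str, str]]:
    """Return (origin, cid) for an IPFS-styled MXC URI, else None."""
    scheme = "mxc://"
    if not uri.startswith(scheme):
        return None
    origin, sep, media_id = uri[len(scheme):].partition("/")
    if not sep or not origin or not media_id.startswith("ipfs:"):
        return None
    cid = media_id[len("ipfs:"):]
    return (origin, cid) if cid else None
```

A supporting client could use the parsed CID to fetch from IPFS directly, while a legacy client would treat the whole `ipfs:<cid>` string as an opaque media ID.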

Because clients can embed an IPFS node into themselves or [access IPFS from the browser](https://github.com/ipfs/in-web-browsers/blob/master/ADDRESSING.md),
it would be extremely useful to allow the client to bypass the `/upload` endpoint and publish its
own MXC URI after having used a local IPFS node. Considering `ipfs://` support is not proposed here,
clients will need to get a homeserver name/origin to put into the `mxc://` URI. They'll also need to
know if the server even supports IPFS to be able to bypass `/upload` entirely.
Because clients can embed an IPFS node into themselves or [access IPFS from the browser](https://github.com/ipfs/in-web-browsers/blob/master/ADDRESSING.md), a media repo implementing this MSC should
allow the client to bypass the `/upload` endpoint, and allow the client to pin a CID to the media server’s IPFS node.
Considering `ipfs://` support is not proposed here, clients will need to get a homeserver name/origin to put into the `mxc://` URI. They'll also need to know if the server even supports IPFS to be able to bypass `/upload` entirely.

It is not recommended that desktop and especially mobile clients skip the `/upload` endpoint by default; see the Potential Issues and Security considerations sections below.

To permit the bypass of `/upload`, a new capability is proposed: `m.ipfs`. When present, this indicates
to the client that the server's media repo is IPFS-capable and thus can be bypassed. Clients will still
@@ -47,6 +49,25 @@ determine an appropriate origin:
2. The origin specified by the optional `preferred_origin` in the `m.ipfs` capability.
3. The domain name for the user's ID, as a default option.
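
The two fallbacks visible in this hunk can be sketched as follows. The first option in the list is not shown in this diff and is deliberately left out. The function name is hypothetical, and `capabilities` is assumed to be the inner `capabilities` object of the homeserver's capabilities response.

```python
def fallback_origin(capabilities: dict, user_id: str) -> str:
    """Pick an origin using only options 2 and 3 from the list above.

    Assumption: `capabilities` maps capability names to their objects,
    e.g. {"m.ipfs": {"preferred_origin": "media.example.org"}}.
    """
    ipfs_capability = capabilities.get("m.ipfs") or {}
    preferred = ipfs_capability.get("preferred_origin")
    if preferred:
        # Option 2: the server's optional preferred origin.
        return preferred
    # Option 3: the domain part of the user ID (@localpart:domain).
    # Split only on the first colon so a port in the domain is kept.
    return user_id.split(":", 1)[1]
```

For example, `fallback_origin({"m.ipfs": {}}, "@alice:example.org")` falls through to the user's own domain.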

### `POST /_matrix/media/v3/pin/{cid}`

Uploads a media file to the media repo by CID. Like `/upload`, this endpoint requires
auth, can be rate limited, and returns the `content_uri` that can be used in
events.

The request body should be an empty JSON object, but can be extended by other MSCs.

This call ensures that the media file that is being sent by a client continues to be available, even
if other nodes no longer know about the data.

If the CID cannot be resolved within 60s, an `M_NOT_FOUND` error is returned. The client may retry
the call in that event.

For other errors, such as file size, file type or user quota errors, the normal
`/upload` rules apply.

Any media retention, quarantining, and redaction rules that the server software provides also apply to media uploaded through this endpoint.
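
A client-side call to this endpoint might look like the sketch below, which only constructs the request and does not send it. The helper names are hypothetical. Passing the access token as a `Bearer` header is an assumption based on how other authenticated Matrix endpoints behave; the proposal only says that auth is required.

```python
import json
import urllib.request
from urllib.parse import quote


def build_pin_request(base_url: str, cid: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a pin request for the given CID."""
    url = f"{base_url.rstrip('/')}/_matrix/media/v3/pin/{quote(cid, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # empty JSON object as the body
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",  # assumed auth scheme
        },
        method="POST",
    )


def may_retry(error_body: dict) -> bool:
    """True if the error is the CID resolution timeout the client may retry."""
    return error_body.get("errcode") == "M_NOT_FOUND"
```

On success the response is expected to carry a `content_uri`, as with `/upload`; other error codes follow the normal `/upload` rules and are not retried by `may_retry`.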

----

This proposal does encourage that client implementation embed IPFS support to avoid having to contact
@@ -59,19 +80,30 @@ homeserver.

**TODO: Investigate ways to mitigate.**

* Retention and redaction, erasure.
* Spam, abuse, etc
* Quarantining content (not currently specified, but should be considered).
* Media cannot be fully removed, as while you can block it from the media repo, you cannot guarantee that something is removed from IPFS.
* The Download API can block CIDs from being downloaded through it; however, clients can still fetch them via IPFS. This also applies to quarantining.
* When skipping the `/upload` endpoint, a potentially large number of clients may try to download the media through IPFS. This might require several times more bandwidth than a traditional upload.
* The IPFS node is intended to be run continuously in the background. This is, simply put, incompatible with the computing models of smartphones.
* IPFS takes some time to find peers, especially when they are behind NAT.

## Alternatives

**TODO: Find other solutions than IPFS and explain why they're bad.**

* [MSC2846](https://github.com/matrix-org/matrix-spec-proposals/pull/2846) proposes that the Media ID is replaced with a CID, but doesn’t propose any IPFS interoperability. Another difference is that the CID isn’t namespaced.
* [MSC2834](https://github.com/matrix-org/matrix-spec-proposals/pull/2834) is an abandoned proposal for changing the Media ID format to be a hash of the contents. This avoids a dependency on an additional external protocol.
* There are also a number of other peer-to-peer file sharing protocols. Many of the issues mentioned here likely apply to them too, with tradeoffs with respect to maturity, software support, network support, speed of discovery, etc.

## Security considerations

**TODO: This.**

* IPFS is not a private protocol. The IP addresses of the peers are public. This is a privacy concern on clients, but might also affect servers, especially ones behind reverse-proxies and load balancers. It can be mitigated by using a remote IPFS node.
* As IPFS files are inherently public, any MSC that adds access control to individual media files would not be compatible with IPFS links.
* CIDs reveal some information about the data in the file. In most cases this is some form of hash, but for small files a CID can embed the whole file. Either way, the full file is accessible knowing only its CID.

## Unstable prefix

While this MSC is not in a released version of the spec, `io.t2bot.ipfs` should be used in place of
`m.ipfs`. No special endpoints, version flags, or other prefixes are required for this MSC.
`m.ipfs`. The pin endpoint should be named `/_matrix/media/unstable/rs.chir.matrix.msc2706/pin/{cid}`.
No special version flags or other prefixes are required for this MSC.
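
One way a client could choose between the stable and unstable names is sketched below. The function name is hypothetical, and pairing the unstable capability with the unstable endpoint is an assumption; the proposal does not state how a client should pick between them.

```python
STABLE_CAPABILITY = "m.ipfs"
UNSTABLE_CAPABILITY = "io.t2bot.ipfs"
STABLE_PIN_PATH = "/_matrix/media/v3/pin/{cid}"
UNSTABLE_PIN_PATH = "/_matrix/media/unstable/rs.chir.matrix.msc2706/pin/{cid}"


def pin_path_for(capabilities: dict, cid: str):
    """Return the pin endpoint path to use, or None if IPFS is not advertised.

    Assumption: the stable capability implies the stable endpoint and the
    unstable capability implies the unstable endpoint.
    """
    if STABLE_CAPABILITY in capabilities:
        return STABLE_PIN_PATH.format(cid=cid)
    if UNSTABLE_CAPABILITY in capabilities:
        return UNSTABLE_PIN_PATH.format(cid=cid)
    return None
```

A `None` result means the server did not advertise IPFS support, so the client should fall back to the regular `/upload` endpoint.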
