Add multigram draft 1 #123

Closed
wants to merge 3 commits into from
96 changes: 96 additions & 0 deletions multigram/README.md
@@ -0,0 +1,96 @@
# Multigram -- protocol negotiation and multiplexing over datagrams

To multiplex different protocols on the same datagram connection, multigram prepends a 1-byte header to every packet. This header is an index into a table of protocols shared between both endpoints. The protocol table is negotiated by exchanging proposals and keeping the intersection of both endpoints' supported protocols. The table's size of 256 tuples can be increased by nesting multiple multigram headers.
Member

Why not use a varint instead of a single byte?

Author

Parsing is not so much a problem, but the per-packet overhead should be constant -- for some cases, we'll want to inform the upper protocols about the overhead, so they can avoid fragmentation, or batch multiple small messages into one packet.

We could fit a larger table into 2 bytes, but I expect that a connection with more than 255 protocols on the same layer will be extremely rare. These two bytes multiply the deeper you nest protocols. In IPFS we currently have 3 layers of multistream, I think?

Does this make any sense?

Author

The maximum packet size would be bubbled up from the transports through all the protocol wrappers, e.g. 512 (UDP) - 1 (multigram) - 20 (cryptoauth) - 1 (multigram) - 12 (switch) - 1 (multigram) = 477
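
The subtraction above can be sketched as a fold over the wrapper stack. The layer names and overheads are the figures quoted in this comment; `payload_budget` and `LAYERS` are hypothetical names used only for illustration, not part of any implementation.

```python
# Sketch: bubble the maximum packet size up through the protocol wrappers.
# Overheads (in bytes) are the example figures from the comment above.
LAYERS = [
    ("multigram", 1),
    ("cryptoauth", 20),
    ("multigram", 1),
    ("switch", 12),
    ("multigram", 1),
]


def payload_budget(transport_mtu, layers):
    """Return the number of bytes left for the innermost protocol."""
    budget = transport_mtu
    for name, overhead in layers:
        budget -= overhead
        if budget <= 0:
            raise ValueError(f"no room left after layer {name!r}")
    return budget


print(payload_budget(512, LAYERS))  # 477
```

Because every multigram header is exactly one byte, each wrapper can compute its budget without looking at connection state.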

Member

hrm... okay. I see the importance of the single byte, but you do mention that you can 'expand' the table by nesting. Nesting seems more complicated than using a varint and informing other layers how much space they have left to consume

Author

Yes, with a good night's sleep, you're absolutely right.

Author

To explain what my concern was: I was thinking the varint width would vary on a packet-per-packet basis, while it'll really only change if you add protocols to the protocol table (i.e. change the multigram connection's state).
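
To illustrate the point about width, here is a minimal sketch assuming an LEB128-style unsigned varint (7 payload bits per byte); the thread does not fix a particular varint encoding, so that choice is an assumption. `uvarint` and `header_width` are hypothetical helpers.

```python
def uvarint(n):
    """Encode a non-negative integer as an unsigned LEB128-style varint."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def header_width(table_size):
    """Width in bytes of the widest index header for a table of this size."""
    return len(uvarint(max(table_size - 1, 0)))


for size in (1, 128, 129, 16384, 16385):
    print(size, header_width(size))
```

Under this assumption the header stays one byte wide until the table holds more than 128 entries, so the width only changes when the table grows, not from packet to packet.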

Member

@Kubuxu Jul 8, 2016

I think a varint will make packet handling more complex than it should be: you introduce offset handling and unframing that depend on the state of the connection.
If the state goes out of sync for whatever reason, getting it back in sync will be extremely hard.
Having a constant-size header is a very nice feature.

Also, 255 overlay protocols (negotiated on an as-needed basis) should be more than enough; if more are needed, layering is still an option.

Author

Mh yep we're trading off between a complication of header parsing, and a complication of protocol semantics. We should decide whether we favor performance (int8 field and nested multigrams) over a simpler protocol (varint and no nesting).

We should figure out exactly how complicated nesting is (make experiments), and also how heavy parsing a varint is vs. an int8 (benchmarks). My gut feeling is that we should probably go with an int8, because we'll have to do this for every packet, and multiple times per packet in many use cases.

Not-so-appealing idea: a varint degrades to an int8 nicely, so there might be a way to leave it up to implementations.


TODO: analyze the properties vs. other approaches (e.g. an identifier on every packet).

# Packet layout

```
   1               2               3               4
   0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0 |  Table Index  |                                               |
  +-+-+-+-+-+-+-+-+             Variable-length data              +
4 |                                                               |
  +
```
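
A minimal sketch of the framing described by this layout, assuming the index is a single unsigned byte followed directly by the payload. `encode_packet` and `decode_packet` are hypothetical names, not an existing API.

```python
def encode_packet(table_index, payload):
    """Prepend the 1-byte table index to the payload."""
    if not 0 <= table_index <= 0xFF:
        raise ValueError("table index must fit in one byte")
    return bytes([table_index]) + payload


def decode_packet(packet):
    """Split a packet into (table_index, payload)."""
    if not packet:
        raise ValueError("empty packet has no multigram header")
    return packet[0], packet[1:]


packet = encode_packet(0x02, b"hello")
print(decode_packet(packet))  # (2, b'hello')
```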

## Protocol table

The protocol table MUST be append-only, and existing entries MUST be immutable. It MUST initially contain exactly one tuple:

```
0x00,/multigram-setup/0.1.0
```

The `/multigram-setup` protocol is used for appending to the shared protocol table.

```
   1               2               3               4
   0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0 |     0x00      |                                               |
  +-+-+-+-+-+-+-+-+            Operation as Multicodec            +
4 |                                                               |
  +
```

- Note how the Table Index field is set to `0x00`, selecting the `/multigram-setup` protocol.
- The Operation field MUST support at least the `/cbor/` multicodec format. It SHOULD support the `/protobuf/` and `/json/` formats.
Member

@Kubuxu Jul 8, 2016

I would love to see bencode as the main format, for its simplicity, which fits the data we will be sending in this protocol. The CBOR, protobuf and JSON definitions are all tens of pages long. In contrast, bencode is short both to define and to write a parser for (most implementations are under 500 LOC).

http://jonas.nitro.dk/bittorrent/bittorrent-rfc.html#anchor7
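
As a rough illustration of how small a bencode encoder can be, here is a sketch that encodes integers, strings, lists and dictionaries. It is not taken from any existing implementation; `bencode` is a hypothetical helper, and encoding text as UTF-8 is an assumption.

```python
def bencode(value):
    """Encode ints, str/bytes, lists and dicts using bencode."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(bencode({"0x01": "/foo/1.0.0"}))  # b'd4:0x0110:/foo/1.0.0e'
```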

Author

Yeah we should totally discuss which formats MUST and SHOULD be included, and I agree that bencode should be in there somewhere.

- In the future, an `/ipfs/` format can be used to resolve code and specification for supporting the format.

## Protocol table operations / multigram-setup

Either endpoint can send `append` proposals, and the other endpoint will reply with the result, based on their own supported protocols.

(1) Endpoint A sends a proposal to Endpoint B. A doesn't commit this proposal to its own view of the protocol table. It dismisses it right away and waits for B's reply.
```
0x00
/json/
{"0x01":"/foo/1.0.0","0x02":"/bar/1.0.0","0x03":"/baz/1.0.0"}
```

(2) Endpoint B forms the intersection of this proposal and its own supported protocols, appends to its own view of the protocol table, and replies.
Member

@Kubuxu Jul 8, 2016

The problem I see with this: how do we react when both sides start appending at the same time?
Also, as we will be using it over bare UDP, we should deal with datagram loss and reordering.

Author

The endpoint should respond with some kind of error if it gets a proposal which conflicts with its own pending proposal. So these two would cancel each other out.

```
0x00
/json/
{"0x01":"/foo/1.0.0","0x02":"/bar/1.0.0"}
```
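
Step (2) can be sketched as follows. This is only an illustration under stated assumptions: the table is held as a dict from index to protocol name, and proposed indexes that are already taken are silently skipped, which the draft does not specify. `handle_proposal` is a hypothetical name.

```python
SETUP = "/multigram-setup/0.1.0"


def handle_proposal(table, supported, proposal):
    """Append the supported part of a proposal to `table` and return the reply.

    table:     dict mapping index (int) to protocol name, B's view of the table
    supported: set of protocol names B supports
    proposal:  decoded operation data, e.g. {"0x01": "/foo/1.0.0"}
    """
    accepted = {}
    for key, protocol in proposal.items():
        if not key.startswith("0x"):
            continue  # non-index fields are not considered for appending
        index = int(key, 16)
        # Assumption: taken indexes and unsupported protocols are skipped.
        if index in table or protocol not in supported:
            continue
        table[index] = protocol
        accepted[key] = protocol
    return accepted


table_b = {0x00: SETUP}
reply = handle_proposal(
    table_b,
    {"/foo/1.0.0", "/bar/1.0.0"},
    {"0x01": "/foo/1.0.0", "0x02": "/bar/1.0.0", "0x03": "/baz/1.0.0"},
)
print(reply)  # {'0x01': '/foo/1.0.0', '0x02': '/bar/1.0.0'}
```

Endpoint A would then append exactly the entries in the reply, so both views of the table end up identical.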

(3) We can list the table by sending the other endpoint an empty proposal.
```
0x00
/json/
{}
```

(4) The protocol table now contains `0x00`, `0x01`, and `0x02`.
```
0x00
/json/
{"0x01":"/foo/1.0.0","0x02":"/bar/1.0.0"}
```

Any field in the operation data whose name starts with `0x` is considered for appending.
Any other field names can be used e.g. for checksums, detecting packet loss, or communicating errors.
This is subject to updates of the multigram-setup protocol.

## Nested multigrams

The protocol tables of nested multigrams can be set up within one packet.
This works because any trailing data will be processed after the table operation.

```
0x00
/json/
{"0x01":"/multigram/0.1.0"}
0x0100
/json/
{"0x01":"/multigram/0.1.0"}
0x010100
/json/
{"0x01":"/ipfs/identify/1.0.0","0x02":"/fc00/iptunnel/0.1.0","0x03":"/fc00/pathfinder/0.1.0"}
```

Packets starting with `0x010103` would belong to `/fc00/pathfinder`.
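
A sketch of how a receiver might resolve nested headers, assuming each `/multigram/0.1.0` entry owns a child table of its own. The `Table` class and `resolve` function are hypothetical and only model the lookup, not the setup exchange.

```python
MULTIGRAM = "/multigram/0.1.0"
SETUP = "/multigram-setup/0.1.0"


class Table:
    """Append-only protocol table; nested multigrams get a child table."""

    def __init__(self):
        self.entries = {0x00: SETUP}
        self.children = {}

    def append(self, index, protocol):
        if index in self.entries:
            raise ValueError("table is append-only; index already taken")
        self.entries[index] = protocol
        if protocol == MULTIGRAM:
            self.children[index] = Table()


def resolve(table, packet):
    """Consume one header byte per nesting level; return (protocol, payload)."""
    offset = 0
    while True:
        if offset >= len(packet):
            raise ValueError("packet ends inside the multigram headers")
        index = packet[offset]
        offset += 1
        if index not in table.entries:
            raise KeyError(f"unknown table index 0x{index:02x}")
        if index not in table.children:
            return table.entries[index], packet[offset:]
        table = table.children[index]


root = Table()
root.append(0x01, MULTIGRAM)
root.children[0x01].append(0x01, MULTIGRAM)
inner = root.children[0x01].children[0x01]
inner.append(0x01, "/ipfs/identify/1.0.0")
inner.append(0x02, "/fc00/iptunnel/0.1.0")
inner.append(0x03, "/fc00/pathfinder/0.1.0")

print(resolve(root, bytes([0x01, 0x01, 0x03]) + b"payload"))
# ('/fc00/pathfinder/0.1.0', b'payload')
```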