Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meta: Specification lies/omissions/ambiguity #182

Open
kegsay opened this issue Apr 28, 2020 · 18 comments
Open

Meta: Specification lies/omissions/ambiguity #182

kegsay opened this issue Apr 28, 2020 · 18 comments

Comments

@kegsay
Copy link
Member

kegsay commented Apr 28, 2020

This issue tracks all issues we've found in the spec where the spec is:

  • Wrong.
  • Ambiguous.
  • Not phrased as well as it could be.

If people are feeling particularly inclined, please PR on https://github.com/matrix-org/matrix-doc/pulls these issues and xref them here so we know it's done. Thanks!

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Federation should always be referring to the state before/after an event, not "at" an event

Specifically for the PDU checks on https://matrix.org/docs/spec/server_server/latest#checks-performed-on-receipt-of-a-pdu

Passes authorization rules based on the state at the event, otherwise it is rejected.

  • /state_ids is BEFORE the event is applied, you need to hit /event/{eventID} if you want to apply it.

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Backfill clarity GET /_matrix/federation/v1/backfill/{roomId}

Currently it just says:

Retrieves a sliding-window history of previous PDUs that occurred in the given room. Starting from the PDU ID(s) given in the v argument, the PDUs that preceded it are retrieved, up to the total number given by the limit.

We should add that:

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Terminology of dropped vs rejected

PDU checks mention this https://matrix.org/docs/spec/server_server/latest#checks-performed-on-receipt-of-a-pdu but do not clarify what they mean. @richvdh says:

  • dropped: ignored altogether
  • rejected: stashed in the database with a "rejected" flag

basically if the signature is good, no amount of re-requesting it from other sources is going to make it get better
whereas if it's not a valid signed event, it could just be a random server talking crap

When asked why it is stored in the database, @erikjohnston replied:

It can still be referenced in the DAG by other servers
But also, just because the event is rejected doesn't mean its useless
(eg the prev events might still be useful for ordering purposes, and for figuring out state of any events pointing at it)

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Clarify GET /_matrix/federation/v1/event_auth/{roomId}/{eventId} and "auth chain"

It currently returns an auth_chain which is []PDU with the following explanation:

The full set of authorization events that make up the state of the room, and their authorization events, recursively. Note that events have a different format depending on the room version - check the room version specification for precise event formats.

The phrase "that make up the state of the room" is unclear and made me think it was the entire room state. In actuality, it just means the auth chain for the {eventId} specified.

"Auth chain" is also poorly defined, which aiui is:

each event has a set of auth_events, and those events have their own auth_events recursively to the start of the room. The auth_chain is the full set of these events.

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Clarify behaviour of "soft failed"

The spec says two contradictory statements:

When an event is "soft failed" it should not be relayed to the client

and

Because soft failed state events participate in state resolution as normal, it is possible for such events to appear in the current state of the room. In that case the client should be told about the soft failed event in the usual way (e.g. by sending it down in the state section of a sync response).

@erikjohnston clarified with:

Yeah, the server ignores soft failed events as much as possible, unless another valid event comes in which references it and state res on the new forward edges include it
So: it's included in state if the current state of the room has the event in
But that should only happen if some other event comes along and points at it
(as soft failed events don't update the forward extremities of the room)

What remains unclear is that does this means all non-state events marked as soft-failed are NEVER sent to the client (as only state events which pass state res can be sent to the client)?

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Clarify why GET /_matrix/key/v2/query/{serverName}/{keyId} returns a list of ServerKeys

It is not a batch endpoint, and current/expired keys are bundled into the object ServerKeys so it's strange that it returns [Server Keys]. Hitting this endpoint on matrix.org returns a single element:

GET https://matrix.org/_matrix/key/v2/query/matrix.org/ed25519:auto

{
  "server_keys": [
    {
      "old_verify_keys": {
        "ed25519:auto": {
          "expired_ts": 1576767829750,
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      },
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:a_RXGa": "LTpifn97TqI3voj2KVo9V84B/2+159Z9Pe+hZ0N0SvmBTw+kno8dsDaPocuy/3KP5gWl/GS7pZetdq/eVL6HCw",
          "ed25519:auto": "Akr/kDr7GjTd1D42MKyrNsYkceiRVy9qCdYR+JDlMR1o5GsNur1tlIESTSGVmVvhn+SZXxOmyDme2AQoPJ7gBA"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "Pq3kW4n2StFnOFtA0PTiXIk0HZF6jwLyR8tMlaRko6I"
        },
        {
          "sha256": "mVNAmfjFleUpJseTNjU5EhJAlfl5rjb7JJPG2UeH9eY"
        }
      ],
      "valid_until_ts": 1588133666152,
      "verify_keys": {
        "ed25519:a_RXGa": {
          "key": "l8Hft5qXKn1vfHrg3p4+W8gELQVo8N13JkluMfmn2sQ"
        }
      }
    }
  ]
}

but note that the {keyId} form is deprecated:

Deprecated. Servers should not use this parameter and instead opt to return all keys, not just the requested one. The key ID to look up. When excluded, the trailing slash on this endpoint is optional.

So using the same URL without the keyID:

GET https://matrix.org/_matrix/key/v2/query/matrix.org

{
  "server_keys": [
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "4axC5ChqFENZAcPPJt7U+DHaYvDevBdBedvaqQ4pzos3+5gKKJZPqd1nYktpOf0mxE5r9mYNECSvaRFDhXrVCg"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "MZh6BVtbX/L5pr7sP9XZWvI9725rdjZFBP0qtSRrH3o"
        }
      ],
      "valid_until_ts": 1498635088048,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "u9ULjmMRDkPaljVEf+htL1PU35v2KbdNR9Xwqk5qIZu1JHbeXjAAdBv0+z2yy7a/qTb2XIZrjNBcKYog35PBDQ"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "FpH5W+0ReH5l8wDk9pG0yZ2hEgV2ccviX2q4h74o7JQ"
        }
      ],
      "valid_until_ts": 1472788820145,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "10ypeNZl3ND8KwZQC7h7gwqRtRXHCCZ/65vTUXG7l4XrB6uy5Ewc6qzxS3kuUTF2K2ioOtSj4Y3eMdfLm4rgAg"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "2ccnUfE5U9NBUkQjN1gSD0J4cMXDtcADC+d5Bg5n/CA"
        },
        {
          "sha256": "Pq3kW4n2StFnOFtA0PTiXIk0HZF6jwLyR8tMlaRko6I"
        }
      ],
      "valid_until_ts": 1555541841432,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {
        "ed25519:auto": {
          "expired_ts": 1576767829750,
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      },
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:a_RXGa": "LTpifn97TqI3voj2KVo9V84B/2+159Z9Pe+hZ0N0SvmBTw+kno8dsDaPocuy/3KP5gWl/GS7pZetdq/eVL6HCw",
          "ed25519:auto": "Akr/kDr7GjTd1D42MKyrNsYkceiRVy9qCdYR+JDlMR1o5GsNur1tlIESTSGVmVvhn+SZXxOmyDme2AQoPJ7gBA"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "Pq3kW4n2StFnOFtA0PTiXIk0HZF6jwLyR8tMlaRko6I"
        },
        {
          "sha256": "mVNAmfjFleUpJseTNjU5EhJAlfl5rjb7JJPG2UeH9eY"
        }
      ],
      "valid_until_ts": 1588133666152,
      "verify_keys": {
        "ed25519:a_RXGa": {
          "key": "l8Hft5qXKn1vfHrg3p4+W8gELQVo8N13JkluMfmn2sQ"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "LeveNZF4jvgpNCRYae2TupxDfKUjYTyWbNX4LyY0secm2d8BI2yyr3r4PDP08Yc8JPuEt0kDHHA3BCEhW/QhBA"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "2ccnUfE5U9NBUkQjN1gSD0J4cMXDtcADC+d5Bg5n/CA"
        },
        {
          "sha256": "Pq3kW4n2StFnOFtA0PTiXIk0HZF6jwLyR8tMlaRko6I"
        }
      ],
      "valid_until_ts": 1555439072607,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "fOgffsLy4qywa47T0d4Gz1/HQolFKdwhfhbOG7A9o7o01z9d7fHzgfAMpg365irvYCEgP5TpHxFFCO2B2fxzBA"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "B4l6oDCCmZXmF10fNF2NDGeCYxwfVyB1QpH3iygDVKI"
        }
      ],
      "valid_until_ts": 1540369207431,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "ZGWZXZm7Kq31Dttr55ykAeuONt36Pc7RMFP+tCbGgoJo2wvRC7/h0alH+eciMI56TLVBhMwpMpV+cuhCia88DQ"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "CxYXO/4joibOSqqHvd8TL2BgfvWBA7lArjR20pzETfE"
        },
        {
          "sha256": "t68Cp7Frh0koNB81PRwlocTPfLBvvO7z4o5fHl8fISg"
        },
        {
          "sha256": "B4l6oDCCmZXmF10fNF2NDGeCYxwfVyB1QpH3iygDVKI"
        }
      ],
      "valid_until_ts": 1547729839264,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "I2y5PqN+cvSEn+yle0UP0ho0WSpPeXF0XDpCPdbix48VdcFeHF+0TJc9RvJevo+Bv/HdUX0E+LbDybSYpqKxDg"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "MZh6BVtbX/L5pr7sP9XZWvI9725rdjZFBP0qtSRrH3o"
        }
      ],
      "valid_until_ts": 1488935365727,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    },
    {
      "old_verify_keys": {},
      "server_name": "matrix.org",
      "signatures": {
        "matrix.org": {
          "ed25519:auto": "fp2dvwFuIUPnxaLMCZ1Uk6no3VfhrnYkA7Eu/NPy0wcp5R5mqXXyyBw1tMv/BeVhoc71HiO84jBfDfdVeE4zAg"
        }
      },
      "tls_fingerprints": [
        {
          "sha256": "MZh6BVtbX/L5pr7sP9XZWvI9725rdjZFBP0qtSRrH3o"
        }
      ],
      "valid_until_ts": 1480552391084,
      "verify_keys": {
        "ed25519:auto": {
          "key": "Noi6WqcDj0QmPxCNQqgezwTlBKrfqehY1u2FyWP9uYw"
        }
      }
    }
  ]
}

This seems like a complete mess. Clarification needed.

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Clarify how events should be fetched in /backfill requests

If there are multiple prev_events, do we BFS return or DFS return? I'm guessing BFS in the order they appear in prev_events? The very similar endpoint /get_missing_events is a bit clearer:

Retrieves previous events that the sender is missing. This is done by doing a breadth-first walk of the prev_events for the latest_events, ignoring any events in earliest_events and stopping at the limit.

but it still doesn't tell you which prev event order to use in order to be deterministic (e.g lexicographically)

@kegsay
Copy link
Member Author

kegsay commented Apr 28, 2020

Clarify which events can be returned in /backfill requests

Are outliers allowed to be returned? Are disjoint sections of the DAG allowed to be returned? Or is it just literally the prev_events and their prev_events and nothing more? If an event has 3 prev_events and I only have 1 of those, can I return that 1 and its prev_events or am I forced to serve in exactly BFS order?

@matrix-org matrix-org deleted a comment from kegsay Apr 29, 2020
@kegsay
Copy link
Member Author

kegsay commented May 5, 2020

Clarify which limits receiving servers can modify

Both backfill and get_missing_events accept a limit param. The spec does not state that the receiving server is able to modify that and return less than the limit, whilst still having more events, yet that appears to be the case eg https://github.com/matrix-org/synapse/blame/229eb81498b0fe1da81e9b5b333a0285acde9446/synapse/handlers/federation.py#L2166

It's worth noting that at least 1 sytest REQUIRES that a blunt limit=100 is used as the test doesn't implement /state_ids and 100 event is enough to get back to the create event. :/

@neilalexander
Copy link
Contributor

neilalexander commented May 7, 2020

DMs are really light on details in the spec

The CS API spec defines is_direct for direct message rooms, but it really says very little else about what invite_stripped_state should be included with the invite. (It turns out Riot gets upset if the invite_room_state doesn't contain the invitee's membership event and crashes, for example.)

@kegsay
Copy link
Member Author

kegsay commented Jun 1, 2020

Room alias grammar

Specifically, there is none. However, Synapse seems to disallow whitespace cf https://github.com/matrix-org/synapse/blob/6b22921b195c24762cd7c02a8b8fad75791fce70/synapse/handlers/directory.py#L68 - we need to write up a grammar, even if it's just "EVERYTHING IS ALLOWED :DDDD"

@kegsay
Copy link
Member Author

kegsay commented Jun 17, 2020

Clarify when to return errors on federation /send

Specifically we SHOULD NOT do this for 'rejected' events aka ones which:

  • Passes authorization rules based on the event's auth events, otherwise it is rejected.
  • Passes authorization rules based on the state at the event, otherwise it is rejected.

Synapse will return an error for a PDU for:

  • Banned servers
  • Generic exceptions
  • Failed hash checks
  • Failed signature checks
  • Too large / too many prev_events / cannot find prev_events

@kegsay
Copy link
Member Author

kegsay commented Jul 21, 2020

E2E keys: spec should never state 'object' without any extra info

failures | {string: object} | If any remote homeservers could not be reached, they are recorded here. The names of the properties are the names of the unreachable servers. If the homeserver could be reached, but the user or device was unknown, no failure is recorded. Instead, the corresponding user or device is missing from the device_keys result.

What is object? Who knows! According to Synapse:

def _exception_to_failure(e):
    if isinstance(e, SynapseError):
        return {"status": e.code, "errcode": e.errcode, "message": str(e)}

    if isinstance(e, CodeMessageException):
        return {"status": e.code, "message": str(e)}

    if isinstance(e, NotRetryingDestination):
        return {"status": 503, "message": "Not ready for retry"}

    # include ConnectionRefused and other errors
    #
    # Note that some Exceptions (notably twisted's ResponseFailed etc) don't
    # give a string for e.message, which json then fails to serialize.
    return {"status": 503, "message": str(e)}

This clearly has structure and hence should be in the specification.

@kegsay
Copy link
Member Author

kegsay commented Aug 5, 2020

Federated device list updates stream IDs

Should be made clear that this is NOT a monotonically increasing integer (though really it should be!). https://matrix.org/docs/spec/server_server/latest#device-management

@kegsay
Copy link
Member Author

kegsay commented Aug 6, 2020

If a server receives an EDU which refers to a prev_id it does not recognise, it must resynchronise its list by calling the /user/keys/query API and resume the process. The response contains a stream_id which should be used to correlate with subsequent m.device_list_update EDUs.

This is not true. https://matrix.org/docs/spec/server_server/latest#get-matrix-federation-v1-user-devices-userid returns a stream ID, not https://matrix.org/docs/spec/server_server/latest#post-matrix-federation-v1-user-keys-query which is annoying as it doesn't allow batching of requests for multiple users unlike /keys/query :/

@neilalexander
Copy link
Contributor

Handling invalid state events in /send_join response

The spec doesn't say what to do when the /send_join response contains invalid state events, e.g. due to a bad signature. Dendrite currently fails to validate the entire join but perhaps we should just be dropping those affected events?

@kegsay
Copy link
Member Author

kegsay commented Aug 12, 2020

Federation: E2E key APIs are fiddly and not optimal

There are two ways you can fetch a snapshot of a remote user's device keys:

  • /user/keys/query : Request is a bunch of user ID -> Device IDs. This is good as it allows batching. Response includes keys but no stream_id.
  • /user/devices/{userID} : Request is a single user ID, no batching. Response includes a stream_id.

Ideally there would just be 1 endpoint which allowed for batching AND returned a stream_id for entries, but there isn't. Because of this:

  • Server developers have to write additional endpoints.
  • Server developers have to satisfy sytest which sometimes awaits on one endpoint and sometimes the other.
  • Server developers have to know which endpoint will be called in which situation, and it's not clear when this is. /user/devices/{userID} has all the data but is inefficient.

The actual cost to me personally is that I've wasted a day trying to make sytests pass and I'm at the point where I'm failing to get this all to work because some tests expect certain outbound calls.

Proposal:

  • Add stream_id to the response for /user/keys/query.
  • Deprecate /users/devices/{userID}.

@kegsay
Copy link
Member Author

kegsay commented Aug 2, 2021

Cross-Signing

Redundant base 64 key data - see matrix-org/dendrite#1953 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants