MSC2228: Self destructing events #2228

Open
wants to merge 15 commits into
base: old_master
from
187 changes: 187 additions & 0 deletions proposals/2228-self-destructing-events.md
@@ -0,0 +1,187 @@
# Proposal for self-destructing messages


My opinion on this:

  1. One reason one might want this to exist is to hide every message in a room automatically once it has been read by all participants. Wouldn't it make more sense to have self-destruct settings in the room's state instead of on each event?
  2. Another reason is that it frees up the timeline so you don't get distracted by old messages and can focus on the current conversation. This would make more sense client-side though.
  3. If one wants this feature because of performance, another valid option would be to add a setting to disable search indexing for a room. Performance won't improve by redacting messages, because the old events are still around and there will be another redaction event for every message.

Member

Our use case requires only a subset of messages to expire; hence per-event, rather than per-room.


I think both approaches have their merits. How about the following: the timers are still per message, but their default values can be configured in the room's state. When a client sends a "normal" message, it will use the room's default. When the user sends a message through some special "timed message" UI, it may use a custom timer.

With this, one could have some general data cleanup on a large time scale (e.g. four weeks of backlog) but keep the possibility of sending short-lived images etc.

This need not be addressed in this MSC though, as it can easily be added afterwards if desired.


I'm just an interested member of the public, but @piegamesde's approach sounds good to me and would meet my needs. It's similar to what Keybase does.


It's useful for users to be able to send sensitive messages within a
conversation which should be removed after the target user(s) has read them.
Contributor

Suggested change:
- conversation which should be removed after the target user(s) has read them.
+ conversation which should be removed after the target users have read them.

This can be achieved today by the sender redacting the message after the
recipient(s) have read it, but this is a tedious manual process. This
proposal provides a way of automating this process.

Originally [MSC1763](https://github.com/matrix-org/matrix-doc/pull/1763)
attempted to solve this by applying retention limits on a per-message basis
and purging expired messages from the server; in practice this approach is
flawed because purging messages fragments the DAG, breaking back-pagination
and potentially causing performance problems. Also, the ability to set an
expiration timestamp relative to the send (rather than read) time is not of
obvious value. Therefore the concept of self-destructing messages was
split out into this independent proposal.

## Proposal

Users can specify that a message should self-destruct by adding one or more of
Contributor

How do you define messages here? Events of type m.room.message? If not, the MSC should probably state that this proposal doesn't apply to state events.

the following fields to any event's content:

`m.self_destruct`:
Contributor

What is this duration relative to? The original message or the last read receipt? I think since this is moving to per-user we need to update to be relative to the read receipt of each user.

the duration in milliseconds after which the participating servers should
redact this event on behalf of the sender, after seeing an explicit read
receipt delivered for the message from all users in the room. Must be null
or an integer in range [0, 2<sup>53</sup>-1]. If absent, or null, this
behaviour does not take effect.

`m.self_destruct_after`:
the timestamp in milliseconds since the epoch after which participating
servers should redact this event on behalf of the sender. Must be null
or an integer in range [0, 2<sup>53</sup>-1]. If absent, or null, this
behaviour does not take effect.
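
As a rough illustration of the two field definitions above, the following sketch validates the fields on an event's content. The function name `parse_self_destruct_field` and the sample content are hypothetical and not part of the proposal; only the field names, the null-or-integer rule and the range come from the text.

```python
MAX_VALUE = 2**53 - 1  # upper bound stated in the proposal text


def parse_self_destruct_field(content, key):
    """Return the integer stored under `key`, or None if the behaviour is off.

    Raises ValueError when the value is neither null nor an in-range integer.
    """
    value = content.get(key)
    if value is None:
        # Absent or null: the behaviour does not take effect.
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{key} must be null or an integer")
    if not 0 <= value <= MAX_VALUE:
        raise ValueError(f"{key} must be in range [0, 2^53-1]")
    return value


# Hypothetical message content: destruct 30 seconds after being read.
example_content = {
    "msgtype": "m.text",
    "body": "the door code is 1234",
    "m.self_destruct": 30_000,
    "m.self_destruct_after": None,
}
```

Here `parse_self_destruct_field(example_content, "m.self_destruct")` yields the duration, while the null `m.self_destruct_after` is treated the same as an absent field.
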
Comment on lines +23 to +34
Contributor

@kevincox Mar 25, 2021

These names aren't very clear to me. Also, why call it "self_destruct" when it issues a redaction? We should keep the terminology consistent. How about redact_after and redact_after_read?


Clients and servers MUST send explicit read receipts per-message for
self-destructing messages (rather than for the most recently read message,
as is the normal operation), so that messages can be destructed as requested.
Comment on lines +36 to +38
Contributor

This stipulation is a bit puzzling to me, because we make the point of trying to paper over lack of support from clients by synthesising redactions.

However, if we require clients to acknowledge each individual self-destructing event, we have already lost backwards compatibility, so there'd be no point in synthesising redactions.

(on a tangent, I'm not sure why this mentions servers sending read receipts — when do servers send read receipts anyway?)

In my opinion we should remove this stipulation and aim for backwards compat (i.e. allow the read receipt to be for the most recently read message, as per usual).


The `m.self_destruct` fields are not preserved over redaction (and
self-destructing messages may be redacted to speed up the self-destruct
process if desired).

The `m.self_destruct` fields must be ignored on `m.redaction` events, given it
should be impossible to revert a redaction.

E2E encrypted messages must store the `m.self_destruct` fields outside of the
encrypted contents of the message, given the server needs to be able to act on
it.
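
One way to picture the E2E requirement above is a helper that separates the self-destruct fields from the payload before encryption, so they can be attached to the cleartext part of the encrypted event. This is only a sketch: `split_for_encryption` is a hypothetical name, the placeholder strings stand in for real ciphertext and session identifiers, and stripping the fields from the encrypted payload is an assumption the proposal does not spell out.

```python
SELF_DESTRUCT_KEYS = ("m.self_destruct", "m.self_destruct_after")


def split_for_encryption(content):
    """Split content into (payload to encrypt, cleartext self-destruct fields)."""
    cleartext = {
        key: content[key]
        for key in SELF_DESTRUCT_KEYS
        if content.get(key) is not None
    }
    payload = {k: v for k, v in content.items() if k not in SELF_DESTRUCT_KEYS}
    return payload, cleartext


payload, cleartext = split_for_encryption(
    {"msgtype": "m.text", "body": "secret", "m.self_destruct": 5_000}
)

# Content of the outer encrypted event; the server can read the
# self-destruct field here without being able to decrypt the payload.
encrypted_content = {
    "algorithm": "m.megolm.v1.aes-sha2",
    "ciphertext": "<opaque ciphertext of payload>",  # placeholder
    "session_id": "<opaque session id>",             # placeholder
    **cleartext,
}
```
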

Senders may edit the `m.self_destruct` fields in order to retrospectively
change the intended lifetime of a message. Each new `m.replaces` event should
Contributor

MSC1849 mentions m.replace, not m.replaces.

be considered to replace the self-destruction information (if any) on the
original, and restart the destruction timer. On destruction, the original
event (and all `m.replaces` variants of it) should be redacted.
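
The edit rule above could be sketched as follows. The helper names are hypothetical, and two assumptions are made that the proposal does not state: edits are supplied oldest first, and an edit carries its self-destruct fields at the top level of its content. An edit with no such fields is read here as removing the self-destruct behaviour, which is one possible interpretation.

```python
SELF_DESTRUCT_KEYS = ("m.self_destruct", "m.self_destruct_after")


def self_destruct_info(content):
    """Extract the non-null self-destruct fields from an event's content."""
    return {
        key: content[key]
        for key in SELF_DESTRUCT_KEYS
        if content.get(key) is not None
    }


def effective_self_destruct(original, edits):
    """Return the self-destruct info currently in force for a message.

    `edits` is assumed to be ordered oldest to newest; the newest edit
    replaces whatever the original (or earlier edits) specified.
    """
    latest = edits[-1] if edits else original
    return self_destruct_info(latest["content"])


def events_to_redact(original, edits):
    """On destruction, the original and every edit of it are redacted."""
    return [original["event_id"]] + [edit["event_id"] for edit in edits]
```

A caller would also restart the destruction timer whenever a new edit arrives, since the proposal says each replacement restarts it.
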

## Server-side behaviour

When a client sends a message with `m.self_destruct` information, the servers
participating in a room should start monitoring the room for read receipts for
the event in question.
Contributor

Should each homeserver redact the message once all of its users have read the message? (The sender would implicitly have read the message upon sending.) Right now it seems that m.self_destruct_after is effectively required if you actually want servers to delete the message at some point. It would be nice if there was an option to only use the read-based destruction.


Once a given server has received a read receipt for this message from a member
in the room (other than the sender), then the message's self-destruct timer
should be started for that user. Once the timer is complete, the server
should redact the event from that member's perspective, and send the user a
Contributor

sending this synthetic redaction will mean we have to generate an event ID for it. That part is easy, but what happens if the client then tries to use that fake event's ID for anything such as:

  • replying to it
  • /_matrix/client/v3/rooms/{roomId}/context/{eventId}
  • /rooms/{roomId}/event/{eventId}
  • /_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}
  • /rooms/{roomId}/relations/{eventId} (and friends)
  • POST /_matrix/client/v3/rooms/{roomId}/receipt/{receiptType}/{eventId}
  • perhaps some others I missed, but these are all I can see for now

Most of these may be fairly easy to handle but it's still worth noting as extra implementation work.

The idea of synthetic events may still make good sense, but we should be aware of what we're signing up for — it's not quite as simple as just jamming something in /sync. On the Synapse side, after seeing this I would be tempted to implement a generic synthetic event mechanism in case we re-use this idea in the future — do we expect we may do?

synthetic `m.redaction` event in the room to the reader's clients on behalf of
Contributor

It is not clear to me whether a "synthetic redaction event" (or a synthetic event in general) is a new concept introduced by this MSC, and whether it differs from a standard redaction event other than having an additional m.synthetic: true flag.


Suggested change
synthetic `m.redaction` event in the room to the reader's clients on behalf of
synthetic `m.room.redaction` event in the room to the reader's clients on behalf of

I can't find any reference for m.redaction in the spec, do you mean m.room.redaction?

Member Author

yes. and yes, synthetic redactions is a new concept introduced in this MSC - i.e. an event which is faked by the server on behalf of the client.

the sender.
Comment on lines +63 to +68
Member

@neilisfragile says

I'm confused by the expected behaviour of m.self_destruct.

In the section on server side behaviour, it seems like the server should redact the event on a per user basis based on when it receives a read receipt from that user. So the server ends up performing multiple synthetic redactions for a given event.

Once a given server has received a read receipt for this message from a member in the room (other than the sender), then the message's self-destruct timer should be started for that user. Once the timer is complete, the server should redact the event from that member's perspective, and send the user a synthetic m.redaction event in the room to the reader's clients on behalf of the sender.

but in the Proposal section it says

m.self_destruct: the duration in milliseconds after which the participating servers should redact this event on behalf of the sender, after seeing an explicit read receipt delivered for the message from all users in the room.

Which is something quite different, in that if a given user never sees the message or their client does not send a read receipt the event will never be synthetically redacted.

The former seems preferable to me, thinking in terms of how this sort of feature works in other services. Separately it will be much harder to implement the latter in a robust manner.

@ara4n what was your intention?

...

After irl chat we agreed that expiring messages from the client's perspective is the right way forward. We could even consider not sending the synthetic redaction and trust the client to handle it, though my sense is that we should.

So from the previous comment we will implement this:-

Once a given server has received a read receipt for this message from a member in the room (other than the sender), then the message's self-destruct timer should be started for that user. Once the timer is complete, the server should redact the event from that member's perspective, and send the user a synthetic m.redaction event in the room to the reader's clients on behalf of the sender.


The synthetic redaction event should contain an `m.synthetic: true` flag on
the redaction's content to show the client that it is synthetic and used for
implementing self-destruction rather than actually sent from the claimed
client.
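
A minimal sketch of the per-reader timer and the synthetic redaction described in this section might look like the following. `SelfDestructTracker` and `make_synthetic_redaction` are hypothetical names, and the event shape is simplified (no event ID, timestamps or signatures). The type `m.room.redaction` is used because the review thread confirms that is what the proposal's `m.redaction` refers to.

```python
class SelfDestructTracker:
    """Tracks per-reader destruction timers for one self-destructing event."""

    def __init__(self, event_id, sender, duration_ms):
        self.event_id = event_id
        self.sender = sender
        self.duration_ms = duration_ms
        self.due_ts = {}   # user_id -> time (ms) at which to redact for that user
        self.done = set()  # users who have already been sent a synthetic redaction

    def on_read_receipt(self, user_id, receipt_ts_ms):
        """Start the timer for a reader; the sender's own receipts are ignored."""
        if user_id == self.sender or user_id in self.done:
            return
        # Only the first receipt from a user starts that user's timer.
        self.due_ts.setdefault(user_id, receipt_ts_ms + self.duration_ms)

    def pop_due(self, now_ms):
        """Return users whose timer has completed, marking them as handled."""
        ready = sorted(u for u, ts in self.due_ts.items() if ts <= now_ms)
        for user_id in ready:
            del self.due_ts[user_id]
            self.done.add(user_id)
        return ready


def make_synthetic_redaction(room_id, target_event_id, sender):
    """Build the fake redaction delivered only to one reader's clients."""
    return {
        "type": "m.room.redaction",
        "room_id": room_id,
        "sender": sender,  # on behalf of the original sender
        "redacts": target_event_id,
        "content": {"m.synthetic": True},  # flag defined by this proposal
    }
```

For each user returned by `pop_due`, the server would redact the event from that user's perspective and deliver the result of `make_synthetic_redaction` to that user's clients only.
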

For `m.self_destruct_after`, the server should redact the event and send a
synthetic redaction once the server's localtime overtakes the timestamp given
by `m.self_destruct_after`. The server should only perform the redaction once.
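
The `m.self_destruct_after` rule can be sketched as a check a server might run periodically. `expire_if_due` and the `redacted` marker on the event dict are hypothetical; `redact` stands for whatever callback actually performs the redaction and sends the synthetic event.

```python
def expire_if_due(event, now_ms, redact):
    """Redact `event` once local time has overtaken its deadline.

    Returns True only on the call that performs the redaction.
    """
    deadline = event["content"].get("m.self_destruct_after")
    if deadline is None or event.get("redacted"):
        return False
    if now_ms <= deadline:
        # Local time has not yet overtaken the requested timestamp.
        return False
    redact(event["event_id"])
    event["redacted"] = True  # guards against redacting more than once
    return True
```

Calling this repeatedly is safe: after the first successful call, the `redacted` marker makes later calls no-ops.
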

## Client-side behaviour

Clients should display self-destructing events in a clearly distinguished
manner in the timeline. Details of the lifespan can be shown on demand,
although a visible countdown is recommended.

Clients should locally remove `m.self_destruct` events as if they have been
redacted N milliseconds after first attempting to send the read receipt for the
message in question. The synthetic redaction event sent by the local server
then acts as a fallback for clients which fail to implement special UI for
self-destructing messages.

Clients should locally remove `m.self_destruct_after` events when the local
timestamp exceeds the timestamp indicated in the `m.self_destruct_after`
field.
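
On the client side, the two removal rules above could be combined into a single deadline calculation, roughly as below. `local_removal_ts` is a hypothetical helper; taking the earlier deadline when both fields are present is an assumption, since the proposal does not say how the two interact.

```python
def local_removal_ts(content, receipt_attempt_ts_ms):
    """Return the local time (ms) at which to hide the event, or None.

    `receipt_attempt_ts_ms` is when the client first tried to send the read
    receipt, or None if it has not tried yet.
    """
    candidates = []
    duration = content.get("m.self_destruct")
    if duration is not None and receipt_attempt_ts_ms is not None:
        # N milliseconds after first attempting to send the read receipt.
        candidates.append(receipt_attempt_ts_ms + duration)
    deadline = content.get("m.self_destruct_after")
    if deadline is not None:
        candidates.append(deadline)
    return min(candidates) if candidates else None
```

A client could use the returned value both to schedule the local removal and to drive the visible countdown recommended earlier in this section.
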

Member Author

FIXME: Not only should clients delete the events, but also the megolm session for that event (implying that they should be sent in their own megolm session) - otherwise an attacker who grabs the megolm keys off the endpoint can still decrypt them off the server.

Clients should warn the sender that self-destruction is based entirely on good
faith, and other servers and clients cannot be guaranteed to uphold it.
Typical text could be:

"Warning: recipients may not honour disappearing messages".

We recommend exposing the feature in UX as "disappearing messages" rather than
"self-destructing messages", as self-destruction implies a reliability and
permanence that the feature does not in practice provide.

Other possible user-friendly wording might include:
* Ephemeral messages (which feels even less reliably destructive than
'disappearing', but may be too obscure a word for a wide audience)
* Vanishing messages (which implies they reliably vanish)
* Vanishable messages (which is clunky, but more implies that it's an intent
rather than a guarantee)
* Evanescent messages (which is too obscure, but implies that it's a message
which /can/ vanish, rather than that it /will/)
* Fleeting messages (which is again quite an obscure word)
* Transient messages (too techie)
* Temporary messages (could work)
* Short-term messages

## Threat model

Any proposals around coordinating deletion of data (e.g. this,
[MSC1763](https://github.com/matrix-org/matrix-doc/issues/1763),
[MSC2278](https://github.com/matrix-org/matrix-doc/issues/2278)) are sensitive
because there is of course no way to stop a malicious server or client or user
ignoring the deletion and somehow retaining the data. (We consider any attempt
at trying to use DRM to do so as futile).

This MSC is intended to:

* Give a strong commitment to users on trusted clients and servers that
message data will not be persisted on Matrix servers and clients beyond the
requested timeframe. This is useful for legal purposes in rooms which span
only trusted (e.g. private federated) servers, to enforce data retention
behaviour.

* Give a weak commitment to users on untrusted clients and servers (e.g.
arbitrary users on the public Matrix network) that message data may not be
persisted beyond the requested timeframe. This should be adequate for
unimportant disappearing messages (e.g. a casual fleeting message which is so
unimportant or of such shortlived relevance that it is not worthy of being put
in the timeline). It is **not** adequate for attempting to ensure that
sensitive content is deleted after reading for legal or security purposes.

* Help server admins manage diskspace by letting users dictate retention
lifetime per message.

## Tradeoffs

We could purge rather than redact destructed messages from the DB, but that
would fragment the DAG so we don't do that.
Member Author

It's worth noting that P2P Matrix is driving work to support fragmented DAGs, and it may not be so unreasonable to just delete the message outright once that work lands. However, I wouldn't create an artificial dependency on that at this point; instead we might just need another MSC in future to make self-destructing events simply delete rather than self-redact.


We could have the sending server send an explicit redaction event on behalf of
the sender rather than synthesise a redaction on the various participating
servers. However, this would clog up the DAG with a redundant event, and also
introduce unreliability if the sending server is unavailable or delayed. It
would also result in all users redacting the message at the same time. Therefore
synthetic per-user redaction events (which are only for backwards
compatibility anyway) feel like the lesser evil.

## Issues
Contributor

What about compatibility? What if remote servers don't implement this protocol? Should the sender (or the sender's homeserver if the sender can verify that it supports self-destruction) send a "regular" redaction event (likely specially marked) so that self-destruction should work as long as redaction is supported? (IIUC redaction support is a dependency of this MSC so should be strictly more common.)

It would be good to have provisions for this or include it in "Security considerations"


We should probably ignore missing read receipts from bots when deciding
whether to self-destruct. This is blocked on having a good way to identify
bots. [MSC1206](https://github.com/matrix-org/matrix-doc/pull/1206) provides
one possible way, as does
[MSC2199](https://github.com/matrix-org/matrix-doc/pull/2199) (important v.
unimportant users in immutable DMs).

The behaviour for rooms with more than 2 participants ends up being a bit
strange. The client (and server) starts the expiry countdown on the message as
soon as the participant has read it. This means that someone can look over
the shoulder of another user to see the content again. This is probably a
feature rather than a bug(?)

## Security considerations

There's scope for abuse where users can send obnoxious self-destructing messages
into a room.

One solution for this could be for server implementations to implement a
quarantine mode which initially marks redacted events as quarantined for N days
before deleting them entirely, allowing server admins to address abuse concerns.
This is of course true for redactions in general.

## Conclusion

This provides a simple and pragmatic way of automating the process of manually
redacting sensitive messages once the recipients have read them.