MSC2228: Self destructing events #2228

Open
wants to merge 15 commits into
base: old_master
115 changes: 115 additions & 0 deletions proposals/2228-self-destructing-events.md
@@ -0,0 +1,115 @@
# Proposal for self-destructing messages

My opinion on this:

  1. One reason one might want this to exist is to hide every message in a room automatically once it has been read by all participants. Wouldn't it make more sense to have self-destruct settings in the room's state instead of on each event?
  2. Another reason is that it frees up the timeline so you don't get distracted by old messages and can focus on the current conversation. This would make more sense client-side though.
  3. If one wants this feature for performance reasons, another valid option would be to disable search indexing for a room. Redacting messages won't improve performance, because the old events are still around and there will be an additional redaction event for every message.

Member

Our use case requires only a subset of messages to expire; hence per-event rather than per-room.

I think both approaches have their merits. How about the following: the timers are still per message, but their default values can be configured in the room's state. When a client sends a "normal" message, it will use the room's default. When the user sends a message through some special "timed message" UI, it may use a custom timer.

With this, one could have some general data cleanup on a large time scale (e.g. four weeks of backlog) but keep the possibility of sending short-lived images etc.

This need not be addressed in this MSC though, as it can easily be added afterwards if desired.

I'm just an interested member of the public, but @piegamesde's approach sounds good to me and would meet my needs. It's similar to what Keybase does.


It's useful for users to be able to send sensitive messages within a
conversation which should be removed after the target user(s) has read them.
Contributor

Suggested change
conversation which should be removed after the target user(s) has read them.
conversation which should be removed after the target users have read them.

This can be achieved today by the sender redacting the message after the
recipient(s) have read it, but this is a tedious manual process. This
proposal provides a way of automating this process.

Originally [MSC1763](https://github.com/matrix-org/matrix-doc/pull/1763)
attempted to solve this by applying retention limits on a per-message basis
and purging expired messages from the server; in practice this approach is
flawed because purging messages fragments the DAG, breaking back-pagination
and potentially causing performance problems. Also, the ability to set an
expiration timestamp relative to the send (rather than read) time is not of
obvious value. Therefore the concept of self-destructing messages was
split out into this independent proposal.

## Proposal

Users can specify that a message should self-destruct by adding the following
field to any event's content:

`m.self_destruct`:
Contributor

What is this duration relative to? The original message or the last read receipt? I think since this is moving to per-user we need to update to be relative to the read receipt of each user.

the duration in milliseconds after which the participating servers should
redact this event on behalf of the sender, after seeing an explicit read
receipt delivered for the message from all users in the room. Must be null
or an integer in range [0, 2<sup>63</sup>-1]. If absent, or null, this
behaviour does not take effect.
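
As a rough illustration of the field described above, the following Python sketch reads `m.self_destruct` from an event's content and checks it against the stated range. The function name, the example message and the choice to raise `ValueError` on invalid values are assumptions made for illustration; the proposal does not say how invalid values should be handled.

```python
from typing import Any, Optional

MAX_SELF_DESTRUCT_MS = 2**63 - 1  # upper bound stated in the proposal


def self_destruct_duration(content: dict[str, Any]) -> Optional[int]:
    """Return the self-destruct duration in ms, or None if the behaviour is off.

    Raising ValueError for invalid values is an assumption of this sketch.
    """
    value = content.get("m.self_destruct")
    if value is None:  # field absent or null: behaviour does not take effect
        return None
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("m.self_destruct must be null or an integer")
    if not 0 <= value <= MAX_SELF_DESTRUCT_MS:
        raise ValueError("m.self_destruct is out of range")
    return value


# Hypothetical message content asking for redaction 30 s after being read
example_content = {
    "msgtype": "m.text",
    "body": "the door code is 4711",
    "m.self_destruct": 30_000,
}
```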

Clients and servers MUST send explicit read receipts per-message for
self-destructing messages (rather than for the most recently read message,
as is the normal operation), so that messages can be destructed as requested.
Comment on lines +36 to +38
Contributor

This stipulation is a bit puzzling to me, because we make the point of trying to paper over lack of support from clients by synthesising redactions.

However, if we require clients to acknowledge each individual self-destructing event, we have already lost backwards compatibility, so there'd be no point in synthesising redactions.

(on a tangent, I'm not sure why this mentions servers sending read receipts — when do servers send read receipts anyway?)

In my opinion we should remove this stipulation and aim for backwards compat (i.e. allow the read receipt to be for the most recently read message, as per usual).


The `m.self_destruct` field is not preserved over redaction (and
self-destructing messages may be redacted to speed up the self-destruct
process if desired).

E2E encrypted messages must store the `m.self_destruct` field outside of the
encrypted contents of the message, given the server needs to be able to act on
it.
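
A minimal sketch of that layout, assuming a Megolm-encrypted event: the timer is copied out of the plaintext content into the cleartext content of the encrypted event so that the server can read it. The helper name `wrap_encrypted` is hypothetical, and the other fields a real `m.room.encrypted` event carries (session and device identifiers) are omitted for brevity.

```python
from typing import Any


def wrap_encrypted(ciphertext: str, plaintext_content: dict[str, Any]) -> dict[str, Any]:
    """Build the cleartext content of an encrypted event (illustrative only)."""
    outer: dict[str, Any] = {
        "algorithm": "m.megolm.v1.aes-sha2",
        "ciphertext": ciphertext,  # opaque to the server
    }
    timer = plaintext_content.get("m.self_destruct")
    if timer is not None:
        # Stored outside the ciphertext so the server can act on it
        outer["m.self_destruct"] = timer
    return outer
```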

## Server-side behaviour

When a client sends a message with `m.self_destruct` set, the servers
participating in a room should start monitoring the room for read receipts for
the event in question.
Contributor

Should each homeserver redact the message once all of its users have read the message? (The sender would implicitly have read the message upon sending.) Right now it seems that m.self_destruct_after is effectively required if you actually want servers to delete the message at some point. It would be nice if there was an option to only use the read-based destruction.


Once a given server has received a read receipt for this message from a member
in the room (other than the sender), then the message's self-destruct timer
should be started for that user. Once the timer is complete, the server
should redact the event from that member's perspective, and send the user a
Contributor

sending this synthetic redaction will mean we have to generate an event ID for it. That part is easy, but what happens if the client then tries to use that fake event's ID for anything such as:

  • replying to it
  • /_matrix/client/v3/rooms/{roomId}/context/{eventId}
  • /rooms/{roomId}/event/{eventId}
  • /_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}
  • /rooms/{roomId}/relations/{eventId} (and friends)
  • POST /_matrix/client/v3/rooms/{roomId}/receipt/{receiptType}/{eventId}
  • perhaps some others I missed, but these are all I can see for now

Most of these may be fairly easy to handle but it's still worth noting as extra implementation work.

The idea of synthetic events may still make good sense, but we should be aware of what we're signing up for — it's not quite as simple as just jamming something in /sync. On the Synapse side, after seeing this I would be tempted to implement a generic synthetic event mechanism in case we re-use this idea in the future — do we expect we may do?

synthetic `m.redaction` event in the room to the reader's clients on behalf of
Contributor

It is not clear to me whether a "synthetic redaction event" (or a synthetic event in general) is a new concept introduced by this MSC, and whether it differs from a standard redaction event other than having an additional m.synthetic: true flag.

Suggested change
synthetic `m.redaction` event in the room to the reader's clients on behalf of
synthetic `m.room.redaction` event in the room to the reader's clients on behalf of

I can't find any reference for m.redaction in the spec, do you mean m.room.redaction?

Member Author

Yes. And yes, synthetic redactions are a new concept introduced in this MSC, i.e. an event which is faked by the server on behalf of the client.

the sender.
Comment on lines +63 to +68
Member

@neilisfragile says

I'm confused by the expected behaviour of m.self_destruct.

In the section on server side behaviour, it seems like the server should redact the event on a per user basis based on when it receives a read receipt from that user. So the server ends up performing multiple synthetic redactions for a given event.

Once a given server has received a read receipt for this message from a member in the room (other than the sender), then the message's self-destruct timer should be started for that user. Once the timer is complete, the server should redact the event from that member's perspective, and send the user a synthetic m.redaction event in the room to the reader's clients on behalf of the sender.

but in the Proposal section it says

m.self_destruct: the duration in milliseconds after which the participating servers should redact this event on behalf of the sender, after seeing an explicit read receipt delivered for the message from all users in the room.

Which is something quite different, in that if a given user never sees the message or their client does not send a read receipt, the event will never be synthetically redacted.

The former seems preferable to me, thinking in terms of how this sort of feature works in other services. Separately, it will be much harder to implement the latter in a robust manner.

@ara4n what was your intention?

...

After an IRL chat we agreed that expiring messages from the client's perspective is the right way forward. We could even consider not sending the synthetic redaction and trusting the client to handle it, though my sense is that we should send it.

So from the previous comment we will implement this:

Once a given server has received a read receipt for this message from a member in the room (other than the sender), then the message's self-destruct timer should be started for that user. Once the timer is complete, the server should redact the event from that member's perspective, and send the user a synthetic m.redaction event in the room to the reader's clients on behalf of the sender.


The synthetic redaction event should contain some flag to show the client
that it is synthetic and used for implementing self-destruction rather than
actually sent from the claimed client. Perhaps `m.synthetic: true` on the
redaction's contents?
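
The per-user timer described in this section could be sketched roughly as follows. This is not how any homeserver implements it: the class name, the `deliver_to` wrapper, the rule that the first receipt from a user starts the timer, and the placement of `redacts` at the top level of the event are all assumptions of this example. Only the `m.synthetic: true` flag comes from the text above, and even that is floated as a suggestion.

```python
from dataclasses import dataclass, field


@dataclass
class SelfDestructTracker:
    """Tracks per-reader expiry for one self-destructing event (illustrative)."""

    event_id: str
    sender: str
    duration_ms: int
    # user_id -> timestamp (ms) at which the event expires for that user
    deadlines: dict[str, int] = field(default_factory=dict)

    def on_read_receipt(self, user_id: str, receipt_ts_ms: int) -> None:
        if user_id == self.sender:
            return  # receipts from the sender do not start a timer
        # Assumption: only the first receipt from a user starts the timer
        self.deadlines.setdefault(user_id, receipt_ts_ms + self.duration_ms)

    def due(self, now_ms: int) -> list[dict]:
        """Remove expired readers and return one synthetic redaction for each."""
        expired = sorted(u for u, t in self.deadlines.items() if t <= now_ms)
        redactions = []
        for user_id in expired:
            del self.deadlines[user_id]
            redactions.append({
                "deliver_to": user_id,  # only this reader's clients see it
                "event": {
                    "type": "m.room.redaction",
                    "sender": self.sender,  # sent on behalf of the sender
                    "redacts": self.event_id,
                    "content": {"m.synthetic": True},
                },
            })
        return redactions
```

A server loop would call `on_read_receipt` as receipts arrive and poll `due` with the current time, delivering each returned event only to the named reader rather than adding it to the room DAG.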

## Client-side behaviour

Clients should display self-destructing events in a clearly distinguished
manner in the timeline. Details of the lifespan can be shown on demand,
although a visible countdown is recommended.

Clients should locally remove self-destructing events as if they have been
redacted N milliseconds after first attempting to send the read receipt for the
message in question. The synthetic redaction event sent by the local server
then acts as a fallback for clients which fail to implement special UI for
self-destructing messages.
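
The client-side rule above reduces to simple arithmetic. A minimal sketch, with hypothetical function names and all timestamps taken as milliseconds on the client's local clock:

```python
def local_removal_time_ms(first_receipt_attempt_ms: int, self_destruct_ms: int) -> int:
    """Timestamp at which the client should treat the event as redacted."""
    return first_receipt_attempt_ms + self_destruct_ms


def countdown_ms(now_ms: int, first_receipt_attempt_ms: int, self_destruct_ms: int) -> int:
    """Milliseconds left to show in a visible countdown, never negative."""
    deadline = local_removal_time_ms(first_receipt_attempt_ms, self_destruct_ms)
    return max(0, deadline - now_ms)
```

Because the timer starts at the first attempt to send the receipt, a retried or delayed receipt does not extend the message's lifetime on the client.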

## Tradeoffs

We could purge rather than redact destructed messages from the DB, but that
would fragment the DAG so we don't do that.
Member Author

It's worth noting that P2P Matrix is driving work to support fragmented DAGs, and it may not be so unreasonable to just delete the message outright once that work lands. However, I wouldn't create an artificial dependency on that at this point; instead we might just need another MSC in future to make self-destructing events simply delete rather than self-redact.


We could have the sending server send an explicit redaction event on behalf of
the sender rather than synthesise a redaction on the various participating
servers. However, this would clog up the DAG with a redundant event, and also
introduce unreliability if the sending server is unavailable or delayed. It
would also result in all users redacting the message at the same time. Therefore
synthetic per-user redaction events (which are only for backwards
compatibility anyway) feel like the lesser evil.

We could let the user specify an expiry time for messages relative to when
they were sent rather than when they were read. However, I can't think of a
good enough use case to justify complicating the proposal with that feature.
Member

TBH, having the expiry based on the send time rather than the receive time seems more natural to me, as it seems more predictable.

Member Author

it may be more predictable for the sender, but we should surely consider the UX from the receiver's perspective, where it will just be frustrating to rush to read messages relative to when the sender sent them rather than when the receiver read them. Imagine how annoying it would be to receive a msg with a 5s timeout and discover that 4s of synapse latency meant you only had 1s to open the push, launch the app, sync and read it before your client gleefully deletes it...

That said, there may be a fairly obscure use case here around time-limited promotions - where a user sends a message to a room saying "You have 30 mins from now to download my exclusive artwork!!!" and then posts a link which autodestructs 30 mins after sending.

So perhaps we do want to include this after all.

@alangecker Jun 12, 2022

@ara4n Sounds to me like there is a use case that you may not be aware of: 🙂

The history of a chat should be deleted automatically after x weeks, because it is known that the messages in the chat will no longer be relevant and should not be kept forever on all devices, in case an account is compromised years later. In this case, it is also desirable that the messages are deleted and do not remain stored, for example, on inactive accounts which haven't sent any read receipts.

In my circles it is very common practice to set chats in Signal to 4 weeks, to more or less guarantee that no unnecessary chat history is stored on all devices :)

We can extend if/when that use case emerges.

## Issues
Contributor

What about compatibility? What if remote servers don't implement this protocol? Should the sender (or the sender's homeserver if the sender can verify that it supports self-destruction) send a "regular" redaction event (likely specially marked) so that self-destruction should work as long as redaction is supported? (IIUC redaction support is a dependency of this MSC so should be strictly more common.)

It would be good to have provisions for this or include it in "Security considerations"


We should probably ignore missing read receipts from bots when deciding
whether to self-destruct. This is blocked on having a good way to identify
bots.

The behaviour for rooms with more than 2 participants ends up being a bit
strange. The client (and server) starts the expiry countdown on the message as
soon as the participant has read it. This means that someone can look over
the shoulder of another user to see the content again. This is probably a
feature rather than a bug(?)

## Security considerations

There's scope for abuse where users can send obnoxious self-destructing messages
into a room.

One solution for this could be for server implementations to implement a
quarantine mode which initially marks redacted events as quarantined for N days
before deleting them entirely, allowing server admins to address abuse concerns.
This is of course true for redactions in general.

## Conclusion

This provides a simple and pragmatic way of automating the process of manually
redacting sensitive messages once the recipients have read them.