Attaching data to messages #93

kegsay · 2015-10-12T14:37:41Z

The discussion on this has progressed and this document is now superseded by this Google Doc

This is a straw-man proposal detailing how data can be attached to m.room.message events to provide a richer experience for knowledgeable clients. I expect this will change a lot, but it is being published now to get some feedback.

Rendered: https://github.com/matrix-org/matrix-doc/blob/data-messages/drafts/m-room-message-data.rst

illicitonion · 2015-10-12T14:52:58Z

drafts/m-room-message-data.rst

+  type: "m.room.message",
+  content: {
+    msgtype: "m.text",
+    body: "[matrix-org/matrix-ios-sdk] manuroe pushed 4 commits to develop",


I'm not sure this is expressive enough to be generally useful. For sure I can see uses, but I think restricting to one entity for the message is limiting.

Slack's API I think has this down pretty well; anything which has more context will be in a <> tag, and the context is on the left of a pipe, e.g.

"Just FYI, <@U1234|manuroe> pushed some commits, yo"

would turn manuroe into a link to the user with ID @U1234, but if your client doesn't understand, it would just print:

"Just FYI, manuroe pushed some commits, yo"
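As a rough illustration of that fallback behaviour, here is a minimal Python sketch. It assumes spans always take the form `<context|label>` with no nesting or escaping; the function names are made up for this example and are not part of Slack's or Matrix's APIs.

```python
import re
from typing import List, Tuple

# Matches spans such as <@U1234|manuroe>: the context sits left of the pipe,
# the human-readable label right of it. Nesting and escaping are not handled.
SPAN = re.compile(r"<([^|<>]+)\|([^<>]*)>")


def plain_fallback(text: str) -> str:
    """What a client that ignores the markup would print: labels only."""
    return SPAN.sub(lambda m: m.group(2), text)


def extract_spans(text: str) -> List[Tuple[str, str]]:
    """What a richer client would act on: (context, label) pairs."""
    return [(m.group(1), m.group(2)) for m in SPAN.finditer(text)]
```

A client that understands the markup would use `extract_spans` to build links, while any other client can still show the output of `plain_fallback`.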

I could imagine having a list of entities in the data or context key, and being able to refer to them by index, e.g.

content: {
  body: "Just FYI, manuroe pushed 4 commits to develop",
  interpolated_body: "Just FYI, <#1|manuroe> pushed 4 commits to <#2|develop>",
  data: {
    entities: [
      {
        "domain": "github.com",
        "user": "manuroe",
        "link": "https://github.com/manuroe",
      },
      {
        "domain": "github.com",
        "repo": "matrix-org/matrix-ios-sdk",
        "branch": "develop",
        "link": "https://github.com/matrix-org/matrix-ios-sdk/tree/develop",
      },
    ]
  }
}
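One way a client might resolve those indexed references is sketched below in Python. It assumes the indices are 1-based (as `<#1|...>` and `<#2|...>` in the example suggest), that `interpolated_body` is optional, and that out-of-range references degrade to plain text; the `segments` helper is hypothetical.

```python
import re

# Matches references such as <#1|manuroe>: entity index, then display label.
REF = re.compile(r"<#(\d+)\|([^<>]*)>")


def segments(content):
    """Split a message into (text, entity_or_None) pairs for rendering."""
    entities = content.get("data", {}).get("entities", [])
    text = content.get("interpolated_body")
    if text is None:
        # No markup available: the whole body is one plain segment.
        return [(content["body"], None)]
    out = []
    pos = 0
    for m in REF.finditer(text):
        if m.start() > pos:
            out.append((text[pos:m.start()], None))
        index = int(m.group(1)) - 1  # assumption: references are 1-based
        entity = entities[index] if 0 <= index < len(entities) else None
        out.append((m.group(2), entity))
        pos = m.end()
    if pos < len(text):
        out.append((text[pos:], None))
    return out
```

Joining the text of every segment gives back the plain `body`, so a client that does not know about entities loses nothing.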

But I think restricting this to a single top-level context is slightly limiting.

I am not sure how I feel about by default having something which is always interpretable in body, or requiring a separate interpolated_body or something...

kegsay · 2015-10-12

I like your idea.

leonerd · 2015-10-12

+1
Allowing multiple entities and allowing them to associate with specific parts of the message looks very nice - e.g. a Jenkins failure report relating to a git commit would probably have three or four different linkable entities.

ara4n · 2015-10-13T00:02:30Z

I'm a bit confused by this one. Isn't it more idiomatic to just send a custom event type, but with a human-readable body field (and mandate that clients should always display the 'body' param of events they don't recognise)? So the original example here would be something more like:

type: "org.matrix.git.commits",
content: {
  body: "[matrix-org/matrix-ios-sdk] manuroe pushed 4 commits to develop",
  "entity": "manuroe",
  "commits": ["fe34764", "4cdd8ae", "528da705", "56bfc717"],
  "link": "matrix-org/matrix-ios-sdk@56bfc717",
  "uri": "https://github.com/matrix-org/matrix-ios-sdk.git"
}

Either way, it feels particularly weird that the 'domain' of the context wouldn't be an m.* or org.matrix.* style namespace.

Dan's alternative Slack-inspired suggestion is interesting, in that I guess it's effectively a richer hyperlink representation. Again, I don't see why the "domain" param is "github.com" rather than a com.github.* style namespace for the entity. The only thing that feels a little weird is inventing an entirely new hypertext markup language with <foo|bar> style representation for these links (and cargo-culting random antics from Slack). What's wrong with HTML (a plain hyperlink on manuroe, etc.)?

Otherwise LGTM, although given nothing is screaming out for this I'd rather we prioritised features & bugs on the critical path. I'm also not sure what our policy should really be on landing vapourware spec stuff; I quite like the idea of keeping the spec fairly strictly tracking what we actually have implemented in the reference implementations rather than diverging off into scifi. (This is obviously fine to keep in drafts tho, which is generally scifi HQ! :)

kegsay · 2015-10-13T08:31:52Z

> Isn't it more idiomatic to just send a custom event type, but with a human-readable body field (and mandate that clients should always display the 'body' param of events they don't recognise)?

It is not more idiomatic to do this. Clients everywhere just switch on the event type. If they don't recognise it, they drop it on the floor. m.room.message mandates that you should display content.body if you don't recognise the msgtype. It would be disastrous if you wanted every client to display a body for every event and enforced such a key on the HS because it just doesn't make any sense. For example, I don't do VoIP, so I don't know any m.call.* events, so you want me to display a body for things like candidates? What about read receipts? State events?
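The dispatch behaviour described here can be sketched roughly as follows in Python. This is a simplified illustration, not any real client's code: only `m.text` is treated as a known msgtype, and `org.example.fancy` in the usage note is a made-up msgtype.

```python
def render_event(event):
    """Return the text to show in the timeline, or None to drop the event."""
    # Clients switch on the event type; anything unrecognised is dropped.
    if event.get("type") != "m.room.message":
        return None
    content = event.get("content", {})
    if content.get("msgtype") == "m.text":
        return content.get("body")
    # Unrecognised msgtype on an m.room.message: fall back to content.body.
    return content.get("body")
```

With this shape, `{"type": "m.room.message", "content": {"msgtype": "org.example.fancy", "body": "hi"}}` still renders as "hi", while an `m.call.candidates` event is never shown, whatever keys its content carries.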

> Either way, it feels particularly weird that the 'domain' of the context wouldn't be an m.* or org.matrix.* style namespace.

We do org.matrix.*-style namings to provide namespaces to avoid conflicts between events. The intention with domain is not to provide namespacing but to provide a hint as to what the data relates to, e.g. my client knows how to talk to the GitHub API, so if I see domain: github.com I can potentially use this information to get the owner/repo, GitHub username, etc.

> What's wrong with HTML?

Parsing freeform HTML is at best a pain and extremely error-prone and at worst impossible to do. Dan's list of entities makes parsing much easier.

> nothing is screaming out for this

The moment we want to do anything more complicated than displaying text and images we're going to bump into this problem. Given how quick it was to write, I don't see the problem in asking for feedback from people.

> I'm also not sure what our policy should really be on landing vapourware spec stuff; I quite like the idea of keeping the spec fairly strictly tracking what we actually have implemented in the reference implementations rather than diverging off into scifi.

This is precisely why this is in /drafts :)

ara4n · 2015-10-13T09:39:55Z

> It is not more idiomatic to do this. Clients everywhere just switch on the event type. If they don't recognise it, they drop it on the floor. m.room.message mandates that you should display content.body if you don't recognise the msgtype.

I thought the plan was to support content.body fallback on all events for this purpose.

> It would be disastrous if you wanted every client to display a body for every event and enforced such a key on the HS because it just doesn't make any sense. For example, I don't do VoIP, so I don't know any m.call.* events, so you want me to display a body for things like candidates? What about read receipts? State events?

Obviously you wouldn't put a content.body on events which make no sense to display as text messages in a timeline.

> > Either way, it feels particularly weird that the 'domain' of the context wouldn't be an m.* or org.matrix.* style namespace.

> We do org.matrix.*-style namings to provide namespaces to avoid conflicts between events. The intention with domain is not to provide namespacing but to provide a hint as to what the data relates to, e.g. my client knows how to talk to the GitHub API, so if I see domain: github.com I can potentially use this information to get the owner/repo, GitHub username, etc.

Why wouldn't you namespace these hints?

> > What's wrong with HTML?

> Parsing freeform HTML is at best a pain and extremely error-prone and at worst impossible to do. Dan's list of entities makes parsing much easier.

Fair point.

> > nothing is screaming out for this

> The moment we want to do anything more complicated than displaying text and images we're going to bump into this problem. Given how quick it was to write, I don't see the problem in asking for feedback from people.

Not complaining about it being written at all :) just trying to give review.

> > I'm also not sure what our policy should really be on landing vapourware spec stuff; I quite like the idea of keeping the spec fairly strictly tracking what we actually have implemented in the reference implementations rather than diverging off into scifi.

> This is precisely why this is in /drafts :)

yay!

kegsay · 2015-10-13T10:40:05Z

> I thought the plan was to support content.body fallback on all events for this purpose.

I don't recall anyone suggesting this (just m.room.message events).

> Obviously you wouldn't put a content.body on events which make no sense to display as text messages in a timeline.

I 100% agree. But that is the crux of the problem. A client has no way of knowing that it is a dumb idea to try to display m.call.candidates but that it does make sense to display org.matrix.git.commits. That is why it only makes sense to apply body fallbacks on "things which make sense to be displayed in a timeline", which is m.room.message events.

> Why wouldn't you namespace these hints?

I feel namespacing them as com.github would be acting as a hint to the developer to say "make this unique to your client to avoid clashes" just like they are already doing with event types. We do not want to convey this hint because we want people to be able to do if (domain == "github.com") then ... which obviously doesn't work if everyone has namespaced them to be myapp.com.github or com.github.owner.repo
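A small Python sketch of why the plain equality check matters. It assumes entities live under `content["data"]["entities"]` as in the earlier review comment, which is not a settled layout, and the `github_entities` helper is hypothetical.

```python
def github_entities(content):
    """Pick out the entities a GitHub-aware client can enrich."""
    entities = content.get("data", {}).get("entities", [])
    # A single well-known value lets every client use the same check.
    return [e for e in entities if e.get("domain") == "github.com"]
```

If senders namespaced the value per application (say `myapp.com.github` or `com.github.owner.repo`), none of those entities would pass this check, even though they all describe GitHub data.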

erikjohnston · 2015-10-15T09:46:47Z

> I feel namespacing them as com.github would be acting as a hint to the developer to say "make this unique to your client to avoid clashes" just like they are already doing with event types. We do not want to convey this hint because we want people to be able to do if (domain == "github.com") then ... which obviously doesn't work if everyone has namespaced them to be myapp.com.github or com.github.owner.repo

How do I, as a client, interpret the metadata? Do I have to go if domain == github.com and then know which keys are likely to be there? How do I differentiate between metadata about a commit and metadata about a PR? Do I have to look at what keys are there?

The reason for namespacing event types and msgtypes is to allow clients to know what keys to expect and what they mean. It also ensures that if two different developers try and implement the same thing (e.g. HTML msgtype) they won't both clash and confuse clients. Take your github.com example, what happens if two different developers try to come up with it at the same time, but using different key names?
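To make the clash concrete, here is a hedged Python sketch in which two senders both use domain github.com but pick different key names. The `username` key and the `github_user` helper are invented for illustration.

```python
def github_user(entity):
    """Read the GitHub user from an entity, as one client might expect it."""
    if entity.get("domain") != "github.com":
        return None
    # This client was written against the "user" key only.
    return entity.get("user")


# Two independent developers describing the same thing differently.
from_developer_a = {"domain": "github.com", "user": "manuroe"}
from_developer_b = {"domain": "github.com", "username": "manuroe"}  # hypothetical key
```

Both entities pass the domain check, but the client only finds the user in the first one, and nothing tells it which schema it is looking at.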

How do you expect clients to use the metadata? How will it be displayed? How much does it conflict with different msgtypes?

eternaleye · 2015-11-07T01:44:57Z

One thing I wonder is whether some variety of RDF - be it JSON-LD or something else - might be better for metadata. In particular, the namespacing concern falls out quite naturally by way of subjects, objects, and predicates being, well, URIs (or blank nodes, but whatever).

kegsay · 2016-06-07T13:37:01Z

The discussion on this has progressed and this document is now superseded by https://docs.google.com/document/d/1l08DL_F_CHo1pIORXzcqWLlZQpztogMFUTYmxd4Gr5s

eternaleye · 2016-06-07T15:51:48Z

That document seems to cover the "formatting" side of this ticket, but not the "auxiliary semantic data" side. Should this be reopened, a new ticket opened for that alone, ...?

kegsay · 2016-06-10T10:55:36Z

@eternaleye : It is covered via:

> We add a data key to m.room.message’s content to store arbitrary metadata about a displayable event, to clearly separate it from the rest of the keys in content.

We do not specify standard keys which will be present in content.data yet if that's what you mean. Without being clear on what set of interop we want on the data side, I feel it won't progress any more than this existing issue.
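For illustration, a message carrying such a data key could be built like this in Python. Since no standard keys are specified, the keys used inside `data` in the usage note are placeholders, and `make_message` is a hypothetical helper rather than an SDK function.

```python
def make_message(body, data=None, msgtype="m.text"):
    """Build an m.room.message event with optional metadata under content.data."""
    content = {"msgtype": msgtype, "body": body}
    if data is not None:
        # Arbitrary metadata, kept apart from the other keys in content.
        content["data"] = data
    return {"type": "m.room.message", "content": content}
```

For example, `make_message("manuroe pushed 4 commits to develop", data={"domain": "github.com"})` yields an event whose `body` still reads fine in clients that ignore `content.data`.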

kegsay added 2 commits October 12, 2015 10:02

WIP data on messages

e866871

Add motivation; more rationale

5d3da13

kegsay added addition labels Oct 12, 2015

kegsay assigned illicitonion Oct 12, 2015

Update m-room-message-data.rst

d473cee

illicitonion reviewed Oct 12, 2015

ara4n mentioned this pull request Oct 13, 2015

Proposal for human ID rules. #3

Merged

kegsay mentioned this pull request Oct 15, 2015

Straw-man HTML messages proposal #92

Closed

illicitonion assigned kegsay and unassigned illicitonion Oct 19, 2015

aviraldg mentioned this pull request Mar 31, 2016

Emoji are very platform dependent currently element-hq/element-web#1031

Closed

kegsay closed this Jun 7, 2016

ara4n mentioned this pull request Nov 11, 2017

Creating a new workflow for spec work is confusing and counterproductive. eventually-matrix/eventually-doc#6

Open

ara4n mentioned this pull request May 14, 2018

Extensible event types & fallback in Matrix #1225

Closed

Half-Shot mentioned this pull request Mar 1, 2022

Slack Event API Support v2 matrix-org/matrix-appservice-slack#89

Merged

blaggacao mentioned this pull request Mar 1, 2022

[RFC 0094] Use Matrix for Official Chat NixOS/rfcs#94

Merged
