Grammar and disambiguation of display names (SPEC-392) #177

Open
matrixbot opened this issue Apr 19, 2016 · 14 comments
Labels
feature Suggestion for a significant extension which needs considerable consideration

Comments

@matrixbot
Member

Both users and rooms can have display names. I think we can apply the same rules to both users and rooms, so I am covering both under this issue.

The emphasis here is on human-readability - display names are not really meant for machine interpretation (modulo highlighting of mentions). Display names must therefore support the full gamut of Unicode: non-BMP characters, zalgo, RTL scripts, etc.

However, things are complicated by the desire to disambiguate users. Currently, if we have two "Matthew"s in a room, we disambiguate them by also showing the user ID. So the question is: how do we determine whether we need to disambiguate users, and can we design the grammar of display names to make this easier? For instance: do we allow empty display names, or those containing only whitespace? Do we allow names to start or end with whitespace, or to contain sequences of whitespace characters? Do we mandate a certain sort of Unicode normalisation?
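To make these questions concrete, here is a minimal sketch of one possible comparison rule. It assumes NFC as the normalisation form and treats any run of whitespace as a single space; both choices are assumptions for illustration, not anything the spec mandates, and the helper names are hypothetical.

```python
import unicodedata


def canonical_display_name(name: str) -> str:
    """Return a comparison form of a display name (hypothetical helper)."""
    # Assumption: NFC is the chosen normalisation form.
    normalised = unicodedata.normalize("NFC", name)
    # str.split() with no arguments splits on runs of whitespace and drops
    # leading and trailing whitespace, so joining collapses each run to one space.
    return " ".join(normalised.split())


def needs_disambiguation(a: str, b: str) -> bool:
    """Two names clash if their comparison forms are equal."""
    return canonical_display_name(a) == canonical_display_name(b)


def is_effectively_empty(name: str) -> bool:
    """True for empty names and names made only of whitespace."""
    return canonical_display_name(name) == ""
```

Under these assumptions, `needs_disambiguation("Matthew", " Matthew\n\t\t")` is true, as is a comparison between the composed and decomposed spellings of "Café".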

(Imported from https://matrix.org/jira/browse/SPEC-392)

(Reported by @richvdh)

@matrixbot
Member Author

Jira watchers: @richvdh

@matrixbot
Member Author

matrixbot commented Apr 19, 2016

Links exported from Jira:

relates to https://github.com/matrix-org/matrix-doc/issues/538
relates to SPEC-1

@matrixbot
Member Author

matrixbot commented Apr 19, 2016

See also https://github.com/matrix-org/matrix-doc/issues/538, which discusses what we can do once we figure out that two displaynames are ambiguous.

-- @richvdh

@matrixbot
Member Author

Should there be a length limit? Would it apply to unicode codepoints, normalised unicode codepoints, or composed characters?
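As a small illustration of why the unit matters, the sketch below measures the same string several ways using only the Python standard library. The function name is hypothetical. Counting composed characters (grapheme clusters) is not covered, since the standard library has no grapheme segmentation.

```python
import unicodedata
from typing import Dict


def length_measures(name: str) -> Dict[str, int]:
    """Measure a display name in several candidate units (hypothetical helper)."""
    return {
        # Raw Unicode code points, as stored.
        "code_points": len(name),
        # Code points after composing (NFC) and decomposing (NFD).
        "nfc_code_points": len(unicodedata.normalize("NFC", name)),
        "nfd_code_points": len(unicodedata.normalize("NFD", name)),
        # Size on the wire if encoded as UTF-8.
        "utf8_bytes": len(name.encode("utf-8")),
    }


# "e" followed by COMBINING ACUTE ACCENT renders as a single character.
print(length_measures("e\u0301"))
```

For that input the raw and NFD counts are 2, the NFC count is 1, and the UTF-8 size is 3 bytes, so a limit would accept or reject the same visible name depending on which unit is chosen.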

-- @richvdh

@matrixbot
Member Author

"Modulo highlighting of mentions" - this should probably be done via Unicode-spec-compliant collation, so as to avoid problems regarding composed/decomposed/etc, but that opens a whole new can of worms due to collation being locale-dependent :(

-- Alex Elsayed

@matrixbot
Member Author

matrixbot commented Apr 19, 2016

Also, porting over a comment from SPEC-1:

A similar ticket for display names may need to be opened separately, but today in #matrix:matrix.org it was discovered that display names permit whitespace that they probably shouldn't (a trailing \n\t\t on a display name caused some confusion).

Some whitespace cannot be blocked - the zero-width joiner, in particular, is needed to construct the letterforms of some languages that Unicode does not support sufficiently. In addition, spaces are widely used, and blocking those would likely cause widespread breakage. However, it is entirely possible that all other whitespace can be banned without detrimental effect.

I stand by this - the vast majority of whitespace is not meaningful in what is essentially a "line-like" value, and has empirically caused real confusion when accidentally inserted.
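A rough sketch of such a rule, assuming that only U+0020 SPACE and U+200D ZERO WIDTH JOINER are exempt and that control characters (which include \n and \t) are banned along with the Unicode separator categories. The allow-list, the category choice and the function name are all assumptions for illustration.

```python
import unicodedata
from typing import List

# Assumption: these are the only whitespace-like characters to keep.
ALLOWED = {"\u0020", "\u200d"}  # SPACE, ZERO WIDTH JOINER

# Zs/Zl/Zp are space, line and paragraph separators; Cc is control
# characters, which covers newline and tab.
BANNED_CATEGORIES = {"Zs", "Zl", "Zp", "Cc"}


def disallowed_characters(name: str) -> List[str]:
    """Return the characters in a display name that the rule would reject."""
    bad = []
    for ch in name:
        if ch in ALLOWED:
            continue
        if ch.isspace() or unicodedata.category(ch) in BANNED_CATEGORIES:
            bad.append(ch)
    return bad
```

With this check, a name ending in \n\t\t is reported with those three characters, while "Alex Elsayed" and a name containing a zero-width joiner pass untouched.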

-- Alex Elsayed

@matrixbot
Member Author

I'm reluctant to base the requirements for this bug around mentions - as you have pointed out, it is thorny due to being locale-dependent. I would rather we had a richer interface for mentions (I quite like the way Jira does it, actually), but that is a topic for a separate bug.

(a trailing \n\t\t on a display name caused some confusion)

[~eternaleye]: Can you enlarge on this? Was it simply that someone was not getting notifications when they might have expected them, because the name didn't match?

-- @richvdh

@matrixbot
Member Author

Nothing to do with mentions - more that, in essentially any case where a display name is presented, vertical whitespace is entirely nonsensical (and in the existing clients, that case was mistaken for a bug in either rendering or markdown formatting for quite a while before it was figured out).

In essence, vertical whitespace makes sense only in documents - and the same is true of most horizontal whitespace, such as tab characters. As a display name is essentially a line-oriented context that gets embedded in a document-like context, allowing them is the visual equivalent of breaking encapsulation.

It consistently confuses the humans reading it, not the computers matching against it.

-- Alex Elsayed

@matrixbot matrixbot changed the title Grammar and disambiguation of display names Grammar and disambiguation of display names (SPEC-392) Oct 31, 2016
@matrixbot matrixbot added the feature Suggestion for a significant extension which needs considerable consideration label Nov 7, 2016
@eternaleye

The same issue I reported in https://github.com/matrix-org/matrix-doc/issues/669#issuecomment-256942216 and https://github.com/matrix-org/matrix-doc/issues/669#issuecomment-256942218 seems to also affect the displayed names of rooms - this was brought to my attention today by there being a large number of newlines in the name of #offtopic:matrix.org, which caused it to display as very tall indeed in my sidebar on Quaternion 😛

@ara4n
Member

ara4n commented Jan 11, 2018

Lots of discussion of this (which almost got lost in the mists of time) over at: matrix-org/matrix-spec-proposals#3 (comment)

@dkasak
Member

dkasak commented Mar 7, 2018

There is another problem with the current disambiguation rules for display names which I consider rather serious from a security perspective.

If I understood the spec (11.2.2.3) right, it basically amounts to using the display name if it exists and using the raw user ID if not, unless there is a conflict between display names, in which case both display names should be disambiguated using the raw user ID.
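The rule as paraphrased above can be sketched as follows. This is only an illustration of that paraphrase, not the spec's text: the "name (user ID)" output format, the function name and the example user IDs are all assumptions.

```python
from collections import Counter
from typing import Dict, Optional


def rendered_names(members: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Map each user ID to the name a client would show (hypothetical helper)."""
    # Count how many members share each non-empty display name.
    counts = Counter(name for name in members.values() if name)
    shown = {}
    for user_id, name in members.items():
        if not name:
            # No display name: fall back to the raw user ID.
            shown[user_id] = user_id
        elif counts[name] > 1:
            # Conflict: disambiguate with the raw user ID (assumed format).
            shown[user_id] = f"{name} ({user_id})"
        else:
            shown[user_id] = name
    return shown


room = {
    "@matthew:example.com": "Matthew",
    "@matthew:example.org": "Matthew",
    "@alice:example.com": None,
    "@bob:example.com": "Bob",
}
print(rendered_names(room))
```

Note that the comparison here is exact string equality on display names only; a display name is never compared against other members' user IDs.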

However, consider the following case:

  • @user1:example.com has display name Foo
  • @user2:example.com has display name @user1:example.com

At first glance, it is natural to assume that user @user1:example.com has no display name set and hence gets its raw user ID displayed, while in reality this is @user2:example.com pretending to be @user1:example.com. The spec says nothing about this highly confusing case and it is currently allowed by at least the web and Android versions of Riot. Furthermore, as I was experimenting with setting my display name to a friend's raw user ID, I actually got myself confused for a moment in some situations, even though I knew what I was doing, because the problem is so pervasive. For instance, in Riot, there is no visual difference between kicking him (who has no display name set) out of the room and kicking myself out (with the display name set to impersonate him). In fact, Riot displays the same name for both of us in the member list (!). Also, log excerpts I received via email were addressed to his raw user ID (i.e. my display name).

I propose that the server should reject attempts to set display names containing valid Matrix user IDs. It seems that allowing them in display names has little overall benefit, but is a great potential source of confusion.
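A minimal sketch of such a server-side check. The pattern is a deliberately simplified approximation of what a user ID looks like (`@localpart:domain` with an optional port), not the specification's grammar, and the function name is hypothetical.

```python
import re

# Assumption: a simplified user ID shape, NOT the full grammar from the spec.
USER_ID_PATTERN = re.compile(
    r"@[a-z0-9._=/+-]+:[a-z0-9.-]+(?::[0-9]+)?",
    re.IGNORECASE,
)


def contains_user_id(display_name: str) -> bool:
    """True if the display name contains something shaped like a user ID."""
    return USER_ID_PATTERN.search(display_name) is not None


print(contains_user_id("@user1:example.com"))  # True
print(contains_user_id("Foo"))                 # False
```

A check like this only catches literal user IDs; lookalike characters would slip past it unless the name is folded to a canonical form first.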

@turt2live
Member

Note: In development versions of Riot this is no longer an issue due to matrix-org/matrix-js-sdk#588 (although imho it can be overzealous: element-hq/element-web#5914 )

Example (@voyager:t2bot.io is a bot of mine in the room):
image

@dkasak
Member

dkasak commented Mar 8, 2018

@turt2live Yeah, I found that after I had already posted this. However, the disambiguation still looks rather ambiguous to me, particularly if I try to view it from the perspective of someone new to Matrix or a more casual computer user.

Having a better visual distinction than it just being parenthesized text would help a bit, but then there's the problem of each client having to find its own solution to this problem, which isn't an easy task at all. It seems to me there should be some ground truth (which is what mxids are) which should be guarded strictly, hence why I suggested not allowing mxids in display names server-side.

@turt2live turt2live self-assigned this Sep 6, 2018
@turt2live turt2live removed their assignment Sep 14, 2018
@richvdh
Member

richvdh commented Jan 29, 2020

Interesting to note that riot-web has worked around much of this with the use of unhomoglyph. We need to decide whether to mandate the same for other clients.
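For readers unfamiliar with the idea, here is a toy sketch of homoglyph folding in general. It does not reproduce the unhomoglyph library's API or data: the four-entry table is hand-written for illustration, and the function name is hypothetical.

```python
# Assumption: a tiny hand-written table standing in for real confusables data.
TOY_CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}


def fold_homoglyphs(name: str) -> str:
    """Replace each lookalike character with its assumed Latin prototype."""
    return "".join(TOY_CONFUSABLES.get(ch, ch) for ch in name)


# "Matthew" spelled with Cyrillic "а" and "е" folds to the plain Latin form,
# so the two names can be detected as clashing.
print(fold_homoglyphs("M\u0430tth\u0435w") == fold_homoglyphs("Matthew"))
```

Names would be compared on their folded form only; the original string would still be what gets displayed.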

@richvdh richvdh transferred this issue from matrix-org/matrix-spec-proposals Mar 1, 2022
@turt2live turt2live mentioned this issue Mar 1, 2022
12 tasks