MSC2265: Proposal for mandating case folding when processing e-mail address localparts #2265
seems sane to me
proposals/2265-email-lowercase.md
This proposal suggests changing the specification of the e-mail 3PID type in
[the Matrix spec appendices](https://matrix.org/docs/spec/appendices#pid-types)
to mandate that any e-mail address must be entirely converted to lowercase
before any processing, instead of only its domain.
I wonder how much of a complication it would be to mandate lower-case processing (such as lookup and hashing) while still storing addresses with their case preserved.
we ultimately can't tell implementations how to store their data (if they want to spend the extra time converting things to uppercase they can), but the requirement for lookups being lowercase is a fairly strong argument imo
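To make the storage-versus-lookup distinction concrete, here is a minimal sketch of a store that keeps the address exactly as the user typed it while keying lookups on a normalised form. The `ThreepidStore` class and its method names are hypothetical and not taken from any Matrix implementation. The key uses `str.casefold()`, in line with the proposal's title; `str.lower()` would slot in the same way.

```python
from typing import Dict, Optional, Tuple


class ThreepidStore:
    """Hypothetical in-memory store for e-mail 3PID bindings.

    The address is stored as entered, but every lookup goes through
    a case-folded key, so the stored casing never affects matching.
    """

    def __init__(self) -> None:
        # Maps case-folded address -> (address as entered, Matrix user ID).
        self._bindings: Dict[str, Tuple[str, str]] = {}

    @staticmethod
    def _key(address: str) -> str:
        # Assumption: case folding is the only normalisation applied here.
        return address.casefold()

    def bind(self, address: str, mxid: str) -> None:
        self._bindings[self._key(address)] = (address, mxid)

    def lookup(self, address: str) -> Optional[Tuple[str, str]]:
        return self._bindings.get(self._key(address))


store = ThreepidStore()
store.bind("Alice@Example.com", "@alice:example.org")
print(store.lookup("ALICE@example.COM"))
```

The lookup finds the binding regardless of the casing used in the query, and the original `Alice@Example.com` is still available for display.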
Co-Authored-By: Andrew Morgan <[email protected]>
Seems like general consensus overall. @mscbot fcp merge
Team member @anoadragon453 has proposed to merge this. The next step is review by the rest of the tagged people: No concerns currently listed. Once at least 75% of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
So some questions on this of the sort that arise whenever case mapping comes up:
Are you sure that lower-casing, as opposed to casefolding, is the right thing to do? Examples of the difference:
- ß (German lower-case sharp 's', or eszett; upper-case equivalent 'SS') case-folds to 'ss', so that 'hans.voß' matches 'HANS.VOSS'. (On the other hand, it's not entirely obvious that they should be treated the same.)
- ς (greek lower-case sigma, when used at the end of the word) case-folds to 'σ' (regular lower-case sigma), so that 'ΣΊΣΥΦΟΣ' matches 'σίσυφος'.
Relatedly: should we consider unicode normalisation, so that (for example) 'ê' (U+00EA, e with circumflex) is treated the same as 'ê' (U+0065 U+0302, e followed by circumflex combining character)?
Neither of the above solve the 'French problem' where (traditionally) accents are omitted on upper-case characters, so 'COTE' should be equivalent to 'côté'...
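For reference, the three cases above can be checked with Python's standard library. This is only an illustrative sketch: `canonicalise_email` is a hypothetical helper name, and applying NFC normalisation before case folding is an assumption made for the example, not something the proposal specifies.

```python
import unicodedata


def canonicalise_email(address: str) -> str:
    """Hypothetical helper: NFC-normalise, then case-fold the whole address."""
    return unicodedata.normalize("NFC", address).casefold()


# Lower-casing leaves 'ß' alone; case folding maps it to 'ss'.
assert "hans.voß".lower() == "hans.voß"
assert "hans.voß".casefold() == "HANS.VOSS".casefold() == "hans.voss"

# Final sigma 'ς' folds to regular 'σ', so the two spellings match.
assert canonicalise_email("ΣΊΣΥΦΟΣ") == canonicalise_email("σίσυφος")

# Precomposed U+00EA and 'e' + U+0302 differ until normalised.
precomposed = "\u00ea"
combining = "e\u0302"
assert precomposed != combining
assert canonicalise_email(precomposed) == canonicalise_email(combining)

# The 'French problem' is untouched: accents are not stripped.
assert canonicalise_email("COTE") != canonicalise_email("côté")
```

Neither case folding nor normalisation removes accents, so matching 'COTE' against 'côté' would need a separate (and lossy) accent-stripping step.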
I guess it depends on whether common email providers treat both characters as the same. I'll do some investigation around that.
Should it, though, keeping in mind we're only looking at email addresses here? I just checked on both Gmail and Hotmail and neither of them consider Otherwise, yes, casefold is probably the way to go, I'll update the proposal to reflect that.
tbh I wasn't aware that gmail let you use non-ascii localparts at all. Certainly being guided by the behaviour of common providers seems like a sensible idea. (also: sorry for not starting a thread.)
Neither was I. For full context, I've tried Thunderbird (+OVH), Roundcube (+OVH), Hotmail and Gmail and only this last one accepted a non-ascii localpart in the recipient's address.
🔔 This is now entering its final comment period, as per the review above. 🔔
The final comment period, with a disposition to merge, as per the review above, is now complete.
Merged 🎉