MSC2265: Proposal for mandating case folding when processing e-mail address localparts #2265

Merged: 10 commits merged into master on Jun 7, 2020

@babolivier babolivier commented Aug 30, 2019

@babolivier babolivier added proposal-in-review proposal A matrix spec change proposal labels Aug 30, 2019
@babolivier babolivier changed the title Proposal for mandating lowercasing when processing e-mail address localparts MSC2265: Proposal for mandating lowercasing when processing e-mail address localparts Aug 30, 2019
@turt2live turt2live self-requested a review August 30, 2019 14:21
Member

@turt2live turt2live left a comment

seems sane to me

This proposal suggests changing the specification of the e-mail 3PID type in
[the Matrix spec appendices](https://matrix.org/docs/spec/appendices#pid-types)
to mandate that any e-mail address must be entirely converted to lowercase
before any processing, instead of only its domain.
Member

I wonder how much of a complication it would be to mandate lower-case processing (such as lookup and hashing) while still storing addresses in their original case.

Member

we ultimately can't tell implementations how to store their data (if they want to spend the extra time converting things to uppercase they can), but the requirement for lookups being lowercase is a fairly strong argument imo
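
As a rough illustration of the split discussed above (normalised lookups, case-preserving storage), here is a minimal sketch. `ThreepidStore` and its methods are hypothetical names invented for this example and are not part of the proposal or of any Matrix implementation. It assumes the whole address is folded with `str.casefold()`, in line with the case folding named in the PR title.

```python
from typing import Dict, Optional, Tuple


class ThreepidStore:
    """Hypothetical in-memory 3PID store (illustrative only, not from the spec)."""

    def __init__(self) -> None:
        # Maps the folded address to (address as originally supplied, Matrix user ID).
        self._bindings: Dict[str, Tuple[str, str]] = {}

    @staticmethod
    def lookup_key(address: str) -> str:
        # Assumption: the whole address is folded, localpart and domain alike.
        return address.casefold()

    def bind(self, address: str, user_id: str) -> None:
        # The key is normalised; the value keeps the address exactly as supplied.
        self._bindings[self.lookup_key(address)] = (address, user_id)

    def lookup(self, address: str) -> Optional[str]:
        entry = self._bindings.get(self.lookup_key(address))
        return entry[1] if entry else None

    def stored_form(self, address: str) -> Optional[str]:
        entry = self._bindings.get(self.lookup_key(address))
        return entry[0] if entry else None


store = ThreepidStore()
store.bind("Alice.Smith@Example.COM", "@alice:example.org")
print(store.lookup("alice.smith@example.com"))       # @alice:example.org
print(store.stored_form("ALICE.SMITH@EXAMPLE.COM"))  # Alice.Smith@Example.COM
```

Only the key is normalised here, so an implementation stays free to store and display the address however it likes.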

proposals/2265-email-lowercase.md: six resolved review threads (four of them marked outdated)
@anoadragon453
Member

Seems like general consensus overall.

@mscbot fcp merge

@mscbot
Collaborator

mscbot commented Sep 3, 2019

Team member @anoadragon453 has proposed to merge this. The next step is review by the rest of the tagged people:

No concerns currently listed.

Once at least 75% of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@mscbot mscbot added proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. disposition-merge labels Sep 3, 2019
@richvdh richvdh self-requested a review September 10, 2019 10:56
Member

@richvdh richvdh left a comment

So some questions on this of the sort that arise whenever case mapping comes up:

Are you sure that lower-casing, as opposed to casefolding, is the right thing to do? Examples of the difference:

  • ß (German lower-case sharp 's' or Eszett, upper-case equivalent 'SS') case-folds to 'ss', so that 'hans.voß' matches 'HANS.VOSS'. (On the other hand: it's not entirely obvious that they should be treated the same)
  • ς (Greek lower-case final sigma, used at the end of a word) case-folds to 'σ' (regular lower-case sigma), so that 'ΣΊΣΥΦΟΣ' matches 'σίσυφος'.
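
The difference between the two operations can be sketched in Python, whose built-in `str.lower()` and `str.casefold()` correspond to lower-casing and Unicode case folding. The helper names and the `example.com` addresses are made up for illustration. The Greek example uses explicit code points and compares the two lower-case sigma forms directly, so it does not depend on how the characters were typed.

```python
def same_when_lowercased(a: str, b: str) -> bool:
    """Compare two addresses after plain lower-casing."""
    return a.lower() == b.lower()


def same_when_casefolded(a: str, b: str) -> bool:
    """Compare two addresses after Unicode case folding."""
    return a.casefold() == b.casefold()


# German sharp s: lower-casing keeps the two spellings apart, case folding merges them.
print(same_when_lowercased("hans.vo\u00df@example.com", "HANS.VOSS@EXAMPLE.COM"))  # False
print(same_when_casefolded("hans.vo\u00df@example.com", "HANS.VOSS@EXAMPLE.COM"))  # True

# Greek sigma: U+03C2 is the final form, U+03C3 the regular form.
final_sigma = "\u03bf\u03b4\u03bf\u03c2@example.com"
regular_sigma = "\u03bf\u03b4\u03bf\u03c3@example.com"
print(same_when_lowercased(final_sigma, regular_sigma))  # False
print(same_when_casefolded(final_sigma, regular_sigma))  # True
```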

Relatedly: should we consider unicode normalisation, so that (for example) 'ê' (U+00EA, e with circumflex) is treated the same as 'ê' (U+0065 U+0302, e followed by circumflex combining character)?
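
For concreteness, a sketch of what combining normalisation with case folding could look like, following the canonical caseless matching recipe NFD(casefold(NFD(x))) described in the Unicode standard and the Python Unicode HOWTO. Whether Matrix should apply normalisation at all is exactly the open question here, so this is an illustration and not a proposed rule. `canonical_caseless_key` is a hypothetical helper name.

```python
import unicodedata


def canonical_caseless_key(address: str) -> str:
    """Hypothetical helper: decompose, case-fold, then decompose again."""
    decomposed = unicodedata.normalize("NFD", address)
    return unicodedata.normalize("NFD", decomposed.casefold())


precomposed = "\u00ea"   # e with circumflex as a single code point
combining = "e\u0302"    # e followed by the combining circumflex

print(precomposed == combining)                        # False: different code points
print(precomposed.casefold() == combining.casefold())  # False: folding alone does not help
print(canonical_caseless_key(precomposed) == canonical_caseless_key(combining))  # True
```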

Neither of the above solve the 'French problem' where (traditionally) accents are omitted on upper-case characters, so 'COTE' should be equivalent to 'côté'...
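
To see why neither step covers this case, the sketch below compares the two spellings after case folding and normalisation, and then after additionally dropping combining marks. The accent-stripping helper is purely illustrative and hypothetical: nothing in this thread proposes it, and the reply that follows reports that Gmail and Hotmail treat accented and unaccented localparts as distinct.

```python
import unicodedata


def fold_and_normalise(text: str) -> str:
    """Case-fold and normalise (NFD), keeping accents."""
    return unicodedata.normalize("NFD", unicodedata.normalize("NFD", text).casefold())


def strip_accents(text: str) -> str:
    """Illustrative only: drop combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


accented = "c\u00f4t\u00e9"  # 'cote' with a circumflex on the o and an acute on the e
unaccented_upper = "COTE"

# Case folding plus normalisation still keeps the two spellings apart.
print(fold_and_normalise(accented) == fold_and_normalise(unaccented_upper))  # False

# Only an extra accent-stripping step would make them equal.
print(strip_accents(fold_and_normalise(accented)) == fold_and_normalise(unaccented_upper))  # True
```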

@babolivier
Contributor Author

babolivier commented Sep 10, 2019

> Relatedly: should we consider unicode normalisation, so that (for example) 'ê' (U+00EA, e with circumflex) is treated the same as 'ê' (U+0065 U+0302, e followed by circumflex combining character)?

I guess it depends on whether common email providers treat both characters as the same. I'll do some investigation around that.

> Neither of the above solve the 'French problem' where (traditionally) accents are omitted on upper-case characters, so 'COTE' should be equivalent to 'côté'...

Should it, though, keeping in mind we're only looking at email addresses here? I just checked on both Gmail and Hotmail, and neither of them considers bréndan.abolivier@... to be the same as brendan.abolivier@..., and I'm not aware of any provider that does.

Otherwise, yes, casefold is probably the way to go. I'll update the proposal to reflect that.

@richvdh
Member

richvdh commented Sep 10, 2019

tbh I wasn't aware that gmail let you use non-ascii localparts at all. Certainly being guided by the behaviour of common providers seems like a sensible idea. (also: sorry for not starting a thread.)

@babolivier
Contributor Author

> tbh I wasn't aware that gmail let you use non-ascii localparts at all

Neither was I. For full context, I've tried Thunderbird (+OVH), Roundcube (+OVH), Hotmail and Gmail and only this last one accepted a non-ascii localpart in the recipient's address.

@babolivier babolivier changed the title MSC2265: Proposal for mandating lowercasing when processing e-mail address localparts MSC2265: Proposal for mandating case folding when processing e-mail address localparts Nov 13, 2019
@mscbot
Collaborator

mscbot commented Jun 2, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔

@mscbot mscbot added final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. and removed proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. labels Jun 2, 2020
@mscbot
Collaborator

mscbot commented Jun 7, 2020

The final comment period, with a disposition to merge, as per the review above, is now complete.

@mscbot mscbot added finished-final-comment-period and removed disposition-merge final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. labels Jun 7, 2020
@turt2live turt2live merged commit 34f2d48 into master Jun 7, 2020
@turt2live turt2live added spec-pr-missing Proposal has been implemented and is being used in the wild but hasn't yet been added to the spec and removed finished-final-comment-period labels May 1, 2021
@turt2live turt2live self-assigned this May 1, 2021
turt2live added a commit that referenced this pull request May 2, 2021
@turt2live turt2live added spec-pr-in-review A proposal which has been PR'd against the spec and is in review and removed spec-pr-missing Proposal has been implemented and is being used in the wild but hasn't yet been added to the spec labels May 2, 2021
@turt2live
Member

Merged 🎉

@turt2live turt2live added merged A proposal whose PR has merged into the spec! and removed spec-pr-in-review A proposal which has been PR'd against the spec and is in review labels May 3, 2021
richvdh pushed a commit that referenced this pull request Aug 23, 2021
richvdh pushed a commit that referenced this pull request Aug 27, 2021
Labels
  • kind:maintenance: MSC which clarifies/updates existing spec
  • merged: A proposal whose PR has merged into the spec!
  • proposal: A matrix spec change proposal