Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TryParse for email addresses #907

Closed
Grauenwolf opened this issue Nov 16, 2017 · 47 comments · Fixed by #1052
Closed

TryParse for email addresses #907

Grauenwolf opened this issue Nov 16, 2017 · 47 comments · Fixed by #1052
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Net
Milestone

Comments

@Grauenwolf
Copy link

Currently we have to write this code:

        public static bool IsEmailAddress(this string value)
        {
            try
            {
                new System.Net.Mail.MailAddress(value);
                return true;
            }
            catch ()
            {
                return false;
            }
        }

MailAddress should have a TryParse method like Int32.TryParse.

Rationale and Usage

Throwing and catching exceptions can have a significant impact on performance. So when possible there should be a non-throwing alternative for methods that commonly fail. An example would be Int32.Parse and Int32.TryParse.

There is also an impact on debugger. If you have the IDE set to "Break on All Exceptions" in order to see where an exception is being thrown, it will also catch this even though it is not interesting.

Proposed API

class MailAddress {
    public static MailAddress Parse(string address);
    public static MailAddress Parse(string address, string displayName);
    public static MailAddress Parse(string address, string displayName, Encoding displayNameEncoding);

    public static bool TryParse(string address, out MailAddress result);
    public static bool TryParse(string address, string displayName, out MailAddress result);
    public static bool TryParse(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);
}

Details

Open Questions

@karelz
Copy link
Member

karelz commented Nov 16, 2017

Can you post full API proposal? (see API review process)

@Grauenwolf
Copy link
Author

How's that look?

@karelz
Copy link
Member

karelz commented Nov 16, 2017

See the link I posted - it has details and also example. The key thing is that API should be thought through various points of views (usage patterns, similar patterns in BCL, etc.)

Ah, I didn't realize you updated top post.

@Grauenwolf
Copy link
Author

My apologies, I see how my last comment was a bit ambiguous.

@euclid47
Copy link

@Grauenwolf Have you tried writing an extension class to accomplish this task?

@ctolkien
Copy link

Alternatively, why not make the MailAddressParser which the MailAddress uses under the covers not internal?

https://github.com/dotnet/corefx/blob/f0ccd9742065ee2aa296448fe3dd38ef6ff77af9/src/Common/src/System/Net/Mail/MailAddressParser.cs#L21

@Grauenwolf
Copy link
Author

As long as it has a TryParse method, I have no preference.

@satano
Copy link
Member

satano commented Nov 16, 2017

@euclid47
Own extension is not enough, because it will still have the issues @Grauenwolf mentioned in his rationale: impact on performance because of thrown exception and troubles in IDE when you use Break on All Exceptions. The other disadvantage of extension is, that you have to have an instance already and this should be static methods as shown in proposed API.

@ctolkien
This probably will not work so easily. My opinion is, that the implementation of TryParse should not throw any exceptions even internally because od performance (of course arguments checking and thus some Argument[Null]Exception is OK).

I agree with @Grauenwolf that TryParse would be a very nice method to have. We also use the try-catch block just to check if string is valid email.

@blowdart
Copy link
Contributor

The only way to discover if an email is valid is to send an email to it and see what happens. Just because it looks like an email address does not mean it's valid. When you TryParse on a number the result indicates that the value parsed is, or is not, a number. TryParse for email addresses cannot do this.

As people make security decisions on email addresses I would not like to see this added.

(As an aside I wish System.Net.Mail would just go away).

@ctolkien
Copy link

ctolkien commented Nov 16, 2017

We also use the try-catch block just to check if string is valid email.

My experience is that the only real way to check if an email is valid is to send an email to it.

Edit: gah beaten.

@Grauenwolf
Copy link
Author

That's like arguing we shouldn't parse someone's date of birth because the only way to know if the DOB is valid is to check the birth certificate.

If using for UI validation, you shouldn't wait for a bounce to inform the user they typed "me#google.com".

Personally I'm not even using it for sending emails. I'm using it to extract the user and host name as separate fields for LDAP lookups. (Which also proves the email address is valid without sending an email.)

@Grauenwolf
Copy link
Author

On another project I need to read the top 10,000 rows from each table and see which non-null columns always, never, or sometimes contain email addresses.

Again, I need an email address parser. But I'm not going to be actually sending out emails.


Finally, how would you use an email address alone for security? Even if you did send a validation email, it could be to a fake address set up for that purpose.

@WiredUK
Copy link

WiredUK commented Nov 17, 2017

You were given an email parser in your Stack Overflow question, so what more do you need really? I'm in agreement with @blowdart here, not keen on adding extra APIs to parse email addresses, especially as the rules for what a valid address has the potential to change, meaning the API would go out of date.

@Grauenwolf
Copy link
Author

I fail to see the logic in your argument. Class MailAddress isn't going away. In the unlikely event that the definition of an email address changes, it will have to be updated anyways.

And no, a Regex is not a substitute for a standards compliant parser.

@euclid47
Copy link

@Grauenwolf For both questions I would ask these on stackoverflow.com due to being off topic. Both questions are important and I have a few ideas to accomplish both tasks.

@satano
Copy link
Member

satano commented Nov 17, 2017

@blowdart, @ctolkien, @WiredUK

When you TryParse on a number the result indicates that the value parsed is, or is not, a number. TryParse for email addresses cannot do this.

It can and it definitelly would. You mix two different things here. One is a check if some string represents an email address, the other is if that email address really exists (or is valid as you say). ˙TryParse˙ is about the first one. The same is with int.TryParse - it just checks if string is a nuber, but does not check if that number is valid in given context (day of birth example is a good one).

The other thing is - if you even send an email to that address, how can you be sure? What if the email server is down? Do you assume in this case that email is invalid?

especially as the rules for what a valid address has the potential to change, meaning the API would go out of date.

It is quite unlikely that the format of email address will change, but if it happens, this API will not change. It still will have the same input and output. What would really change is that regex, because it will need to adapt to new format.

@satano
Copy link
Member

satano commented Nov 17, 2017

To put it in simpler way MailAddress.TryParse methods ar not for checking if email is valid, but for checking if given string is in a format, which represents email address.

@satano
Copy link
Member

satano commented Nov 17, 2017

@euclid47

For both questions I would ask these on stackoverflow.com due to being off topic. Both questions are important and I have a few ideas to accomplish both tasks.

This is not about searching for some workaround, but it is an API proposal for .NET core. So this is the place where to ask/propose. If you have some ideas, why not share them?

@euclid47
Copy link

@satano The two questions about searching through large datasets and using an email address as the only method of security. Both are valid questions but off topic for the original feature request. Those are best suited for stackoverflow.com.

@satano
Copy link
Member

satano commented Nov 17, 2017

@euclid47 OK. Thanks.

@madelson
Copy link
Contributor

I have several projects where this would be useful (today I use the try-catch hack).

In my mind this would be very similar to the Uri.TryCreate method which already exists. The idea is to validate the structure, which is then exposed via MailAddress properties like Host.

Like with Uri, this of course cannot validate whether the address points to an actual email account.

@jnm2
Copy link
Contributor

jnm2 commented Dec 29, 2017

I'd be okay with this so long as it allows just about any combination of special characters, as valid email addresses do. I'd be hard-pressed to find a reason to go further than checking something like \S\s*@\s*\S (at least one non-whitespace character on each side of the '@'). Even better (thanks @tannergooding):

bool IsValidEmailAddress(string candidate)
{
    if (candidate == null) return false;
    var trimmed = candidate.Trim();
    return trimmed.Length >= 3 && trimmed.IndexOf('@', 1, trimmed.Length - 2) != -1;
}

@madelson
Copy link
Contributor

@jnm2 presumably this would run the exact same validation as a what the MailAddress constructor already does.

@truencoa
Copy link

truencoa commented Apr 24, 2018

using System.ComponentModel.DataAnnotations;

public bool IsValidEmail(string source)
{
    return new EmailAddressAttribute().IsValid(source);
}

Can't someone just port this over?

@madelson
Copy link
Contributor

@truencoa interesting. Looks like EmailAddressAttribute uses a regex under the hood, which seems less robust than MailAddress's custom parser (perhaps not, though).

@truencoa
Copy link

truencoa commented Apr 25, 2018

well, we tried using URI because that's just simpler, but couldn't figure out a way to parse it with and have our fail tests pass:

string emailAddress = "manythings@support@microsoft. com";
 Uri uri;
 if (Uri.TryCreate(Uri.UriSchemeMailto + Uri.SchemeDelimiter + emailAddress, UriKind.Absolute, out uri))
 {
   MailAddress mailAddress = new MailAddress(emailAddress);
 }

Uris are simply too accepting - especially with mailtos.

@petermorlion
Copy link

Regardless of implementation, I agree this would be a nice addition to the API.

First, there is the performance impact. Though probably minimal on today's machines, in a world of micro/nano-services, focus on milliseconds, and the .NET team's attention to performance in general, I feel this is an argument pro.

But there is also the fact that it would be easily discoverable. int.TryParse is well-known, even among beginning .NET programmers. Now, many developers would search online and probably find some regex that doesn't really match everything that is acceptable according to the standard (RFC 2822).

This makes it a good addition to the API in my opinion.

@karelz
Copy link
Member

karelz commented Oct 2, 2019

Open question: Should we add also Parse methods when we have constructors?

@petermorlion
Copy link

My opinion: that might be good because it follows the structure of int and other types, in that a Parse and TryParse method go hand in hand. Internally, you could of course point them to the constructor.
But that's just my opinion and I have little experience with framework and/or API design. I'm curious to read what others think.

@Grauenwolf
Copy link
Author

Consistency is important.

And it's easier to use static functions as delegates than constructors.

@elachlan
Copy link
Contributor

elachlan commented Oct 3, 2019

My current use case is similar to @Grauenwolf. I am validating a textbox value as its entered. So for each character entered before the email is valid, I get an exception. If I am debugging with break on exception, then it becomes a bit problematic to test.

@karelz
Copy link
Member

karelz commented Oct 4, 2019

That's a good use case, thanks!

@scalablecory
Copy link
Contributor

My opinion: that might be good because it follows the structure of int and other types, in that a Parse and TryParse method go hand in hand. Internally, you could of course point them to the constructor.
But that's just my opinion and I have little experience with framework and/or API design. I'm curious to read what others think.

I agree, this is a de-facto pattern established throughout the library and I'd prefer to preserve that. See Guid for an example of having both a parsing-ctor, a Parse, and a TryParse.

@terrajobst
Copy link
Member

terrajobst commented Oct 22, 2019

Video

  • If we were to expose the TryParse ones, we should also define corresponding static Parse methods.
  • The bigger question is whether we'd recommend MailAddress at all. The data annotations implementation for IsValid is much simpler, probably because email addresses are complex and in practice most validators reject rare, but valid email addreses.

@scalablecory
Copy link
Contributor

scalablecory commented Nov 12, 2019

triage: we're okay with adding both Parse and TryParse, and think there's enough value (and low effort) for this.

New API proposal:

    public static MailAddress Parse(string address);
    public static MailAddress Parse(string address, string displayName);
    public static MailAddress Parse(string address, string displayName, Encoding displayNameEncoding);

    public static bool TryParse(string address, out MailAddress result);
    public static bool TryParse(string address, string displayName, out MailAddress result);
    public static bool TryParse(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);

@madelson
Copy link
Contributor

The bigger question is whether we'd recommend MailAddress at all

One reason we've found MailAddress helpful is that it allows you to strongly type variables and fields that store email addresses (as opposed to just storing these as string). Not only does this enforce that invalid values are caught early on in the process, but also it makes it easier for callers to understand what type of value is required. It also gives convenient access to components of the address (like Host).

I see this as comparable to using Uri over string or TimeSpan over int/long.

@Grauenwolf
Copy link
Author

That last question made no sense to me. When I'm checking rows in a database, I'm not going to use a DataAnnotation to detect email addresses.

And if the DA validation is rejecting legal email addresses, then it should be fixed.

@terrajobst
Copy link
Member

  • It seems odd to use Parse/TryParse when there are more components that just a single string that is being parsed.
  • We should follow the Uri.TryCreate() pattern:
    • Remove Parse(), that's what the constructor does
    • Rename TryParse() to TryCreate()
#nullable disable

public static bool TryCreate(string address, out MailAddress result);
public static bool TryCreate(string address, string displayName, out MailAddress result);
public static bool TryCreate(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);

@Grauenwolf
Copy link
Author

Grauenwolf commented Nov 19, 2019

Uri.TryCreate doesn't follow the convention and shouldn't be mimicked. As far as I can tell, it's the only class that uses TryCreate instead of Parse/TryParse.

@maryamariyan maryamariyan transferred this issue from dotnet/corefx Dec 16, 2019
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net untriaged New issue has not been triaged by the area owner labels Dec 16, 2019
@davidsh
Copy link
Contributor

davidsh commented Dec 16, 2019

We should follow the Uri.TryCreate() pattern:

@terrajobst @scalablecory I don't think Uri is the best pattern to follow here. Most of the other types in .NET follow the Parse/TryParse pattern.

@maryamariyan maryamariyan added the api-approved API was approved in API review, it can be implemented label Dec 16, 2019
@maryamariyan maryamariyan added this to the 5.0 milestone Dec 16, 2019
@scalablecory
Copy link
Contributor

Uri.TryCreate doesn't follow the convention and shouldn't be mimicked. As far as I can tell, it's the only class that uses TryCreate instead of Parse/TryParse.

The feedback API review had for this was that this does not follow TryParse convention:

    public static bool TryParse(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);

TryParse typically only has a single string + optional style + optional formatter as input, i.e. every argument is intrinsic to parsing. What we have here with MailAddress.TryParse, though, is half-way between a parser and a constructor: some of the params are only used for initialization, not parsing.

The suggestion to use TryCreate came from Uri.TryCreate being prior art for a factory half-way between parser and constructor.

I hope this gives some context. @Grauenwolf @davidsh if you have strong opinion on design and can present a good counter-argument, I will take this back to API review.

@Grauenwolf
Copy link
Author

Where are all of these extra parameters coming from? We're just asking for the parsing of a single string.

bool TryParse(string s, out MailAddress result)

While I assume it's possible, I can't think of any situation where I've had a separate display name and email address but didn't already know the email address was valid.

What we're looking for is to check a single column of values from a database or file feed. Usually as part of some sort of import job or data analysis.

@Grauenwolf
Copy link
Author

Also note we do have multi-parameter TryParse methods. For example, DateTime.TryParseExact.

https://docs.microsoft.com/en-us/dotnet/api/system.datetime.tryparseexact?view=netframework-4.8

Other multi-parameter TryParse include Int32.TryParse and Timespan.TryParse

So there is no basis for the claim that TryParse strictly has a single input parameter.

@scalablecory scalablecory removed the untriaged New issue has not been triaged by the area owner label Dec 17, 2019
@scalablecory
Copy link
Contributor

Where are all of these extra parameters coming from? We're just asking for the parsing of a single string.

The proposed API currently mirrors all the ctors, which take more than a single string.

there is no basis for the claim that TryParse strictly has a single input parameter.

This is not a claim anyone made.

Current TryParse take only parameters that are specifically involved in parsing. In the example above, displayName and displayNameEncoding are not used for parsing but instead to initialize values.

@MarcoRossignoli
Copy link
Member

MarcoRossignoli commented Dec 17, 2019

I'm working on this and I agree with @scalablecory if we want to return an "instance" of MailAddress from parse(that in this case I would name input validation) we should allow all "possible" combinations allowed by ctors, for intance current code allow to override the display name also if it's present inside address part, we would lose this. I believe that api should be simmetric in this case.
Would be different if the semantic is only a "validation" one like bool MailAddress.IsValid(string address) and after we'll compose code validation+one of ctors.
A validation api could be also better from performance(no discarded MailAddress instance) perspective in scenario where "I need to check that 1mil email in my csv are ok".

@scalablecory
Copy link
Contributor

Triage: we like TryCreate, can adjust later if there is more discussion. @MarcoRossignoli would you like me to assign this issue to you?

@MarcoRossignoli
Copy link
Member

Yes @scalablecory feel free to assign to me.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Net
Projects
None yet
Development

Successfully merging a pull request may close this issue.