API Proposal: Add IPEndPoint.Parse() & .TryParse() #26916

NickCraver · 2018-07-24T00:09:26Z

namespace System.Net
{
    public class IPEndPoint
    {
        public static IPEndPoint Parse(string s);
        public static IPEndPoint Parse(ReadOnlySpan<char> s);
        public static bool TryParse(string s, out IPEndPoint result);
        public static bool TryParse(ReadOnlySpan<char> s, out IPEndPoint result);
    }
}

Origin issue: #23289

Rationale

Parsing a user-provided endpoint is often needed for client libraries doing connections with Sockets as well as other usages like parsing for valid endpoints in things like HTTP headers. The ASP.NET already has the implementation in IPEndPointParser.TryParse for the HTTP headers use case.

User implementations are also often buggy, inefficient, outdated, or all 3 (lots of examples on GitHub as well). A lot of this has to do with legacy code. For example, a lot of code in the wild is splitting on :, because when IPv4 was our main protocol you're only talking about 1.1.1.1:1234 (<ip>:<port>), so a bunch of code does a .Split(':') or a .IndexOf(':'). It worked well enough. Now that we have IPv6 you have examples like ::1 or [::1]:123, or [2001:db7:85a3:8d2:1319:8a2e:370:7348]:1000...and all that code is broken.

A framework method for parsing would allow users to (with zero ongoing effort in some or many cases) keep up with any change in IP parsing as well as do it efficiency and correctly. For example, the common .Split(':') and its array allocations are inefficient and unnecessary. With the [...] delimiters and such it's also easy to get wrong.

cc @stephentoub @terrajobst @Tratcher @benaadams

The text was updated successfully, but these errors were encountered:

vcsjones · 2018-07-24T00:23:34Z

Item for discussion... is there a use case for overloads accepting ReadOnlySpan<char>?

NickCraver · 2018-07-24T00:29:40Z

To @vcsjones's point, we do already have the case. For example, a connection string to StackExchange.Redis could be something like: "[::1]:6379,syncTimeout=10000"...so we would be parsing a part of the string for the most efficient case, as would others. To be fair, that's a trivial amount of cost compared to the off-the-box network IP.

HTTP headers has the same issue, for example X-Forwarded-For can be one of many comma-delimited IPs, or the Forwarded header would again be partials.

So...yeah, I can definitely see the use case. I'll go ahead and add them to the proposal.

GrabYourPitchforks · 2018-07-24T01:58:55Z

What are the byte-based overloads for?

NickCraver · 2018-07-24T02:17:15Z

@GrabYourPitchforks me switching between coding and GitHub too much - those should be ReadOnlySpan<char>...fixed.

stephentoub · 2018-09-25T16:32:13Z

Seems reasonable; marking as ready for review.

terrajobst · 2018-09-25T18:25:17Z

Video

We should probably also expose this for DnsEndPoint, but that should be another issue. This one looks good as proposed.

jbhensley · 2018-10-28T02:51:36Z

It seems the easiest implementation would be to start at the end of the string and extract the port, working backwards until you reach the address/port delimiter. Whatever is left can be passed off to the existing Parse routines under IPAddress. Question: is there any precedent as to what may constitute the delimiter? RFC 5952, section 6 lists several forms for IPv6:

[2001:db8::1]:80
2001:db8::1:80
2001:db8::1.80
2001:db8::1 port 80
2001:db8::1p80
2001:db8::1#80

It's not clear that we would want to support all of these.

Tratcher · 2018-10-28T03:33:30Z

[2001:db8::1]:80, [2001:db8::1], and 2001:db8::1 are the primary use cases. I've never seen any of those other formats used.

stephentoub · 2018-10-28T14:33:12Z

If there's a port, we should only accept the []: syntax. The RFC you linked to had a similar sentiment just after the quoted list:

"The [] style as expressed in [RFC3986] SHOULD be
employed, and is the default unless otherwise specified. Other
styles are acceptable when there is exactly one style for the given
context and cross-platform portability does not become an issue. For
URIs containing IPv6 address literals, [RFC3986] MUST be followed, as
well as the rules defined in this document."

NickCraver · 2018-10-28T14:53:02Z

Completely agree with the above 3 only. The other formats are ambiguous, I've never seen them in the wild (but don't claim to be an expert), and I bet they die off. If we haven't seen similar calls for support on the ASP.NET Core side, we shouldn't assume they're needed here. They could always be added later, if assumptions are wrong here.

jbhensley · 2018-10-28T16:33:12Z

Easy enough. The existing IPAddress parser for IPv6 already handles that syntax, we just need to make sure the syntax is used.

Our advantage is in knowing that the string value must include a port since this is IPEndPoint parse. No port == invalid. Basically, I envision Parse/TryParse using something like this at its core.

if (epSpan != null)
{
	int port = 0;
	int multiplier = 1;
	for (int i = epSpan.Length - 1; i >= 0; i--)
	{
		char digit = epSpan[i];
		if(digit == AddressPortDelimiter)
		{
			// We've reached the end.
			Console.WriteLine($"Port: {port}");
			Console.WriteLine(epSpan.Slice(0, i).ToString());   // "Address" (pass to IPAddress Parse/TryParse)
			// If we get an IPv6 address back, ensure that epSpan[i-1] == ']'
		}
		else if ('0' <= digit && digit <= '9')
		{
			port += multiplier * (digit - '0');
			if(port > MaxPort)
			{
				// If we've already exceeded the max port value then bail. No point going until we (potentially) hit an overflow
				break;
			}
			multiplier *= 10;
		}
		else
		{
			break;  // Invalid digit
		}
	}

// If the loop runs out without hitting a delimiter then we don't have both an address and a port

If not wildly off track, I'd like to take a crack at this one.

Tratcher · 2018-10-28T17:16:51Z

We already have an initial implementation here.

Also, I expect the port to be optional and default to 0.

jbhensley · 2018-10-28T17:50:24Z

Ah, I see. Yes, the approach I outlined only works when it can be assumed that the port is included, which makes implementation fairly trivial.

jbhensley · 2018-10-28T18:52:24Z

@Tratcher You're implementation looks pretty good. After pondering a bit more though, I do think that the "backwards" way outlined above could be modified to handle optional port and avoid multiple IndexOf calls or any Substring calls (i.e. hopefully process faster). Will look at this a bit later (when not in the middle of carving pumpkins with my kids ;) ... sorry).

jbhensley · 2018-10-29T03:05:47Z

Ok, I think we're probably looking at something more like below (minimally tested). There's a complication here if we need to support port numbers in non-decimal numeric base (e.g. "[Fe08::1]:0xFA") which neither this nor the implementation presented by @Tratcher does. I think I can do that in parallel with processing as decimal as I will be unsure (working backward) of the base until I hit the leading "0x" sequence, a colon or an alpha hex digit. Either the decimal or hex value can be discarded once the base is determined.

I'll continue work on this method unless there are any major objections.

public static bool TryParse(ReadOnlySpan<char> epSpan, out IPEndPoint endPoint)
{
	endPoint = null;
	if (epSpan != null)
	{
		// Determine whether to process as IPv4 or IPv6
		bool processAsIPv4 = false;
		if (epSpan.Length > 4)
		{
			// Check to see if this might be IPv4
			for (int i = 1; i < 4; i++) // Skip position zero. If a dot shows up there, this thing is invalid anyway
			{
				if (epSpan[i] == '.')
				{
					processAsIPv4 = true;
					break;
				}
			}
		}

		// Start at the end of the string and work backward, looking for a port delimiter
		int portDecimal = 0;
		int multiplier = 1;
		// TODO: we need to process cases where the port is expressed as Hex
		int sliceLength = epSpan.Length;
		for (int i = epSpan.Length - 1; i >= 0; i--)
		{
			char digit = epSpan[i];

			// Locating a delimiter ends the sequence. We either have IPv4 with a port or IPv6 (with or without) or we have garbage
			if (digit == ':')
			{
				// Determine how far to slice into the span (pre-set to entire length)
				if (processAsIPv4 || (i > 0 && epSpan[i - 1] == ']'))
				{
					sliceLength = i;
				}

				break;
			}
			else if ('0' <= digit && digit <= '9')
			{
				// We'll avoid overflow in cases where someone passes in garbage
				if (portDecimal < MaxPort)
				{
					portDecimal += multiplier * (digit - '0');
					multiplier *= 10;
				}
			}
		}

		// We've either hit the delimiter or ran out of characters. Let's see what the IP parser thinks.
		// TODO: this can likey be optimized in core since we already know what kind of address we have
		if (IPAddress.TryParse(epSpan.Slice(0, sliceLength), out IPAddress address))
		{
			// If we did not hit a delimiter then default to port 0
			if(sliceLength == epSpan.Length)
			{
				portDecimal = 0;
			}

			// Avoid tossing on invalid port
			if (portDecimal <= MaxPort)
			{
				endPoint = new IPEndPoint(address, portDecimal);
				return true;
			}
		}
	}

	return false;
}

stephentoub · 2018-10-29T13:45:34Z

if we need to support port numbers in non-decimal numeric base

Why would we need to do that?

jbhensley · 2018-10-29T14:27:27Z

I noticed that hex port numbers are already included in the unit tests for IPv6 parsing:

https://github.com/dotnet/corefx/blob/73bc1413112c0d60f295ccffb40e2cf85638f177/src/System.Net.Primitives/tests/FunctionalTests/IPAddressParsing.cs#L279

We should reach a consensus on whether to support that here (the IP parser just drops the port, it does not actually do anything with it).

There are some other minor issues as well. IPv4 encoded into IPv6 ("::192.168.0.010") fails my "is this IPv4" test in the code above, but that should be an easy fix. Few other small things as well. Just wanted to point that one out in case anyone notices it.

Tratcher · 2018-10-29T15:06:03Z

@jbhensley note that test is only verifying IP parsing, not port parsing. The ports are ignored.

jbhensley · 2018-10-29T15:36:17Z

Noted. The presence of a hex port, however, does raise the issue of whether it is an expected, valid input. Don't misunderstand, I'm all in favor of base-10 only support; it makes coding this much easier. Just so long as we all know that it's an intentional decision.

jbhensley · 2018-10-29T17:05:39Z

@Tratcher FYI: There are some errors in the logic on your parser, mainly centered around relative position of ":" and "]". Try running this address through:

"[3731:54:65fe:2::1]IAmNotValid65535"

You'll get address 3731:54:65fe:2::1 on port 0 because the code assumes it's IPv6 without a port. This is due to the fact that the last position of "]" is greater than the last position of ":". The input is invalid anyway, so no harm no foul. I'll put stuff like this in our unit tests to make sure core returns false/tosses (for TryParse/Parse respectively).

jbhensley · 2018-10-30T00:06:20Z

PR dotnet/corefx#33119 opened for feedback.

davidsh · 2018-11-13T22:38:43Z

PR dotnet/corefx#33119 has been merged into master already.

davidsh closed this as completed Nov 13, 2018

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 3.0 milestone Jan 31, 2020

ghost locked as resolved and limited conversation to collaborators Dec 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API Proposal: Add IPEndPoint.Parse() & .TryParse() #26916

API Proposal: Add IPEndPoint.Parse() & .TryParse() #26916

NickCraver commented Jul 24, 2018

vcsjones commented Jul 24, 2018

NickCraver commented Jul 24, 2018

GrabYourPitchforks commented Jul 24, 2018

NickCraver commented Jul 24, 2018

stephentoub commented Sep 25, 2018

terrajobst commented Sep 25, 2018 •

edited

Loading

jbhensley commented Oct 28, 2018

Tratcher commented Oct 28, 2018

stephentoub commented Oct 28, 2018

NickCraver commented Oct 28, 2018

jbhensley commented Oct 28, 2018

Tratcher commented Oct 28, 2018

jbhensley commented Oct 28, 2018

jbhensley commented Oct 28, 2018 •

edited

Loading

jbhensley commented Oct 29, 2018

stephentoub commented Oct 29, 2018

jbhensley commented Oct 29, 2018

Tratcher commented Oct 29, 2018

jbhensley commented Oct 29, 2018

jbhensley commented Oct 29, 2018

jbhensley commented Oct 30, 2018

davidsh commented Nov 13, 2018

API Proposal: Add IPEndPoint.Parse() & .TryParse() #26916

API Proposal: Add IPEndPoint.Parse() & .TryParse() #26916

Comments

NickCraver commented Jul 24, 2018

Rationale

vcsjones commented Jul 24, 2018

NickCraver commented Jul 24, 2018

GrabYourPitchforks commented Jul 24, 2018

NickCraver commented Jul 24, 2018

stephentoub commented Sep 25, 2018

terrajobst commented Sep 25, 2018 • edited Loading

jbhensley commented Oct 28, 2018

Tratcher commented Oct 28, 2018

stephentoub commented Oct 28, 2018

NickCraver commented Oct 28, 2018

jbhensley commented Oct 28, 2018

Tratcher commented Oct 28, 2018

jbhensley commented Oct 28, 2018

jbhensley commented Oct 28, 2018 • edited Loading

jbhensley commented Oct 29, 2018

stephentoub commented Oct 29, 2018

jbhensley commented Oct 29, 2018

Tratcher commented Oct 29, 2018

jbhensley commented Oct 29, 2018

jbhensley commented Oct 29, 2018

jbhensley commented Oct 30, 2018

davidsh commented Nov 13, 2018

terrajobst commented Sep 25, 2018 •

edited

Loading

jbhensley commented Oct 28, 2018 •

edited

Loading