API Proposal: Add IPEndPoint.Parse() & .TryParse() #26916
Comments
Item for discussion... is there a use case for overloads accepting |
To @vcsjones's point, we do already have the case. For example, a connection string to StackExchange.Redis could be something like: HTTP headers have the same issue, for example So...yeah, I can definitely see the use case. I'll go ahead and add them to the proposal. |
What are the byte-based overloads for? |
@GrabYourPitchforks me switching between coding and GitHub too much - those should be |
Seems reasonable; marking as ready for review. |
We should probably also expose this for |
It seems the easiest implementation would be to start at the end of the string and extract the port, working backwards until you reach the address/port delimiter. Whatever is left can be passed off to the existing
It's not clear that we would want to support all of these. |
|
If there's a port, we should only accept the []: syntax. The RFC you linked to had a similar sentiment just after the quoted list: "The [] style as expressed in [RFC3986] SHOULD be |
Completely agree with the above 3 only. The other formats are ambiguous, I've never seen them in the wild (but don't claim to be an expert), and I bet they die off. If we haven't seen similar calls for support on the ASP.NET Core side, we shouldn't assume they're needed here. They could always be added later, if assumptions are wrong here. |
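For illustration only, here is a minimal sketch of the rule discussed above: an IPv6 address may carry a port only in the bracketed "[address]:port" form, while IPv4 uses a single colon. The class and method names (EndPointSketch, TryParseEndPoint) are hypothetical and are not the API proposed in this issue. The sketch also assumes that a missing port defaults to 0 and that ports are base-10 only.

```csharp
using System;
using System.Globalization;
using System.Net;

// Hypothetical helper for illustration; not the proposed IPEndPoint API.
static class EndPointSketch
{
    const int MaxPort = 65535;

    public static bool TryParseEndPoint(string s, out IPAddress address, out int port)
    {
        address = null;
        port = 0;
        if (string.IsNullOrEmpty(s))
        {
            return false;
        }

        // By default the whole string is treated as an address without a port.
        int addressLength = s.Length;
        int lastColon = s.LastIndexOf(':');
        if (lastColon > 0)
        {
            bool bracketed = s[lastColon - 1] == ']';      // "[v6]:port"
            bool onlyColon = s.IndexOf(':') == lastColon;  // "v4:port"
            if (bracketed || onlyColon)
            {
                addressLength = lastColon;
            }
        }

        if (!IPAddress.TryParse(s.Substring(0, addressLength), out address))
        {
            address = null;
            return false;
        }

        if (addressLength == s.Length)
        {
            return true; // No port present; assumed default of 0.
        }

        // NumberStyles.None accepts decimal digits only (no sign, whitespace, or hex prefix).
        if (int.TryParse(s.Substring(addressLength + 1), NumberStyles.None,
                         CultureInfo.InvariantCulture, out port) && port <= MaxPort)
        {
            return true;
        }

        address = null;
        port = 0;
        return false;
    }
}
```

Under these assumptions an unbracketed input such as "::1:80" is handed to the address parser whole, so it is read as an address with no port rather than as "::1" on port 80.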
Easy enough. The existing Our advantage is in knowing that the string value must include a port since this is
if (epSpan != null)
{
int port = 0;
int multiplier = 1;
for (int i = epSpan.Length - 1; i >= 0; i--)
{
char digit = epSpan[i];
if(digit == AddressPortDelimiter)
{
// We've reached the delimiter, so the port is complete.
Console.WriteLine($"Port: {port}");
Console.WriteLine(epSpan.Slice(0, i).ToString()); // "Address" (pass to IPAddress Parse/TryParse)
// If we get an IPv6 address back, ensure that epSpan[i-1] == ']'
break; // Stop scanning; otherwise the loop would keep consuming characters from the address
}
else if ('0' <= digit && digit <= '9')
{
port += multiplier * (digit - '0');
if(port > MaxPort)
{
// If we've already exceeded the max port value then bail. No point going until we (potentially) hit an overflow
break;
}
multiplier *= 10;
}
else
{
break; // Invalid digit
}
}
// If the loop runs out without hitting a delimiter then we don't have both an address and a port
}
If not wildly off track, I'd like to take a crack at this one. |
We already have an initial implementation here. Also, I expect the port to be optional and default to 0. |
Ah, I see. Yes, the approach I outlined only works when it can be assumed that the port is included, which makes implementation fairly trivial. |
@Tratcher Your implementation looks pretty good. After pondering a bit more though, I do think that the "backwards" way outlined above could be modified to handle optional port and avoid multiple |
Ok, I think we're probably looking at something more like below (minimally tested). There's a complication here if we need to support port numbers in non-decimal numeric base (e.g. "[Fe08::1]:0xFA") which neither this nor the implementation presented by @Tratcher does. I think I can do that in parallel with processing as decimal as I will be unsure (working backward) of the base until I hit the leading "0x" sequence, a colon or an alpha hex digit. Either the decimal or hex value can be discarded once the base is determined. I'll continue work on this method unless there are any major objections.
public static bool TryParse(ReadOnlySpan<char> epSpan, out IPEndPoint endPoint)
{
endPoint = null;
if (epSpan != null)
{
// Determine whether to process as IPv4 or IPv6
bool processAsIPv4 = false;
if (epSpan.Length > 4)
{
// Check to see if this might be IPv4
for (int i = 1; i < 4; i++) // Skip position zero. If a dot shows up there, this thing is invalid anyway
{
if (epSpan[i] == '.')
{
processAsIPv4 = true;
break;
}
}
}
// Start at the end of the string and work backward, looking for a port delimiter
int portDecimal = 0;
int multiplier = 1;
// TODO: we need to process cases where the port is expressed as Hex
int sliceLength = epSpan.Length;
for (int i = epSpan.Length - 1; i >= 0; i--)
{
char digit = epSpan[i];
// Locating a delimiter ends the sequence. We either have IPv4 with a port or IPv6 (with or without) or we have garbage
if (digit == ':')
{
// Determine how far to slice into the span (pre-set to entire length)
if (processAsIPv4 || (i > 0 && epSpan[i - 1] == ']'))
{
sliceLength = i;
}
break;
}
else if ('0' <= digit && digit <= '9')
{
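// We'll avoid overflow in cases where someone passes in garbage.
// Use <= so that extra leading digits (e.g. "165535") push the value past MaxPort instead of being silently ignored.
if (portDecimal <= MaxPort)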
{
portDecimal += multiplier * (digit - '0');
multiplier *= 10;
}
}
}
// We've either hit the delimiter or ran out of characters. Let's see what the IP parser thinks.
// TODO: this can likely be optimized in core since we already know what kind of address we have
if (IPAddress.TryParse(epSpan.Slice(0, sliceLength), out IPAddress address))
{
// If we did not hit a delimiter then default to port 0
if(sliceLength == epSpan.Length)
{
portDecimal = 0;
}
// Avoid tossing on invalid port
if (portDecimal <= MaxPort)
{
endPoint = new IPEndPoint(address, portDecimal);
return true;
}
}
}
return false;
} |
Why would we need to do that? |
I noticed that hex port numbers are already included in the unit tests for IPv6 parsing: We should reach a consensus on whether to support that here (the IP parser just drops the port, it does not actually do anything with it). There are some other minor issues as well. IPv4 encoded into IPv6 ("::192.168.0.010") fails my "is this IPv4" test in the code above, but that should be an easy fix. Few other small things as well. Just wanted to point that one out in case anyone notices it. |
@jbhensley note that test is only verifying IP parsing, not port parsing. The ports are ignored. |
Noted. The presence of a hex port, however, does raise the issue of whether it is an expected, valid input. Don't misunderstand, I'm all in favor of base-10 only support; it makes coding this much easier. Just so long as we all know that it's an intentional decision. |
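As a sketch of what a base-10-only decision could look like in practice: parsing with NumberStyles.None accepts decimal digits only, so a hex port such as "0xFA" is rejected rather than silently reinterpreted. The PortSketch class and TryParsePort method are hypothetical names used only for this illustration.

```csharp
using System.Globalization;

// Hypothetical helper for illustration; not part of the proposal.
static class PortSketch
{
    const int MaxPort = 65535;

    // Accepts decimal digits only: no sign, no whitespace, no "0x" prefix.
    public static bool TryParsePort(string text, out int port) =>
        int.TryParse(text, NumberStyles.None, CultureInfo.InvariantCulture, out port)
        && port <= MaxPort;
}
```

With this helper, "65535" parses successfully, while "0xFA", "65536" and "-1" are all rejected.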
@Tratcher FYI: There are some errors in the logic on your parser, mainly centered around relative position of ":" and "]". Try running this address through: "[3731:54:65fe:2::1]IAmNotValid65535" You'll get address 3731:54:65fe:2::1 on port 0 because the code assumes it's IPv6 without a port. This is due to the fact that the last position of "]" is greater than the last position of ":". The input is invalid anyway, so no harm no foul. I'll put stuff like this in our unit tests to make sure core returns false/tosses (for |
PR dotnet/corefx#33119 opened for feedback. |
PR dotnet/corefx#33119 has been merged into master already. |
Origin issue: #23289
Rationale
Parsing a user-provided endpoint is often needed for client libraries doing connections with Sockets as well as other usages like parsing for valid endpoints in things like HTTP headers. ASP.NET already has an implementation in IPEndPointParser.TryParse for the HTTP headers use case.
User implementations are also often buggy, inefficient, outdated, or all three (lots of examples on GitHub as well). A lot of this has to do with legacy code. For example, a lot of code in the wild splits on ":", because when IPv4 was our main protocol you were only dealing with 1.1.1.1:1234 (<ip>:<port>), so a bunch of code does a .Split(':') or a .IndexOf(':'). It worked well enough. Now that we have IPv6 you have examples like ::1 or [::1]:123, or [2001:db7:85a3:8d2:1319:8a2e:370:7348]:1000 ...and all that code is broken.
A framework method for parsing would allow users to (with zero ongoing effort in some or many cases) keep up with any change in IP parsing as well as do it efficiently and correctly. For example, the common .Split(':') and its array allocations are inefficient and unnecessary. With the [...] delimiters and such it's also easy to get wrong.
cc @stephentoub @terrajobst @Tratcher @benaadams
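To make the breakage concrete, here is a small sketch that counts the pieces produced by the naive split-on-colon approach. SplitDemo and CountParts are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical demo of the legacy "split on colon" approach.
static class SplitDemo
{
    // Code written for "<ip>:<port>" expects exactly two parts here.
    public static int CountParts(string endpoint) => endpoint.Split(':').Length;
}
```

CountParts("1.1.1.1:1234") returns 2, which is what the legacy code expects. CountParts("::1") returns 3 and CountParts("[::1]:123") returns 4, so code that reads parts[0] as the address and parts[1] as the port gets the wrong values for IPv6 input.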