Support IPv6 zone identifiers #392
Comments
As noted at https://url.spec.whatwg.org/#concept-ipv6 this is intentionally omitted per https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2. |
Does that mean that users are expected to use a proxy for such situations?
Most users do not have such a proxy installed.
On Mon, Jun 4, 2018, 12:19 AM Anne van Kesteren wrote:
> As noted at https://url.spec.whatwg.org/#concept-ipv6 this is intentionally omitted per https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2.
|
They have to find an alternative, yes. |
What about requiring browsers to prompt the user for an interface whenever
an address in the fe80::/10 block is entered into the URL?
On Thu, Jun 7, 2018, 11:22 PM Anne van Kesteren wrote:
> They have to find an alternative, yes.
|
What would the prompt say? (FWIW, I doubt any browser would find that acceptable, and it adds a lot of complexity as we'd have to handle the syntax everywhere, which would lead to tons of issues.) |
“Please select a network that your computer is connected to” with a pop-up
listing of all interfaces.
That said, considering the super-niche nature of this, I feel like one
approach that might work would be to allow users to pass the interface as a
command line argument.
On Fri, Jun 8, 2018, 11:28 PM Anne van Kesteren wrote:
> What would the prompt say? (FWIW, I doubt any browser would find that acceptable, and it adds a lot of complexity as we'd have to handle the syntax everywhere, which would lead to tons of issues.)
|
With reference to https://bugzilla.mozilla.org/show_bug.cgi?id=700999 I request that this bug be re-opened. For completeness, I will reiterate the problem caused by the lack of link-local address support: I subscribed to this bug some time ago because Firefox essentially prevents the configuration of newer network devices. I'll elaborate shortly, and I ask the Mozilla team to clarify how to handle this:
Given this background, we often have devices coming back from customers with any kind of IP configuration. The only plausible way to find such a device is the link-local discovery described above. The IPv4 addresses are often unknown or undiscoverable.
So we are basically in the situation that the link-local address is the only reliable address that can be used to configure a whole class of devices. I often hear the argument that link-local addresses would make JavaScript-based LAN scanning possible. I am not denying that; however, the same is true for IPv4: you can easily scan 192.168.x.y/24. Even with this inconsistent security claim in mind, I ask the Mozilla developers to at least include support for link-local addresses with something along the lines of about:config->ipv6-allow-link-local: [false,true], so as not to stop all network engineers from working. Can we reopen this bug or create a new one for realising this? |
Update: as this bug is cross-referencing other resources in each and every bug report, I tried to summarise it on https://ungleich.ch/u/blog/ipv6-link-local-support-in-browsers/ |
This needs to be reopened because Firefox is citing this bug as blocking its own fix of the issue. This issue is holding up the entire IPv6 project, in case that isn't clear. There's a lot of game playing around these issues because a lot of money is at stake when it comes to who controls domain names, and this is a way to break domain name systems that compete with ICANN & CAs. So the developers of browsers are pretending that this is complicated when it isn't. It's actually very simple, and there's exactly one way to implement it, which is very obvious. But implementing it that way is going to step on the toes of people who don't want IPv6 global connectivity and a new p2p internet foundation to happen. It goes against a lot of business models. |
By the way. It's a bug to ever strip the zone identifier before sending it from the client. The way that HTTP works is, the client sends its own name for the server to the server itself. The server never uses this name to establish IP connectivity. The server can then send the name back to the client in links, who can use it to re-establish IP connectivity by clicking a link, or bookmarking it and opening the bookmark, for example. Since the server NEVER uses the name to establish IP connectivity, but only sends it to the client; |
To try and unblock this issue, we've posted a draft update to RFC6874 and discussion is open. Details at https://mailarchive.ietf.org/arch/msg/ipv6/i5LUQN9vU9MryNWtvS_M_O7Wgjc/. |
Much appreciated, @becarpenter ! |
That doesn't fix the percent encoding. |
@afcady, we are stuck with % meaning two things. The discussion on [email protected] is tending towards requiring only the %25eth0 escape encoding and dumping the suggestion to allow %eth0 heuristically. |
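The two meanings of % can be made concrete with a short sketch. It uses Python's standard `urllib.parse.unquote`; the helper name `zone_readings` is hypothetical and only illustrates what a consumer sees depending on whether it percent-decodes the host literal before looking for the zone delimiter.

```python
from urllib.parse import unquote


def zone_readings(host_literal: str) -> dict:
    """Return the zone ID as read without and with percent-decoding.

    Hypothetical helper for illustration only; it is not taken from any
    browser or from the draft.
    """
    # Reading 1: treat the first '%' as the zone delimiter, no decoding.
    undecoded = host_literal.partition("%")[2]
    # Reading 2: percent-decode first, then look for the delimiter.
    decoded = unquote(host_literal).partition("%")[2]
    return {"undecoded": undecoded, "decoded": decoded}


# The escaped form is read differently by the two approaches.
print(zone_readings("fe80::abcd%25eth0"))
# The bare form only survives decoding because "%et" is not a valid escape.
print(zone_readings("fe80::abcd%eth0"))
# A zone that starts with two hex digits is damaged by decoding.
print(zone_readings("fe80::1%ab0"))
```

The last call hints at why accepting a bare %eth0 heuristically is fragile: it only works when the zone does not happen to begin with two hexadecimal digits.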
The draft has been updated again, following discussion at the recent IETF meeting. As always, a diff from the previous version is available. Input on two open issues is needed from implementers! |
Here are a few thoughts I have on this, without indicating support or opposition:

inet_pton seems to be ok with "fe80::abcd%25eth1" and "fe80::abcd%eth1" but not "fe80::abcd-eth1". NSURL seems to be ok with "http://[fe80::abcd%25eth1]/" and "http://[fe80::abcd-eth1]/" but not "http://[fe80::abcd%eth1]/". "fe80::abcd%25eth1" seems to be the most parsable of those examples in my sample of 2 IPv6 host parsers.

I'm concerned that if we decide to use "%25" as the delimiter to indicate the beginning of a zone id, some software will interpret "25eth1" to be the zone id and some will interpret "eth1" to be the zone id.

All browsers currently fail to parse all of those examples. It is clear that software will need to change if we decide to support this. If compatibility weren't a concern, I think it would be nicest to introduce a new delimiter such as '-'.

I'm curious how someone would get a zone id to use. Some systems might use "eth1" as a meaningful zone id, while other systems might use "en1" or "1". If this is the case, it makes me question the uniformity of these URIs.

Your document says "However, the IPv6 Scoped Address Architecture specification gives no precise definition of the character set allowed in <zone_id>. There are no rules or de facto standards for this." In order to be a part of the URL specification, we need a precise parsing definition for all possible input, such as "http://[fe80::abcd%25💩]/" which I imagine would need to percent-encode the emoji, or "http://[fe80::abcd%25%invalid]/" which seems to fail to parse. |
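A third parser shows the same "25eth1" versus "eth1" concern. The sketch below assumes Python 3.9 or later, where `ipaddress.IPv6Address` accepts a `%zone` suffix and exposes it as `scope_id`; the helper name `zone_of` is hypothetical. This parser does no percent-decoding, so it is one example of software that reads the escaped form as zone "25eth1".

```python
import ipaddress


def zone_of(literal: str):
    """Zone ID as parsed by Python's ipaddress module (3.9+), or None.

    Hypothetical helper; raises ipaddress.AddressValueError on bad input.
    """
    return ipaddress.IPv6Address(literal).scope_id


for literal in ("fe80::abcd%eth1", "fe80::abcd%25eth1", "fe80::abcd-eth1"):
    try:
        print(literal, "-> zone", repr(zone_of(literal)))
    except ipaddress.AddressValueError as error:
        # The '-' delimiter is not recognised by this parser.
        print(literal, "-> rejected:", error)
```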
Windows UNC paths apparently use ‘s’ to delimit a zone ID. No idea why they chose ‘s’ but it doesn’t have the same problems that ‘%’ does in a URL context, so maybe that’s also worth considering. |
@karwa Isn't the "s" only used in the context of a domain name? The referenced example on wikipedia says |
ping6 interprets "fe80::abcd%25en0" to have a zone id of 25en0, so the current proposal isn't compatible with that |
That becomes fe80::abcd%en0 after URL decoding. |
It's well understood that percent-encoding makes pure cut and paste
impossible. Operations people can deal with that if they must.
Regards,
Brian Carpenter
(via tiny screen & keyboard)
On Thu, 26 Aug 2021, 18:39 Demi Marie Obenour wrote:
> ping6 interprets "fe80::abcd%25en0" to have a zone id of 25en0, so the current proposal isn't compatible with that
>
> That becomes fe80::abcd%en0 after URL decoding.
|
There's a new draft at https://www.ietf.org/archive/id/draft-carpenter-6man-rfc6874bis-03.html. |
The IETF 6MAN WG has just formally adopted our document draft-ietf-6man-rfc6874bis-00. All we need are developers who understand all the places where URLs are parsed (there are probably several) and where the actual socket calls are made. I'm glad to help if developers contact me. |
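As a rough picture of the "actual socket calls" side, the sketch below turns an address literal with a zone into the four-element address tuple that Python uses for `AF_INET6` sockets, `(host, port, flowinfo, scope_id)`. The function name `sockaddr_for` and its `name_to_index` parameter are hypothetical. Treating an all-digit zone as an interface index is an assumption of this sketch, and `socket.if_nametoindex` is only available on platforms that provide it.

```python
import socket
from typing import Callable, Tuple


def sockaddr_for(
    literal: str,
    port: int,
    name_to_index: Callable[[str], int] = socket.if_nametoindex,
) -> Tuple[str, int, int, int]:
    """Build an AF_INET6 address tuple from 'address%zone' (sketch only)."""
    address, _, zone = literal.partition("%")
    if not zone:
        scope_id = 0  # no zone given: let the system pick its default
    elif zone.isascii() and zone.isdigit():
        scope_id = int(zone)  # assumption: numeric zone is an interface index
    else:
        scope_id = name_to_index(zone)  # e.g. "eth0" -> index on this host
    return (address, port, 0, scope_id)


# A fake lookup keeps the example independent of the local interfaces.
print(sockaddr_for("fe80::1%eth0", 80, name_to_index=lambda name: 7))
```

The point of the sketch is that the zone never reaches the wire: it is resolved locally into `scope_id` before the connect call.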
BTW, this bug was closed in June 2018 based on arguments at https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2. Those arguments were against various features of RFC6874. The new draft is quite different and (if published as an RFC) will remove all those annoying features. |
New version of the draft published: https://www.ietf.org/archive/id/draft-ietf-6man-rfc6874bis-01.html. |
Just noting that https://www.ietf.org/archive/id/draft-ietf-6man-rfc6874bis-02.html came out a while ago and is now in review by the appropriate IETF Area Director. w.r.t. some of the comments above, getting rid of the percent-encoding seems to make the parsing issues quite a bit less thorny, but we really need implementers to look at that question. |
The relevant URI syntax update is now in IETF Last Call, i.e. the last opportunity for public comments: https://mailarchive.ietf.org/arch/msg/ietf-announce/BqBF9qvZ8qZR4ZPlawPvQSe0WbU/ |
Worth mentioning that the draft has been updated following Last Call comments: https://www.ietf.org/archive/id/draft-ietf-6man-rfc6874bis-03.html |
Another very minor update: https://www.ietf.org/archive/id/draft-ietf-6man-rfc6874bis-04.html |
RFC-5952 mentions some of the problems that arise due to the flexibility of textual IPv6 addresses and the benefits of having a single, canonical textual representation. Given that zone IDs are opaque ASCII strings, I guess that no normalization can be applied to them, correct? In other words,
https://www.rfc-editor.org/rfc/rfc3986.html#section-3.2.2
|
Good catch. RFC4007 says nothing about case (in fact, it says nothing useful about the Zone ID string at all). Running code (a.k.a. ping on Linux) tells me that implementations are case-sensitive, which is of course the implication of saying nothing. |
I recommend having the zone ID be case-sensitive, to reflect what current implementations do. |
I don't know how to do that without causing a major problem for the URI parsers in every browser. |
Why would this cause such a problem? |
I don't think the problem is with browsers specifically. The issue is that the new RFC is defined in terms of RFC-3986 and updates it, but 3986 makes quite a broad promise of hosts being case-insensitive. It does not even restrict this to certain kinds of hosts - it just says "the host subcomponent". So it's extremely broad, and there may be applications which rely on that. For example, imagine I have some sort of application-level cache - I can treat requests to

This new RFC would make an incompatible change to 3986, by taking away that promise and saying that some hosts may actually be case-sensitive, and that if you just lowercase them as was previously allowed, you might be meaningfully altering which host is being referred to.

The WHATWG URL standard would actually be more accommodating of case-sensitive elements within IP literals than RFC-3986, because we don't make the same broad guarantee. In the WHATWG model, the parser takes a string and creates a URL record from it, and that URL record can contain a host (which is also a record, containing the parsed IP address value). The URL serialiser produces the canonical textual form of that URL record, so nobody needs to do things like manually lowercasing hostnames, and nowhere in the standard does it recommend that anybody does so themselves; the output is already normalised to the extent the standard defines things to be equivalent:
|
Thanks @karwa. I agree that any URI parser or decoder would have this problem, not just those in browsers. As soon as they have separated out the host part of the URI, any programmer will normalise the whole thing to lower case before analysing whether it's example.com, 1.2.3.4, [::abcd] or [::abcd%upper]. |
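The difference between lowercasing the whole host and preserving the zone's case can be sketched in a few lines. The function `normalize_host` is hypothetical and is not taken from any browser, from wget, or from the draft; it only shows what a case-preserving normaliser would have to do differently.

```python
def normalize_host(host: str) -> str:
    """Lowercase a URI host but keep the case of any IPv6 zone ID.

    Hypothetical sketch: only bracketed IP literals are checked for a zone.
    """
    if host.startswith("[") and host.endswith("]"):
        address, delimiter, zone = host[1:-1].partition("%")
        # Hex digits in the address are case-insensitive; the zone is not.
        return "[" + address.lower() + delimiter + zone + "]"
    return host.lower()


host = "[FE80::ABCD%Eth1]"
print("naive:          ", host.lower())
print("case-preserving:", normalize_host(host))
```

On a system where zone names are case-sensitive, the naive result names a different (possibly non-existent) zone than the one the user typed.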
Just to confirm, for the case of wget (patched to support RFC6874bis), if I do
it responds
In other words, wget normalises the host component to lower case, as expected. (The patch to wget is at https://github.com/becarpenter/wget6) |
New version of the draft today : https://www.ietf.org/archive/id/draft-ietf-6man-rfc6874bis-05.html |
Would it be possible to make zone IDs with upper case letters an error? That way if in the future it is possible to support them, it can be added backwards-compatibly. |
Please don't. On Linux it's just case-sensitive; uppercase letters are not invalid per se.
) I don't think anyone is daring enough to do that in practice so I don't think having parsers assume one is equal to the other would be a problem, but case should be preserved for the actual host resolution/connect/sendto/whatever call |
I still think parsers should be required to be case-sensitive here. Case-preserving is the bare minimum. |
I'd be delighted if I thought that was reasonably possible, but having looked at some of the Firefox code, I really, really doubt it. |
What would be required for it to work? Major refactoring? |
You'd need to hop over to https://bugzilla.mozilla.org/show_bug.cgi?id=700999 and ask there. |
Hi all, In order to make some progress on this topic I would like to propose a compromise change to the URL standard that punts on all the hard questions about zone ID. As indicated in Martin's feedback most browsers still have a problem with the zoneID and wouldn't implement it. However, URLs with a zoneID still exist, and the fact that URL parsers consider them invalid isn't great. The changes to the URL parsing algorithm would be minimal: @annevk if this is acceptable I will send a PR. Hopefully this is non-controversial enough to be acceptable to Blink and WebKit too. |
I think that warrants a new issue. It's not clear to me that is a good idea because the authority question remained unresolved. If it should impact authority and we end up treating multiple distinct authorities as one, that would not be good. And while there are plenty of ways to make a URL appear like another one, I'm not sure we want to add to that problem. Also in other domains ignoring all input after a certain character has led to injection attacks. How would we avoid those here? It's worth discussing, but I wouldn't classify it as non-controversial. |
IMO, we should support Zone IDs.

Fundamentally, no host has a universally-guaranteed meaning. The URL standard does not define what hosts actually mean, and generally the assumption is that they will be passed to a system resolver. How that resolver works is undefined, and in general, different systems will do different things, and allow for the user to customise different parts of the process.

For instance, the DNS itself can be heavily customised - both by the user, and by the backend. Users can provide custom DNS servers (e.g. Google public DNS), and ISPs can direct queries to particular servers using dedicated physical infrastructure, on-site caches, or to alternate websites (let's imagine the state has a problem with website X and wants to send users to a more ideologically-appropriate site). Ultimately, we have no way to detect any of that. We have no idea what the hostname

IP addresses are similarly fuzzy. Two machines with different network configurations may have different understandings of what a given address should mean. We give an IP address to the system, and it connects to some machine, and that's about as much as we can say about it. It doesn't come with nearly as much ambiguity as domains have, but it's all still client-specific.

So when I see arguments such as:
And
I think it overstates how much we can actually rely on existing hostnames to be unique, and it fails to explain how

But more to the point, I think it misrepresents what URLs are. URLs are universal identifiers, but that does NOT mean that they contain the universal identity of a resource. It just means that they subsume all other kinds of identifiers. It is perfectly fine to use URLs to identify data in a local application - e.g. something like

URLs are, IMO, simply a flexible syntax for expressing the different kinds of identifiers that exist, so that any application can see the URL

And I think it should be possible to express these kinds of locations under the

Of course, no client is obligated to support a particular kind of host. I don't see any technical reason for doing so, but browsers should be allowed to decline requests to such URLs if they wish. I hope they would at least make it a configurable option rather than an outright ban. |
The URL Standard and standards that build on it do end up using and exposing the host in quite a few ways. So perhaps meaning is not strictly speaking defined, but there is a lot of behavior built on top that is outside the realm of DNS. We cannot just change the syntax without addressing that. I told the RFC authors repeatedly that syntax isn't really the problem here. It's the end-to-end integration. And even if someone solved that, there's also the problem of getting implementer interest, which is a requirement per our Working Mode. And thus far I've largely seen opposition on that front. |
@valenting : "The use case that I encountered was that my printer settings was pointing me towards a URL containing a zoneID - obviously that failed to parse, so I had to manually remove the zoneID from the URL to access it." That only works if you're lucky enough to have your printer on the default link (aka zone). As home networks get more complex that isn't guaranteed, although I agree that it's a useful fix for the common case. |
Should have added that the default zone is only a SHOULD in the underlying RFC4007, and as far as I know Linux doesn't support a default zone, although Windows does. |
The built-in URL class dropped support for zone identifiers in IPv6 address literals in Node.js 20. Calling the URL constructor with a URL containing a zone identifier causes ERR_INVALID_URL to be thrown. This is likely a result of switching to version 2.0 of the Ada URL parser in <nodejs/node#47339>.

The behavior aligns with how [IPv6 address is defined in the WHATWG URL Standard](https://url.spec.whatwg.org/#concept-ipv6), which notes that

> Support for <zone_id> is intentionally omitted.

As explained in the issue tracker:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2
whatwg/url#392

Skip this test, since this URL format is not supported. If it's necessary to support SCP-like git URLs with zone identifiers, we'll need to roll our own support.

Signed-off-by: Kevin Locke <[email protected]>
Currently, there is no way to point a browser at fe80::1%lo.
Proposed syntax: https://[fe80::1%25lo]:80
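To make the proposed syntax concrete, here is a minimal sketch that assembles such a URL from its parts. The function name `zoned_url` is hypothetical. It writes the zone delimiter as `%25` and percent-encodes the zone itself with the standard `urllib.parse.quote`; note that browsers following the URL Standard currently reject the result.

```python
from urllib.parse import quote


def zoned_url(address: str, zone: str, port: int, scheme: str = "https") -> str:
    """Build a URL with an IPv6 zone ID in the proposed '%25' form (sketch)."""
    # Encode anything in the zone that is not an unreserved character.
    encoded_zone = quote(zone, safe="")
    return f"{scheme}://[{address}%25{encoded_zone}]:{port}"


print(zoned_url("fe80::1", "lo", 80))  # https://[fe80::1%25lo]:80
```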