trio.ssl handling for Unicode (IDNA) domains is deeply broken #11
Comments
I have learned a little bit: For hostname lookup, the versions of python we care about will automatically idna-encode unicode strings passed to The For SSL hostname verification...... we certainly can't assume that stdlib ssl will do the right thing. (I guess at least it fails closed?) And it's not terribly flexible. ...maybe @Lukasa's new PEP will save us? Like everything involving SSL, this appears to be not at all trivial: https://github.com/kennethreitz/requests/blob/362da46e9a46da6e86e1907f03014384ab210151/requests/packages/urllib3/contrib/pyopenssl.py#L141 |
You'll need to make sure that the domain name actually is an IDNA, otherwise that'll blow up. The |
Are there things that |
Oh, but This is making step 2, "figure out how to make idna and x.509 play together" look extra exciting let me tell you. how does anyone do anything. |
It's not cranky at the apostrophe, it's the colon it doesn't like. |
The error message is confusing, but the ultimate issue is that the colon isn't allowed. |
...ok, Why is there no documentation on how you look up domain names. |
Yeah, I figured, but it's one of those bugs that makes you nervous about what else is going on in there :-) |
It looks like I had idna 2.1 installed, and upgrading to 2.5 replaces the really weird error message with a more sensible error message, so that's OK.

OK, how to actually look up hostnames: from looking at This bug has a lot of context. This PR landed the workarounds for idna's strictness. This particular strategy appears to be based as much on trial-and-error as anything else, but at least I can learn from their suffering. The one thing that bothers me then is that I do not at all understand the special case ruling out They also make sure to convert back to unicode before continuing, and if you pass a unicode to

Conclusion:

    def getaddrinfo(host, ...):
        if isinstance(host, str):
            try:
                host = host.encode("ascii")
            except UnicodeEncodeError:
                host = idna.encode(host, uts46=True)
        return real_getaddrinfo(host, ...)

and then write tests where ...and figure out why/whether |
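A minimal runnable sketch of the conclusion above, with the encoding step split out so it can be exercised without touching DNS. The name normalize_host is hypothetical. The fallback to the stdlib "idna" codec (IDNA 2003) when the third-party idna package is not installed is an assumption made only so the sketch runs anywhere; it is not what the thread recommends.

```python
import socket

try:
    import idna  # third-party package implementing IDNA 2008

    def _encode_non_ascii(host):
        # uts46=True applies the UTS #46 mapping (e.g. lowercasing) before encoding.
        return idna.encode(host, uts46=True)

except ImportError:

    def _encode_non_ascii(host):
        # ASSUMPTION: fallback for illustration only. The stdlib codec
        # implements IDNA 2003 and disagrees with IDNA 2008 on some names.
        return host.encode("idna")


def normalize_host(host):
    """Return host in a form that can be handed to socket.getaddrinfo."""
    if isinstance(host, str):
        try:
            # Plain ASCII names are passed through without IDNA validation.
            return host.encode("ascii")
        except UnicodeEncodeError:
            return _encode_non_ascii(host)
    # bytes and None are passed through unchanged.
    return host


def getaddrinfo(host, *args, **kwargs):
    # Hypothetical wrapper: encode first, then defer to the real lookup.
    return socket.getaddrinfo(normalize_host(host), *args, **kwargs)
```

Trying ASCII first means names that are already plain ASCII never hit the idna package's stricter validation; only genuinely non-ASCII names go through the IDNA encoder.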
It looks like the tests that make sure It also looks like the stdlib ssl module might do the right thing if we hand it a hostname that's been pre-encoded to xn--whatever, but then ascii-decoded back into a |
And it looks like @tiran's new way of doing hostname checking ultimately comes down to a call to |
Python's stdlib isn't a good example. IDNA is messy; here are some fun examples and guidelines
For IDNA 2008 / UTS46 you have to split the FQDN on |
Oh, here's an obstacle to doing our own IDNA handling while continuing to use the stdlib ssl module: the stdlib ssl module does I think this might just be an annoying bug that doesn't stop us from upgrading IDNA in other contexts, though. |
ah nice, in fact if you run a server at faß.com, and someone connects and correctly sets their SNI to

    In [3]: idna.encode("faß.com").decode("idna")
    UnicodeError: ('IDNA does not round-trip', b'xn--fa-hia', b'fass') |
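The round-trip failure above can be reproduced with the stdlib alone. This is a sketch, not the code from the thread: the helper name stdlib_idna_roundtrips is hypothetical, and the IDNA 2008 A-label is built by hand with the stdlib punycode codec as a stand-in for idna.encode.

```python
def stdlib_idna_roundtrips(a_label_hostname: bytes) -> bool:
    """Return True if the stdlib "idna" codec (IDNA 2003) can decode the name."""
    try:
        a_label_hostname.decode("idna")
    except UnicodeError:
        # The codec re-encodes each decoded label and rejects any mismatch.
        return False
    return True


# IDNA 2008 keeps the sharp s, so the A-label is "xn--" + punycode("faß").
idna2008_name = b"xn--" + "faß".encode("punycode") + b".com"

# IDNA 2003 (the stdlib codec) instead maps the sharp s to "ss" when encoding.
idna2003_name = "faß.com".encode("idna")

print(idna2008_name, idna2003_name, stdlib_idna_roundtrips(idna2008_name))
```

The two standards produce different wire names for the same Unicode hostname, which is why a name encoded under IDNA 2008 cannot be decoded by the stdlib codec.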
I was wrong, actually the stdlib This is not something we can fix; as long as What we can do now is switch to using IDNA 2008 correctly in And in the longer run, I guess either |
The above post will inevitably interest @tiran. |
This fixes part of python-triogh-11.
Specifically:
- Use IDNA 2008 in getaddrinfo
- Require pre-resolved names in getnameinfo (because otherwise it makes an implicit call to getaddrinfo using IDNA 2003; also, this makes it more consistent with the rest of trio.socket).
This fixes part, but not all, of python-triogh-11.
#206 makes it so that we now correctly handle IDNA at the |
IDNA handling in |
SSLContext.sni_callback is a better replacement for set_servername_callback (that was added in part because of my whining about trio bug python-triogh-11). Let's link to it instead. This also fixes a silly issue where we were using :meth: to link to SSLContext.set_servername_callback, but in the 3.7 docs that's documented as being an attribute. Even though it's a method. Oh well, whatever, if we don't link to it we don't have to care.
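For readers unfamiliar with the attribute mentioned above, here is a minimal sketch of installing an SNI callback on a stdlib server context (Python 3.7+). The callback name record_sni and the seen_names list are hypothetical. As far as I understand the stdlib docs, sni_callback receives the server name in its encoded A-label form, while the legacy set_servername_callback receives a decoded U-label; treat that as an assumption to check against the docs for your Python version.

```python
import ssl

seen_names = []


def record_sni(ssl_socket, server_name, initial_context):
    # server_name is a str, or None if the client sent no SNI extension.
    seen_names.append(server_name)
    # Returning None lets the TLS handshake continue normally.
    return None


server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# sni_callback is assigned as an attribute; it is not called as a method.
server_ctx.sni_callback = record_sni
```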
trio.ssl relies on the stdlib ssl module for hostname checking. Unfortunately, when it comes to non-ASCII domain names, it's totally broken. Which means that trio.ssl is also totally broken. In other ways, trio handles IDNA well (better than the stdlib). But this is hard to work around.

We don't have a lot of great options here. We could:

- Substantially rewrite trio.ssl to use PyOpenSSL instead. But this requires adding a bunch of tricky code, and also would break our API. (We currently take a stdlib ssl.SSLContext object as input, and it's impossible to read the SNI callback and other critical attributes off of an ssl.SSLContext object, which means that it's impossible to correctly convert a stdlib SSLContext into a PyOpenSSL equivalent. See bpo-32359.)
- Hope that upstream ssl gets better, or fix it ourselves. Not impossible, but maybe not a thing to hold our breath on either.
- Hope that PEP 543 comes along and saves us.
- ....?

I'm not sure what to do, but at least now we have a bug to track the problem...
Original text:
IDNA support in trio.socket
Help I have no idea what I'm doing here.
Probably also touches on TLS support, #9.