Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip HSTS preloading for single-label domains #1025

Closed
2 tasks done
alex-oleshkevich opened this issue Jun 16, 2020 · 10 comments
Closed
2 tasks done

Skip HSTS preloading for single-label domains #1025

alex-oleshkevich opened this issue Jun 16, 2020 · 10 comments
Labels
bug Something isn't working good first issue Good for newcomers user-experience Ensuring that users have a good experience using the library

Comments

@alex-oleshkevich
Copy link
Member

alex-oleshkevich commented Jun 16, 2020

Checklist

  • The bug is reproducible against the latest release and/or master.
  • There are no similar issues or pull requests to fix it yet.

Describe the bug

I run two services in docker-compose. One of them named "app" which is Django application.
During startup, another service makes HTTP call to "app" service:
GET http://app:8000

The request then fails with ConnectionReset error because it was rewritten to HTTPS.

I dug into the issue and found that httpx._client.BaseClient.merge_url checks against hstspreload.in_hsts_preload and, if succeeds, replaces http with https.

Inside of hstspreload.in_hsts_preload I found _GTLD_INCLUDE_SUBDOMAINS constant that includes app value and the check I have mentioned before is always True.

This means, that any request to any host that is listed in _GTLD_INCLUDE_SUBDOMAINS will be rewritten to HTTPS.

To reproduce

Dockerfile, docker-compose.yml:
https://gist.github.com/alex-oleshkevich/6695124d41208c5dc692a30ea817ea82

Expected behavior

The scheme should not be rewritten.

Actual behavior

URL http://app:8000 was changed to https://app:8000 by httpx.

Debugging material

Environment

  • OS: Arch Linux
  • Python version: 3.8
  • HTTPX version: 0.13
  • Async environment: asyncio
  • HTTP proxy: no
  • Custom certificates: no

Additional context

_GTLD_INCLUDE_SUBDOMAINS = {
    b"android",
    b"app",
    b"bank",
    b"chrome",
    b"dev",
    b"foo",
    b"gle",
    b"gmail",
    b"google",
    b"hangout",
    b"insurance",
    b"meet",
    b"new",
    b"page",
    b"play",
    b"search",
    b"youtube",
}
# hstspreload.in_hsts_preload

@functools.lru_cache(maxsize=1024)
def in_hsts_preload(host: typing.AnyStr) -> bool:
    """Determines if an IDNA-encoded host is on the HSTS preload list"""

    if isinstance(host, str):
        host = host.encode("ascii")
    labels = host.lower().split(b".")

    # Fast-branch for gTLDs that are registered to preload all sub-domains.
    if labels[-1] in _GTLD_INCLUDE_SUBDOMAINS:
        return True

# rest omitted
# httpx._client.BaseClient.merge_url
    def merge_url(self, url: URLTypes) -> URL:
        """
        Merge a URL argument together with any 'base_url' on the client,
        to create the URL used for the outgoing request.
        """
        url = self.base_url.join(relative_url=url)
        if url.scheme == "http" and hstspreload.in_hsts_preload(url.host):
            port = None if url.port == 80 else url.port
            url = url.copy_with(scheme="https", port=port)
        return url
@florimondmanca florimondmanca added the bug Something isn't working label Jun 16, 2020
@florimondmanca
Copy link
Member

florimondmanca commented Jun 16, 2020

Yup, small standalone script to reproduce:

import asyncio
import httpx


async def app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200})
    await send({"type": "http.response.body", "body": b"Hello, world!"})


async def main():
    async with httpx.AsyncClient(app=app) as client:
        r = await client.get("http://app")
        assert r.request.url == "https://app"


asyncio.run(main())

The meat of it being that:

>>> import hstspreload
>>> hstspreload.in_hsts_preload("app")
True

Note that as per #896 allow_redirects=False won't disable HSTS either, so I'm not aware of a workaround for forcibly disabling HSTS. Maybe we need allow_hsts=...?

@florimondmanca florimondmanca removed the bug Something isn't working label Jun 16, 2020
@florimondmanca
Copy link
Member

florimondmanca commented Jun 16, 2020

Removing the bug label since I'm not convinced we are doing doing anything wrong here.

Maybe hstspreload could skip applying HSTS if the labels is of length 1 (eg http:/:app, http://android, etc), but I'm not sure about it at all (don't have much background about how the HSTS preload list should be used). Any case this should probably discussed over its repo.

@alex-oleshkevich
Copy link
Member Author

@florimondmanca It looks like a bug to me because that rewrite was totally unexpected. As name _GTLD_INCLUDE_SUBDOMAINS states, it is about subdomains and I think httpx should check if the host contains a subdomain and then call hstspreload,

@florimondmanca
Copy link
Member

florimondmanca commented Jun 17, 2020

Not saying this is not buggy, but I'm not sure it'd be us that should do something, or whether hstspreload is being wrong reporting single-label domains (i.e. w/o subdomains) as needing HSTS if they're one of the subdomains in the list.

Should we perhaps reach out over https://github.com/sethmlarson/hstspreload about this? …Actually I might directly cc @sethmlarson if he's got a quick insight about whether this is expected. :-)

@sethmlarson
Copy link
Contributor

sethmlarson commented Jun 17, 2020

HSTS preloading rules don't require an additional label when includeSubdomain is provided. An example is paypal.com which is on the list with includeSubdomain which means both paypal.com and www.paypal.com are covered by HSTS requirements.

You can tell that browsers (in my case Firefox) have something going on with single-label domain names triggering HSTS as well. If you type this in your URL bar:

  • http://app will be transformed to https://app and fail on DNS.
  • http://brussels will retain it's HTTP when DNS fails, Firefox automatically tries a .com, and an HTTP request is sent to brussels.com. This automatic retry doesn't happen for the above case or any HTTPS URL. (.brussels is a valid gTLD)

But you're right there's definitely a "usability" issue here when it comes to things like Docker networks. Excluding single-label domain names from HSTS preload checking almost certainly isn't a security issue as under all normal circumstances they wouldn't resolve to anything on their own. I don't think this is an issue with the hstspreload library.

@florimondmanca
Copy link
Member

Thanks for the clarifications @sethmlarson :)

So I think we can proceed with flagging this as a bug, and the fix is most likely to be to skip HSTS preloading for single-label hostnames (making sure we skip the case of an IP):

I think httpx should check if the host contains a subdomain and then call hstspreload

@florimondmanca florimondmanca added bug Something isn't working user-experience Ensuring that users have a good experience using the library labels Jun 20, 2020
@florimondmanca florimondmanca changed the title Request to http://app:8000 is rewritten to https://app:8000 Skip HSTS preloading for single-label domains Jun 20, 2020
@florimondmanca florimondmanca added the good first issue Good for newcomers label Jul 2, 2020
@frankie567
Copy link
Member

Is there someone tackling this? If not, I would be happy to work on it 🙂

I was thinking of checking if there was at least one dot in the hostname, but it sounds a bit naive to me. I'm open for better suggestions 🙂

@florimondmanca
Copy link
Member

florimondmanca commented Jul 20, 2020

I think this kind of issue should make us reconsider whether enforcing HSTS from a server-side client is a sensible thing to do at all. @tomchristie and I chatted privately about this today. From a UX perspective it can lead to ambiguous behaviors (#896) and cases when you really don't want HSTS preload to kick in (like in this case). Some options are: make HSTS preload opt-in, make it opt-out, drop it entirely (?).

But in the meantime, while this discussion is being held, I guess it's still valuable to have a fix in master for this?

The way I'd have gone about it is to check how many parts the hostname contains, yes. There might be a util in urllib to do this, otherwise let's just len(host.split(".")) > 0?

@frankie567
Copy link
Member

Ok, it sounds also to me that HSTS is quite heavy for a server-side client (from my experience, I always make sure to use the HTTPS URL if available).

Anyway, I'll make this simple fix with a dedicated unit test that'll be easy to remove if needed.

@tomchristie
Copy link
Member

Closed via #1074

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers user-experience Ensuring that users have a good experience using the library
Projects
None yet
Development

No branches or pull requests

5 participants