From 1e4b1c4d1a907af559dcf91b697debd60d94bbd7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?L=2E=20K=C3=A4rkk=C3=A4inen?= <98187+Tronic@users.noreply.github.com> Date: Mon, 2 Sep 2019 16:50:56 +0300 Subject: [PATCH] Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2 --- docs/sanic/config.md | 90 ++++++++--- sanic/app.py | 6 + sanic/config.py | 5 +- sanic/headers.py | 146 ++++++++++++++++- sanic/request.py | 108 ++++++------- tests/test_requests.py | 356 ++++++++++++++++++++++++++++++++--------- 6 files changed, 545 insertions(+), 166 deletions(-) diff --git a/docs/sanic/config.md b/docs/sanic/config.md index bd04bbff0e..26ddb4ddcc 100644 --- a/docs/sanic/config.md +++ b/docs/sanic/config.md @@ -110,37 +110,37 @@ Out of the box there are just a few predefined values which can be overwritten w #### `REQUEST_TIMEOUT` -A request timeout measures the duration of time between the instant when a new open TCP connection is passed to the -Sanic backend server, and the instant when the whole HTTP request is received. If the time taken exceeds the -`REQUEST_TIMEOUT` value (in seconds), this is considered a Client Error so Sanic generates an `HTTP 408` response -and sends that to the client. Set this parameter's value higher if your clients routinely pass very large request payloads +A request timeout measures the duration of time between the instant when a new open TCP connection is passed to the +Sanic backend server, and the instant when the whole HTTP request is received. If the time taken exceeds the +`REQUEST_TIMEOUT` value (in seconds), this is considered a Client Error so Sanic generates an `HTTP 408` response +and sends that to the client. Set this parameter's value higher if your clients routinely pass very large request payloads or upload requests very slowly. #### `RESPONSE_TIMEOUT` -A response timeout measures the duration of time between the instant the Sanic server passes the HTTP request to the -Sanic App, and the instant a HTTP response is sent to the client. If the time taken exceeds the `RESPONSE_TIMEOUT` -value (in seconds), this is considered a Server Error so Sanic generates an `HTTP 503` response and sends that to the -client. Set this parameter's value higher if your application is likely to have long-running process that delay the +A response timeout measures the duration of time between the instant the Sanic server passes the HTTP request to the +Sanic App, and the instant a HTTP response is sent to the client. If the time taken exceeds the `RESPONSE_TIMEOUT` +value (in seconds), this is considered a Server Error so Sanic generates an `HTTP 503` response and sends that to the +client. Set this parameter's value higher if your application is likely to have long-running process that delay the generation of a response. #### `KEEP_ALIVE_TIMEOUT` ##### What is Keep Alive? And what does the Keep Alive Timeout value do? -`Keep-Alive` is a HTTP feature introduced in `HTTP 1.1`. When sending a HTTP request, the client (usually a web browser application) -can set a `Keep-Alive` header to indicate the http server (Sanic) to not close the TCP connection after it has send the response. -This allows the client to reuse the existing TCP connection to send subsequent HTTP requests, and ensures more efficient +`Keep-Alive` is a HTTP feature introduced in `HTTP 1.1`. When sending a HTTP request, the client (usually a web browser application) +can set a `Keep-Alive` header to indicate the http server (Sanic) to not close the TCP connection after it has send the response. +This allows the client to reuse the existing TCP connection to send subsequent HTTP requests, and ensures more efficient network traffic for both the client and the server. -The `KEEP_ALIVE` config variable is set to `True` in Sanic by default. If you don't need this feature in your application, -set it to `False` to cause all client connections to close immediately after a response is sent, regardless of +The `KEEP_ALIVE` config variable is set to `True` in Sanic by default. If you don't need this feature in your application, +set it to `False` to cause all client connections to close immediately after a response is sent, regardless of the `Keep-Alive` header on the request. -The amount of time the server holds the TCP connection open is decided by the server itself. -In Sanic, that value is configured using the `KEEP_ALIVE_TIMEOUT` value. By default, it is set to 5 seconds. -This is the same default setting as the Apache HTTP server and is a good balance between allowing enough time for -the client to send a new request, and not holding open too many connections at once. Do not exceed 75 seconds unless +The amount of time the server holds the TCP connection open is decided by the server itself. +In Sanic, that value is configured using the `KEEP_ALIVE_TIMEOUT` value. By default, it is set to 5 seconds. +This is the same default setting as the Apache HTTP server and is a good balance between allowing enough time for +the client to send a new request, and not holding open too many connections at once. Do not exceed 75 seconds unless you know your clients are using a browser which supports TCP connections held open for that long. For reference: @@ -154,16 +154,58 @@ Opera 11 client hard keepalive limit = 120 seconds Chrome 13+ client keepalive limit > 300+ seconds ``` -### About proxy servers and client ip +### Proxy configuration -When you use a reverse proxy server (e.g. nginx), the value of `request.ip` will contain ip of a proxy, typically `127.0.0.1`. To determine the real client ip, `X-Forwarded-For` and `X-Real-IP` HTTP headers are used. But client can fake these headers if they have not been overridden by a proxy. Sanic has a set of options to determine the level of confidence in these headers. +When you use a reverse proxy server (e.g. nginx), the value of `request.ip` will contain ip of a proxy, typically `127.0.0.1`. Sanic may be configured to use proxy headers for determining the true client IP, available as `request.remote_addr`. The full external URL is also constructed from header fields if available. -* If you have a single proxy, set `PROXIES_COUNT` to `1`. Then Sanic will use `X-Real-IP` if available or the last ip from `X-Forwarded-For`. +Without proper precautions, a malicious client may use proxy headers to spoof its own IP. To avoid such issues, Sanic does not use any proxy headers unless explicitly enabled. -* If you have multiple proxies, set `PROXIES_COUNT` equal to their number to allow Sanic to select the correct ip from `X-Forwarded-For`. +Services behind reverse proxies must configure `FORWARDED_SECRET`, `REAL_IP_HEADER` and/or `PROXIES_COUNT`. -* If you don't use a proxy, set `PROXIES_COUNT` to `0` to ignore these headers and prevent ip falsification. +#### Forwarded header -* If you don't use `X-Real-IP` (e.g. your proxy sends only `X-Forwarded-For`), set `REAL_IP_HEADER` to an empty string. +``` +Forwarded: for="1.2.3.4"; proto="https"; host="yoursite.com"; secret="Pr0xy", + for="10.0.0.1"; proto="http"; host="proxy.internal"; by="_1234proxy" +``` + +* Set `FORWARDED_SECRET` to an identifier used by the proxy of interest. + +The secret is used to securely identify a specific proxy server. Given the above header, secret `Pr0xy` would use the information on the first line and secret `_1234proxy` would use the second line. The secret must exactly match the value of `secret` or `by`. A secret in `by` must begin with an underscore and use only characters specified in [RFC 7239 section 6.3](https://tools.ietf.org/html/rfc7239#section-6.3), while `secret` has no such restrictions. + +Sanic ignores any elements without the secret key, and will not even parse the header if no secret is set. + +All other proxy headers are ignored once a trusted forwarded element is found, as it already carries complete information about the client. + +#### Traditional proxy headers + +``` +X-Real-IP: 1.2.3.4 +X-Forwarded-For: 1.2.3.4, 10.0.0.1 +X-Forwarded-Proto: https +X-Forwarded-Host: yoursite.com +``` + +* Set `REAL_IP_HEADER` to `x-real-ip`, `true-client-ip`, `cf-connecting-ip` or other name of such header. +* Set `PROXIES_COUNT` to the number of entries expected in `x-forwarded-for` (name configurable via `FORWARDED_FOR_HEADER`). + +If client IP is found by one of these methods, Sanic uses the following headers for URL parts: + +* `x-forwarded-proto`, `x-forwarded-host`, `x-forwarded-port`, `x-forwarded-path` and if necessary, `x-scheme`. + +#### Proxy config if using ... + +* a proxy that supports `forwarded`: set `FORWARDED_SECRET` to the value that the proxy inserts in the header + * Apache Traffic Server: `CONFIG proxy.config.http.insert_forwarded STRING for|proto|host|by=_secret` + * NGHTTPX: `nghttpx --add-forwarded=for,proto,host,by --forwarded-for=ip --forwarded-by=_secret` + * NGINX: after [the official instructions](https://www.nginx.com/resources/wiki/start/topics/examples/forwarded/), add anywhere in your config: + + proxy_set_header Forwarded "$proxy_add_forwarded;by=\"_$server_name\";proto=$scheme;host=\"$http_host\";path=\"$request_uri\";secret=_secret"; + +* a custom header with client IP: set `REAL_IP_HEADER` to the name of that header +* `x-forwarded-for`: set `PROXIES_COUNT` to `1` for a single proxy, or a greater number to allow Sanic to select the correct IP +* no proxies: no configuration required! + +#### Changes in Sanic 19.9 -The real ip will be available in `request.remote_addr`. If HTTP headers are unavailable or untrusted, `request.remote_addr` will be an empty string; in this case use `request.ip` instead. +Earlier Sanic versions had unsafe default settings. From 19.9 onwards proxy settings must be set manually, and support for negative PROXIES_COUNT has been removed. diff --git a/sanic/app.py b/sanic/app.py index 343ef0cf7e..c2e44a0691 100644 --- a/sanic/app.py +++ b/sanic/app.py @@ -1309,6 +1309,12 @@ def _helper( "stop_event will be removed from future versions.", DeprecationWarning, ) + if self.config.PROXIES_COUNT and self.config.PROXIES_COUNT < 0: + raise ValueError( + "PROXIES_COUNT cannot be negative. " + "https://sanic.readthedocs.io/en/latest/sanic/config.html" + "#proxy-configuration" + ) self.error_handler.debug = debug self.debug = debug diff --git a/sanic/config.py b/sanic/config.py index 42ea762ed0..539ea45c64 100644 --- a/sanic/config.py +++ b/sanic/config.py @@ -26,9 +26,10 @@ "WEBSOCKET_WRITE_LIMIT": 2 ** 16, "GRACEFUL_SHUTDOWN_TIMEOUT": 15.0, # 15 sec "ACCESS_LOG": True, - "PROXIES_COUNT": -1, + "FORWARDED_SECRET": None, + "REAL_IP_HEADER": None, + "PROXIES_COUNT": None, "FORWARDED_FOR_HEADER": "X-Forwarded-For", - "REAL_IP_HEADER": "X-Real-IP", } diff --git a/sanic/headers.py b/sanic/headers.py index 8062b57cb4..6c9fa2215f 100644 --- a/sanic/headers.py +++ b/sanic/headers.py @@ -1,12 +1,20 @@ import re -import typing +from typing import Dict, Iterable, Optional, Tuple +from urllib.parse import unquote -Options = typing.Dict[str, str] # key=value fields in various headers -token, quoted = r"([\w!#$%&'*+\-.^_`|~]+)", r'"([^"]*)"' -parameter = re.compile(fr";\s*{token}=(?:{token}|{quoted})", re.ASCII) -firefox_quote_escape = re.compile(r'\\"(?!; |\s*$)') +Options = Dict[str, str] # key=value fields in various headers +OptionsIterable = Iterable[Tuple[str, str]] # May contain duplicate keys + +_token, _quoted = r"([\w!#$%&'*+\-.^_`|~]+)", r'"([^"]*)"' +_param = re.compile(fr";\s*{_token}=(?:{_token}|{_quoted})", re.ASCII) +_firefox_quote_escape = re.compile(r'\\"(?!; |\s*$)') +_ipv6 = "(?:[0-9A-Fa-f]{0,4}:){2,7}[0-9A-Fa-f]{0,4}" +_ipv6_re = re.compile(_ipv6) +_host_re = re.compile( + r"((?:\[" + _ipv6 + r"\])|[a-zA-Z0-9.\-]{1,253})(?::(\d{1,5}))?" +) # RFC's quoted-pair escapes are mostly ignored by browsers. Chrome, Firefox and # curl all have different escaping, that we try to handle as well as possible, @@ -15,7 +23,7 @@ # For more information, consult ../tests/test_requests.py -def parse_content_header(value: str) -> typing.Tuple[str, Options]: +def parse_content_header(value: str) -> Tuple[str, Options]: """Parse content-type and content-disposition header values. E.g. 'form-data; name=upload; filename=\"file.txt\"' to @@ -24,14 +32,136 @@ def parse_content_header(value: str) -> typing.Tuple[str, Options]: Mostly identical to cgi.parse_header and werkzeug.parse_options_header but runs faster and handles special characters better. Unescapes quotes. """ - value = firefox_quote_escape.sub("%22", value) + value = _firefox_quote_escape.sub("%22", value) pos = value.find(";") if pos == -1: options = {} else: options = { m.group(1).lower(): m.group(2) or m.group(3).replace("%22", '"') - for m in parameter.finditer(value[pos:]) + for m in _param.finditer(value[pos:]) } value = value[:pos] return value.strip().lower(), options + + +# https://tools.ietf.org/html/rfc7230#section-3.2.6 and +# https://tools.ietf.org/html/rfc7239#section-4 +# This regex is for *reversed* strings because that works much faster for +# right-to-left matching than the other way around. Be wary that all things are +# a bit backwards! _rparam matches forwarded pairs alike ";key=value" +_rparam = re.compile(f"(?:{_token}|{_quoted})={_token}\\s*($|[;,])", re.ASCII) + + +def parse_forwarded(headers, config) -> Optional[Options]: + """Parse RFC 7239 Forwarded headers. + The value of `by` or `secret` must match `config.FORWARDED_SECRET` + :return: dict with keys and values, or None if nothing matched + """ + header = headers.getall("forwarded", None) + secret = config.FORWARDED_SECRET + if header is None or not secret: + return None + header = ",".join(header) # Join multiple header lines + if secret not in header: + return None + # Loop over = elements from right to left + sep = pos = None + options = [] + found = False + for m in _rparam.finditer(header[::-1]): + # Start of new element? (on parser skips and non-semicolon right sep) + if m.start() != pos or sep != ";": + # Was the previous element (from right) what we wanted? + if found: + break + # Clear values and parse as new element + del options[:] + pos = m.end() + val_token, val_quoted, key, sep = m.groups() + key = key.lower()[::-1] + val = (val_token or val_quoted.replace('"\\', '"'))[::-1] + options.append((key, val)) + if key in ("secret", "by") and val == secret: + found = True + # Check if we would return on next round, to avoid useless parse + if found and sep != ";": + break + # If secret was found, return the matching options in left-to-right order + return fwd_normalize(reversed(options)) if found else None + + +def parse_xforwarded(headers, config) -> Optional[Options]: + """Parse traditional proxy headers.""" + real_ip_header = config.REAL_IP_HEADER + proxies_count = config.PROXIES_COUNT + addr = real_ip_header and headers.get(real_ip_header) + if not addr and proxies_count: + assert proxies_count > 0 + try: + # Combine, split and filter multiple headers' entries + forwarded_for = headers.getall(config.FORWARDED_FOR_HEADER) + proxies = (p.strip() for h in forwarded_for for p in h.split(",")) + proxies = [p for p in proxies if p] + addr = proxies[-proxies_count] + except (KeyError, IndexError): + pass + # No processing of other headers if no address is found + if not addr: + return None + + def options(): + yield "for", addr + for key, header in ( + ("proto", "x-scheme"), + ("proto", "x-forwarded-proto"), # Overrides X-Scheme if present + ("host", "x-forwarded-host"), + ("port", "x-forwarded-port"), + ("path", "x-forwarded-path"), + ): + yield key, headers.get(header) + + return fwd_normalize(options()) + + +def fwd_normalize(fwd: OptionsIterable) -> Options: + """Normalize and convert values extracted from forwarded headers.""" + ret = {} + for key, val in fwd: + if val is not None: + try: + if key in ("by", "for"): + ret[key] = fwd_normalize_address(val) + elif key in ("host", "proto"): + ret[key] = val.lower() + elif key == "port": + ret[key] = int(val) + elif key == "path": + ret[key] = unquote(val) + else: + ret[key] = val + except ValueError: + pass + return ret + + +def fwd_normalize_address(addr: str) -> str: + """Normalize address fields of proxy headers.""" + if addr == "unknown": + raise ValueError() # omit unknown value identifiers + if addr.startswith("_"): + return addr # do not lower-case obfuscated strings + if _ipv6_re.fullmatch(addr): + addr = f"[{addr}]" # bracket IPv6 + return addr.lower() + + +def parse_host(host: str) -> Tuple[Optional[str], Optional[int]]: + """Split host:port into hostname and port. + :return: None in place of missing elements + """ + m = _host_re.fullmatch(host) + if not m: + return None, None + host, port = m.groups() + return host.lower(), port and int(port) diff --git a/sanic/request.py b/sanic/request.py index 9663063bcd..4f5d0bdce0 100644 --- a/sanic/request.py +++ b/sanic/request.py @@ -11,7 +11,12 @@ from httptools import parse_url from sanic.exceptions import InvalidUsage -from sanic.headers import parse_content_header +from sanic.headers import ( + parse_content_header, + parse_forwarded, + parse_host, + parse_xforwarded, +) from sanic.log import error_logger, logger @@ -87,6 +92,7 @@ class Request(dict): "parsed_files", "parsed_form", "parsed_json", + "parsed_forwarded", "raw_url", "stream", "transport", @@ -107,6 +113,7 @@ def __init__(self, url_bytes, headers, version, method, transport, app): # Init but do not inhale self.body_init() + self.parsed_forwarded = None self.parsed_json = None self.parsed_form = None self.parsed_files = None @@ -371,72 +378,58 @@ def _get_address(self): @property def server_name(self): """ - Attempt to get the server's hostname in this order: - `config.SERVER_NAME`, `x-forwarded-host` header, :func:`Request.host` + Attempt to get the server's external hostname in this order: + `config.SERVER_NAME`, proxied or direct Host headers + :func:`Request.host` :return: the server name without port number :rtype: str """ - return ( - self.app.config.get("SERVER_NAME") - or self.headers.get("x-forwarded-host") - or self.host.split(":")[0] - ) + server_name = self.app.config.get("SERVER_NAME") + if server_name: + host = server_name.split("//", 1)[-1].split("/", 1)[0] + return parse_host(host)[0] + return parse_host(self.host)[0] + + @property + def forwarded(self): + if self.parsed_forwarded is None: + self.parsed_forwarded = ( + parse_forwarded(self.headers, self.app.config) + or parse_xforwarded(self.headers, self.app.config) + or {} + ) + return self.parsed_forwarded @property def server_port(self): """ - Attempt to get the server's port in this order: - `x-forwarded-port` header, :func:`Request.host`, actual port used by - the transport layer socket. + Attempt to get the server's external port number in this order: + `config.SERVER_NAME`, proxied or direct Host headers + :func:`Request.host`, + actual port used by the transport layer socket. :return: server port :rtype: int """ - forwarded_port = self.headers.get("x-forwarded-port") or ( - self.host.split(":")[1] if ":" in self.host else None + if self.forwarded: + return self.forwarded.get("port") or ( + 80 if self.scheme in ("http", "ws") else 443 + ) + return ( + parse_host(self.host)[1] + or self.transport.get_extra_info("sockname")[1] ) - if forwarded_port: - return int(forwarded_port) - else: - port = self.transport.get_extra_info("sockname")[1] - return port @property def remote_addr(self): - """Attempt to return the original client ip based on X-Forwarded-For - or X-Real-IP. If HTTP headers are unavailable or untrusted, returns - an empty string. + """Attempt to return the original client ip based on `forwarded`, + `x-forwarded-for` or `x-real-ip`. If HTTP headers are unavailable or + untrusted, returns an empty string. :return: original client ip. """ if not hasattr(self, "_remote_addr"): - if self.app.config.PROXIES_COUNT == 0: - self._remote_addr = "" - elif self.app.config.REAL_IP_HEADER and self.headers.get( - self.app.config.REAL_IP_HEADER - ): - self._remote_addr = self.headers[ - self.app.config.REAL_IP_HEADER - ] - elif self.app.config.FORWARDED_FOR_HEADER: - forwarded_for = self.headers.get( - self.app.config.FORWARDED_FOR_HEADER, "" - ).split(",") - remote_addrs = [ - addr - for addr in [addr.strip() for addr in forwarded_for] - if addr - ] - if self.app.config.PROXIES_COUNT == -1: - self._remote_addr = remote_addrs[0] - elif len(remote_addrs) >= self.app.config.PROXIES_COUNT: - self._remote_addr = remote_addrs[ - -self.app.config.PROXIES_COUNT - ] - else: - self._remote_addr = "" - else: - self._remote_addr = "" + self._remote_addr = self.forwarded.get("for", "") return self._remote_addr @property @@ -444,14 +437,13 @@ def scheme(self): """ Attempt to get the request scheme. Seeking the value in this order: - `x-forwarded-proto` header, `x-scheme` header, the sanic app itself. + `forwarded` header, `x-forwarded-proto` header, + `x-scheme` header, the sanic app itself. :return: http|https|ws|wss or arbitrary value given by the headers. :rtype: str """ - forwarded_proto = self.headers.get( - "x-forwarded-proto" - ) or self.headers.get("x-scheme") + forwarded_proto = self.forwarded.get("proto") if forwarded_proto: return forwarded_proto @@ -471,12 +463,10 @@ def scheme(self): @property def host(self): """ - :return: the Host specified in the header, may contains port number. + :return: proxied or direct Host header. Hostname and port number may be + separated by sanic.headers.parse_host(request.host). """ - # it appears that httptools doesn't return the host - # so pull it from the headers - - return self.headers.get("Host", "") + return self.forwarded.get("host", self.headers.get("Host", "")) @property def content_type(self): @@ -514,6 +504,10 @@ def url_for(self, view_name, **kwargs): :return: an absolute url to the given view :rtype: str """ + # Full URL SERVER_NAME can only be handled in app.url_for + if "//" in self.app.config.SERVER_NAME: + return self.app.url_for(view_name, _external=True, **kwargs) + scheme = self.scheme host = self.server_name port = self.server_port diff --git a/tests/test_requests.py b/tests/test_requests.py index 4bda1b0def..522d806e92 100644 --- a/tests/test_requests.py +++ b/tests/test_requests.py @@ -401,8 +401,232 @@ async def handler(request): assert response.text == "application/json" +def test_standard_forwarded(app): + @app.route("/") + async def handler(request): + return json(request.forwarded) + + # Without configured FORWARDED_SECRET, x-headers should be respected + app.config.PROXIES_COUNT = 1 + app.config.REAL_IP_HEADER = "x-real-ip" + headers = { + "Forwarded": ( + 'for=1.1.1.1, for=injected;host="' + ', for="[::2]";proto=https;host=me.tld;path="/app/";secret=mySecret' + ',for=broken;;secret=b0rked' + ', for=127.0.0.3;scheme=http;port=1234' + ), + "X-Real-IP": "127.0.0.2", + "X-Forwarded-For": "127.0.1.1", + "X-Scheme": "ws", + } + request, response = app.test_client.get("/", headers=headers) + assert response.json == { "for": "127.0.0.2", "proto": "ws" } + assert request.remote_addr == "127.0.0.2" + assert request.scheme == "ws" + assert request.server_port == 80 + + app.config.FORWARDED_SECRET = "mySecret" + request, response = app.test_client.get("/", headers=headers) + assert response.json == { + "for": "[::2]", + "proto": "https", + "host": "me.tld", + "path": "/app/", + "secret": "mySecret" + } + assert request.remote_addr == "[::2]" + assert request.server_name == "me.tld" + assert request.scheme == "https" + assert request.server_port == 443 + + # Empty Forwarded header -> use X-headers + headers["Forwarded"] = "" + request, response = app.test_client.get("/", headers=headers) + assert response.json == { "for": "127.0.0.2", "proto": "ws" } + + # Header present but not matching anything + request, response = app.test_client.get("/", headers={"Forwarded": "."}) + assert response.json == {} + + # Forwarded header present but no matching secret -> use X-headers + headers = { + "Forwarded": 'for=1.1.1.1;secret=x, for=127.0.0.1', + "X-Real-IP": "127.0.0.2" + } + request, response = app.test_client.get("/", headers=headers) + assert response.json == {"for": "127.0.0.2"} + assert request.remote_addr == "127.0.0.2" + + # Different formatting and hitting both ends of the header + headers = {"Forwarded": 'Secret="mySecret";For=127.0.0.4;Port=1234'} + request, response = app.test_client.get("/", headers=headers) + assert response.json == { + "for": "127.0.0.4", + "port": 1234, + "secret": "mySecret" + } + + # Test escapes (modify this if you see anyone implementing quoted-pairs) + headers = {"Forwarded": 'for=test;quoted="\\,x=x;y=\\";secret=mySecret'} + request, response = app.test_client.get("/", headers=headers) + assert response.json == { + "for": "test", + "quoted": '\\,x=x;y=\\', + "secret": "mySecret" + } + + # Secret insulated by malformed field #1 + headers = {"Forwarded": 'for=test;secret=mySecret;b0rked;proto=wss;'} + request, response = app.test_client.get("/", headers=headers) + assert response.json == {"for": "test", "secret": "mySecret"} + + # Secret insulated by malformed field #2 + headers = {"Forwarded": 'for=test;b0rked;secret=mySecret;proto=wss'} + request, response = app.test_client.get("/", headers=headers) + assert response.json == {"proto": "wss", "secret": "mySecret"} + + # Unexpected termination should not lose existing acceptable values + headers = {"Forwarded": 'b0rked;secret=mySecret;proto=wss'} + request, response = app.test_client.get("/", headers=headers) + assert response.json == {"proto": "wss", "secret": "mySecret"} + + # Field normalization + headers = { + "Forwarded": 'PROTO=WSS;BY="CAFE::8000";FOR=unknown;PORT=X;HOST="A:2";' + 'PATH="/With%20Spaces%22Quoted%22/sanicApp?key=val";SECRET=mySecret' + } + request, response = app.test_client.get("/", headers=headers) + assert response.json == { + "proto": "wss", + "by": "[cafe::8000]", + "host": "a:2", + "path": '/With Spaces"Quoted"/sanicApp?key=val', + "secret": "mySecret", + } + + # Using "by" field as secret + app.config.FORWARDED_SECRET = "_proxySecret" + headers = {"Forwarded": 'for=1.2.3.4; by=_proxySecret'} + request, response = app.test_client.get("/", headers=headers) + assert response.json == {"for": "1.2.3.4", "by": "_proxySecret"} + + +@pytest.mark.asyncio +async def test_standard_forwarded_asgi(app): + @app.route("/") + async def handler(request): + return json(request.forwarded) + + # Without configured FORWARDED_SECRET, x-headers should be respected + app.config.PROXIES_COUNT = 1 + app.config.REAL_IP_HEADER = "x-real-ip" + headers = { + "Forwarded": ( + 'for=1.1.1.1, for=injected;host="' + ', for="[::2]";proto=https;host=me.tld;path="/app/";secret=mySecret' + ',for=broken;;secret=b0rked' + ', for=127.0.0.3;scheme=http;port=1234' + ), + "X-Real-IP": "127.0.0.2", + "X-Forwarded-For": "127.0.1.1", + "X-Scheme": "ws", + } + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == { "for": "127.0.0.2", "proto": "ws" } + assert request.remote_addr == "127.0.0.2" + assert request.scheme == "ws" + assert request.server_port == 80 + + app.config.FORWARDED_SECRET = "mySecret" + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == { + "for": "[::2]", + "proto": "https", + "host": "me.tld", + "path": "/app/", + "secret": "mySecret" + } + assert request.remote_addr == "[::2]" + assert request.server_name == "me.tld" + assert request.scheme == "https" + assert request.server_port == 443 + + # Empty Forwarded header -> use X-headers + headers["Forwarded"] = "" + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == { "for": "127.0.0.2", "proto": "ws" } + + # Header present but not matching anything + request, response = await app.asgi_client.get("/", headers={"Forwarded": "."}) + assert response.json() == {} + + # Forwarded header present but no matching secret -> use X-headers + headers = { + "Forwarded": 'for=1.1.1.1;secret=x, for=127.0.0.1', + "X-Real-IP": "127.0.0.2" + } + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == {"for": "127.0.0.2"} + assert request.remote_addr == "127.0.0.2" + + # Different formatting and hitting both ends of the header + headers = {"Forwarded": 'Secret="mySecret";For=127.0.0.4;Port=1234'} + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == { + "for": "127.0.0.4", + "port": 1234, + "secret": "mySecret" + } + + # Test escapes (modify this if you see anyone implementing quoted-pairs) + headers = {"Forwarded": 'for=test;quoted="\\,x=x;y=\\";secret=mySecret'} + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == { + "for": "test", + "quoted": '\\,x=x;y=\\', + "secret": "mySecret" + } + + # Secret insulated by malformed field #1 + headers = {"Forwarded": 'for=test;secret=mySecret;b0rked;proto=wss;'} + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == {"for": "test", "secret": "mySecret"} + + # Secret insulated by malformed field #2 + headers = {"Forwarded": 'for=test;b0rked;secret=mySecret;proto=wss'} + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == {"proto": "wss", "secret": "mySecret"} + + # Unexpected termination should not lose existing acceptable values + headers = {"Forwarded": 'b0rked;secret=mySecret;proto=wss'} + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == {"proto": "wss", "secret": "mySecret"} + + # Field normalization + headers = { + "Forwarded": 'PROTO=WSS;BY="CAFE::8000";FOR=unknown;PORT=X;HOST="A:2";' + 'PATH="/With%20Spaces%22Quoted%22/sanicApp?key=val";SECRET=mySecret' + } + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == { + "proto": "wss", + "by": "[cafe::8000]", + "host": "a:2", + "path": '/With Spaces"Quoted"/sanicApp?key=val', + "secret": "mySecret", + } + + # Using "by" field as secret + app.config.FORWARDED_SECRET = "_proxySecret" + headers = {"Forwarded": 'for=1.2.3.4; by=_proxySecret'} + request, response = await app.asgi_client.get("/", headers=headers) + assert response.json() == {"for": "1.2.3.4", "by": "_proxySecret"} + + def test_remote_addr_with_two_proxies(app): app.config.PROXIES_COUNT = 2 + app.config.REAL_IP_HEADER = "x-real-ip" @app.route("/") async def handler(request): @@ -443,6 +667,7 @@ async def handler(request): @pytest.mark.asyncio async def test_remote_addr_with_two_proxies_asgi(app): app.config.PROXIES_COUNT = 2 + app.config.REAL_IP_HEADER = "x-real-ip" @app.route("/") async def handler(request): @@ -480,57 +705,6 @@ async def handler(request): assert response.text == "127.0.0.1" -def test_remote_addr_with_infinite_number_of_proxies(app): - app.config.PROXIES_COUNT = -1 - - @app.route("/") - async def handler(request): - return text(request.remote_addr) - - headers = {"X-Real-IP": "127.0.0.2", "X-Forwarded-For": "127.0.1.1"} - request, response = app.test_client.get("/", headers=headers) - assert request.remote_addr == "127.0.0.2" - assert response.text == "127.0.0.2" - - headers = {"X-Forwarded-For": "127.0.1.1"} - request, response = app.test_client.get("/", headers=headers) - assert request.remote_addr == "127.0.1.1" - assert response.text == "127.0.1.1" - - headers = { - "X-Forwarded-For": "127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1" - } - request, response = app.test_client.get("/", headers=headers) - assert request.remote_addr == "127.0.0.5" - assert response.text == "127.0.0.5" - - -@pytest.mark.asyncio -async def test_remote_addr_with_infinite_number_of_proxies_asgi(app): - app.config.PROXIES_COUNT = -1 - - @app.route("/") - async def handler(request): - return text(request.remote_addr) - - headers = {"X-Real-IP": "127.0.0.2", "X-Forwarded-For": "127.0.1.1"} - request, response = await app.asgi_client.get("/", headers=headers) - assert request.remote_addr == "127.0.0.2" - assert response.text == "127.0.0.2" - - headers = {"X-Forwarded-For": "127.0.1.1"} - request, response = await app.asgi_client.get("/", headers=headers) - assert request.remote_addr == "127.0.1.1" - assert response.text == "127.0.1.1" - - headers = { - "X-Forwarded-For": "127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1" - } - request, response = await app.asgi_client.get("/", headers=headers) - assert request.remote_addr == "127.0.0.5" - assert response.text == "127.0.0.5" - - def test_remote_addr_without_proxy(app): app.config.PROXIES_COUNT = 0 @@ -634,15 +808,16 @@ def test_forwarded_scheme(app): async def handler(request): return text(request.remote_addr) + app.config.PROXIES_COUNT = 1 request, response = app.test_client.get("/") assert request.scheme == "http" request, response = app.test_client.get( - "/", headers={"X-Forwarded-Proto": "https"} + "/", headers={"X-Forwarded-For": "127.1.2.3", "X-Forwarded-Proto": "https"} ) assert request.scheme == "https" - request, response = app.test_client.get("/", headers={"X-Scheme": "https"}) + request, response = app.test_client.get("/", headers={"X-Forwarded-For": "127.1.2.3", "X-Scheme": "https"}) assert request.scheme == "https" @@ -1688,9 +1863,19 @@ def handler(request): return text("OK") request, response = app.test_client.get( - "/", headers={"Host": "my_server:5555"} + "/", headers={"Host": "my-server:5555"} ) - assert request.server_name == "my_server" + assert request.server_name == "my-server" + + request, response = app.test_client.get( + "/", headers={"Host": "[2a00:1450:400f:80c::200e]:5555"} + ) + assert request.server_name == "[2a00:1450:400f:80c::200e]" + + request, response = app.test_client.get( + "/", headers={"Host": "mal_formed"} + ) + assert request.server_name == None # For now (later maybe 127.0.0.1) def test_request_server_name_forwarded(app): @@ -1698,11 +1883,12 @@ def test_request_server_name_forwarded(app): def handler(request): return text("OK") + app.config.PROXIES_COUNT = 1 request, response = app.test_client.get( "/", - headers={"Host": "my_server:5555", "X-Forwarded-Host": "your_server"}, + headers={"Host": "my-server:5555", "X-Forwarded-For": "127.1.2.3", "X-Forwarded-Host": "your-server"}, ) - assert request.server_name == "your_server" + assert request.server_name == "your-server" def test_request_server_port(app): @@ -1710,7 +1896,7 @@ def test_request_server_port(app): def handler(request): return text("OK") - request, response = app.test_client.get("/", headers={"Host": "my_server"}) + request, response = app.test_client.get("/", headers={"Host": "my-server"}) assert request.server_port == app.test_client.port @@ -1720,18 +1906,29 @@ def handler(request): return text("OK") request, response = app.test_client.get( - "/", headers={"Host": "my_server:5555"} + "/", headers={"Host": "my-server:5555"} ) assert request.server_port == 5555 + request, response = app.test_client.get( + "/", headers={"Host": "[2a00:1450:400f:80c::200e]:5555"} + ) + assert request.server_port == 5555 + + request, response = app.test_client.get( + "/", headers={"Host": "mal_formed:5555"} + ) + assert request.server_port == app.test_client.port + def test_request_server_port_forwarded(app): @app.get("/") def handler(request): return text("OK") + app.config.PROXIES_COUNT = 1 request, response = app.test_client.get( - "/", headers={"Host": "my_server:5555", "X-Forwarded-Port": "4444"} + "/", headers={"Host": "my-server:5555", "X-Forwarded-For": "127.1.2.3", "X-Forwarded-Port": "4444"} ) assert request.server_port == 4444 @@ -1746,6 +1943,23 @@ async def post(request): assert request.form == {} +def test_server_name_and_url_for(app): + @app.get("/foo") + def handler(request): + return text("ok") + + app.config.SERVER_NAME = "my-server" + assert app.url_for("handler", _external=True) == "http://my-server/foo" + request, response = app.test_client.get("/foo") + assert request.url_for("handler") == f"http://my-server:{app.test_client.port}/foo" + + app.config.SERVER_NAME = "https://my-server/path" + request, response = app.test_client.get("/foo") + url = f"https://my-server/path/foo" + assert app.url_for("handler", _external=True) == url + assert request.url_for("handler") == url + + def test_url_for_with_forwarded_request(app): @app.get("/") def handler(request): @@ -1755,32 +1969,24 @@ def handler(request): def view_name(request): return text("OK") + app.config.SERVER_NAME = "my-server" + app.config.PROXIES_COUNT = 1 request, response = app.test_client.get( - "/", headers={"X-Forwarded-Proto": "https"} - ) - assert app.url_for("view_name") == "/another_view" - assert app.url_for("view_name", _external=True) == "http:///another_view" - assert request.url_for( - "view_name" - ) == "https://127.0.0.1:{}/another_view".format(app.test_client.port) - - app.config.SERVER_NAME = "my_server" - request, response = app.test_client.get( - "/", headers={"X-Forwarded-Proto": "https", "X-Forwarded-Port": "6789"} + "/", headers={"X-Forwarded-For": "127.1.2.3", "X-Forwarded-Proto": "https", "X-Forwarded-Port": "6789"} ) assert app.url_for("view_name") == "/another_view" assert ( app.url_for("view_name", _external=True) - == "http://my_server/another_view" + == "http://my-server/another_view" ) assert ( - request.url_for("view_name") == "https://my_server:6789/another_view" + request.url_for("view_name") == "https://my-server:6789/another_view" ) request, response = app.test_client.get( - "/", headers={"X-Forwarded-Proto": "https", "X-Forwarded-Port": "443"} + "/", headers={"X-Forwarded-For": "127.1.2.3", "X-Forwarded-Proto": "https", "X-Forwarded-Port": "443"} ) - assert request.url_for("view_name") == "https://my_server/another_view" + assert request.url_for("view_name") == "https://my-server/another_view" @pytest.mark.asyncio