Improve performance of _encode_host #1176

Merged 4 commits on Oct 5, 2024
1 change: 1 addition & 0 deletions CHANGES/1176.misc.rst
@@ -0,0 +1 @@
+Improved performance of encoding hosts -- by :user:`bdraco`.
29 changes: 14 additions & 15 deletions yarl/_url.py
@@ -997,13 +997,18 @@ def _normalize_path(cls, path: str) -> str:
     def _encode_host(
         cls, host: str, human: bool = False, validate_host: bool = True
     ) -> str:
-        if "%" in host:
-            raw_ip, sep, zone = host.partition("%")
-        else:
-            raw_ip = host
-            sep = zone = ""
-
-        if raw_ip and raw_ip[-1].isdigit() or ":" in raw_ip:
+        if host and host[-1].isdigit() or ":" in host:
+            # If the host ends with a digit or contains a colon, its likely
+            # an IP address. So we check with _ip_compressed_version
+            # and fall-through if its not an IP address. This is a performance
+            # optimization to avoid parsing IP addresses as much as possible
+            # because it is orders of magnitude slower than almost any other
+            # operation this library does.
+            if "%" in host:
+                raw_ip, sep, zone = host.partition("%")
+            else:
+                raw_ip = host
+                sep = zone = ""
             # Might be an IP address, check it
             #
             # IP Addresses can look like:
@@ -1016,10 +1021,6 @@ def _encode_host(
             # Rare IP Address formats are not supported per:
             # https://datatracker.ietf.org/doc/html/rfc3986#section-7.4
             #
-            # We try to avoid parsing IP addresses as much as possible
-            # since its orders of magnitude slower than almost any other operation
-            # this library does.
-            #
             # IP parsing is slow, so its wrapped in an LRU
             try:
                 ip_compressed_version = _ip_compressed_version(raw_ip)
@@ -1029,11 +1030,9 @@ def _encode_host(
                 # These checks should not happen in the
                 # LRU to keep the cache size small
                 host, version = ip_compressed_version
-                if sep:
-                    host += "%" + zone
                 if version == 6:
-                    return f"[{host}]"
-                return host
+                    return f"[{host}%{zone}]" if sep else f"[{host}]"
+                return f"{host}%{zone}" if sep else host

         host = host.lower()
         if human:
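
For readers who want to try the idea outside of yarl, here is a minimal standalone sketch of the fast path this change moves to the top of `_encode_host`: a cheap string check decides whether the host could be an IP literal at all, and only then is the zone split off and the cached parser called. The names `ip_compressed_version` and `encode_host_sketch` are hypothetical stand-ins, the cache size is an arbitrary assumption, and the sketch omits the IDNA encoding and host validation that the real method performs after the IP branch.

```python
from functools import lru_cache
from ipaddress import ip_address


@lru_cache(maxsize=256)  # cache size is an assumption, not yarl's setting
def ip_compressed_version(raw_ip: str) -> tuple[str, int]:
    """Hypothetical stand-in for yarl's _ip_compressed_version helper."""
    ip = ip_address(raw_ip)  # raises ValueError if raw_ip is not an IP
    return ip.compressed, ip.version


def encode_host_sketch(host: str) -> str:
    # Cheap pre-check: only a host that ends with a digit or contains a
    # colon is treated as a candidate IP; everything else skips this branch.
    if host and host[-1].isdigit() or ":" in host:
        # Split off an optional zone identifier, e.g. "fe80::1%eth0".
        raw_ip, sep, zone = host.partition("%")
        try:
            compressed, version = ip_compressed_version(raw_ip)
        except ValueError:
            pass  # not an IP address after all; fall through
        else:
            # The zone is kept out of the cache key to keep the cache small.
            if version == 6:
                return f"[{compressed}%{zone}]" if sep else f"[{compressed}]"
            return f"{compressed}%{zone}" if sep else compressed
    # Simplification: the real method goes on to handle IDNA and validation.
    return host.lower()
```

With this shape, an ordinary hostname such as `Example.COM` never reaches the `%` partition or the IP parser, which is the case the change aims to make cheaper.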