This forum post hints that `URI.decode` is slow, so I checked if that's the case... and it is!

It seems that #11124 improved `URI.decode` to take into account writing into an IO that doesn't have the UTF-8 encoding. That's great! But it also made the most usual case, UTF-8, slower, by always creating an intermediate `IO::Memory`.

The first commit improves that: if it's UTF-8 we don't need to use an intermediate IO (I think! Please correct me if I'm wrong.)
The second commit is a general optimization: `URI.decode` decodes `%xx` sequences and sometimes translates a `+` to a space, and those are the only sequences that need a transformation. That also means that if none of those chars (`%` and `+`) appear in the string, there's no need to decode anything. For the `URI.decode(string)` case it means we can directly return `string`, without allocating any memory at all. For the `URI.decode(string, io)` case it means we don't need to go byte by byte: we can directly write the string into the IO.

Now, checking whether the `%` or `+` chars appear in a string means traversing the entire string, so in theory it adds a tiny overhead when decoding is actually needed, but the search is done with `memchr`, which is extremely fast.

Here's a benchmark:
Before:
After:
Notice that the most common case of `URI.decode(string)`, where no decoding is needed, improves by about 20 times.
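
For readers who want to see the shape of the second commit's fast path, here is a minimal sketch of the idea, written as a wrapper around the public `URI.decode` rather than as the actual change inside the standard library. The helper name `decode_with_fast_path` is hypothetical, and the scan uses `String#includes?` for simplicity, which is an assumption and not necessarily how the PR performs the search.

```crystal
require "uri"

# Hypothetical helper, not part of the standard library.
# If the string contains no '%' (and no '+' when plus_to_space is set),
# nothing needs decoding, so the original string is returned as-is
# without allocating. Otherwise the work is delegated to URI.decode.
def decode_with_fast_path(string : String, *, plus_to_space : Bool = false) : String
  needs_decoding = string.includes?('%') ||
                   (plus_to_space && string.includes?('+'))
  return string unless needs_decoding

  URI.decode(string, plus_to_space: plus_to_space)
end

plain = "hello-world"
puts decode_with_fast_path(plain).same?(plain)         # true: the same String object comes back
puts decode_with_fast_path("a%20b")                    # a b
puts decode_with_fast_path("a+b", plus_to_space: true) # a b
```

The same check would apply to the `URI.decode(string, io)` variant: when nothing needs decoding, the whole string can be written to the IO in one call instead of byte by byte.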