http: Add ability to merge slashes #7621
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,8 @@ | |
#include "common/chromium_url/url_canon_stdstring.h" | ||
#include "common/common/logger.h" | ||
|
||
#include "absl/strings/str_join.h" | ||
#include "absl/strings/str_split.h" | ||
#include "absl/strings/string_view.h" | ||
#include "absl/types/optional.h" | ||
|
||
|
@@ -52,5 +54,20 @@ bool PathUtil::canonicalPath(HeaderEntry& path_header) { | |
return true; | ||
} | ||
|
||
void PathUtil::mergeSlashes(HeaderEntry& path_header) { | ||
I'm a bit twitchy about fully qualified URLs here. Granted, Envoy currently handles fully qualified URLs in the H1 codec by splitting them up, so if the incoming request is GET http://foo.com//bar we'll have :authority foo.com and :path /bar, so this is fine today (we won't end up with http:/foo.com/bar). It might be worth an assert that the path is relative, just in case someone uses Lua or user-defined headers to take advantage of Path == URL in the first line and ends up with their absolute URL messed up. |
||
const auto original_path = path_header.value().getStringView(); | ||
// Only operate on path component in URL. | ||
const absl::string_view::size_type query_start = original_path.find('?'); | ||
const absl::string_view path = original_path.substr(0, query_start); | ||
const absl::string_view query = absl::ClippedSubstr(original_path, query_start); | ||
if (path.find("//") == absl::string_view::npos) { | ||
return; | ||
} | ||
const absl::string_view prefix = absl::StartsWith(path, "/") ? "/" : absl::string_view(); | ||
const absl::string_view suffix = absl::EndsWith(path, "/") ? "/" : absl::string_view(); | ||
path_header.value(absl::StrCat( | ||
I assume this is O(n) if we get something wacky like a path which is 16k worth of /, yeah?
I don't see anything explicit in the absl documentation, but it's O(n) based on the code. |
||
prefix, absl::StrJoin(absl::StrSplit(path, '/', absl::SkipEmpty()), "/"), suffix, query)); | ||
} | ||
|
||
} // namespace Http | ||
} // namespace Envoy |
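The merging logic above can also be sketched without Abseil or Envoy types. This is only an illustration under stated assumptions: the free function `mergeSlashes` below is a hypothetical standalone helper that takes and returns plain strings, and it uses a single-pass loop rather than the split/join approach in the diff. It leaves everything from the first `?` onward untouched and runs in one O(n) pass, which also speaks to the question about very long runs of `/`.

```cpp
#include <string>
#include <string_view>

// Hypothetical standalone helper (not the Envoy API): collapses runs of '/'
// in the path component and leaves the query string untouched.
std::string mergeSlashes(std::string_view original_path) {
  // Split at the first '?'; everything from there on is the query.
  const std::string_view::size_type query_start = original_path.find('?');
  const std::string_view path = original_path.substr(0, query_start);
  const std::string_view query = query_start == std::string_view::npos
                                     ? std::string_view()
                                     : original_path.substr(query_start);

  std::string result;
  result.reserve(original_path.size());
  for (const char c : path) {
    // Skip a slash that directly follows a slash already emitted.
    if (c == '/' && !result.empty() && result.back() == '/') {
      continue;
    }
    result.push_back(c);
  }
  // The query goes last, after any trailing slash of the path.
  result.append(query);
  return result;
}
```

With this sketch, `mergeSlashes("/a//b/?x=//y")` keeps the trailing slash in front of the query and leaves the slashes inside the query alone. Note that a path consisting only of slashes collapses to a single `/` here.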
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -85,5 +85,27 @@ TEST_F(PathUtilityTest, NormalizeCasePath) { | |
// "/../c\r\n\" '\n' '\r' should be excluded by http parser | ||
// "/a/\0c", '\0' should be excluded by http parser | ||
|
||
// Consecutive slashes in the path get merged; the query is left untouched. | ||
TEST_F(PathUtilityTest, MergeSlashes) { | ||
auto mergeSlashes = [this](const std::string& path_value) { | ||
auto& path_header = pathHeaderEntry(path_value); | ||
PathUtil::mergeSlashes(path_header); | ||
auto sanitized_path_value = path_header.value().getStringView(); | ||
return std::string(sanitized_path_value); | ||
}; | ||
EXPECT_EQ("", mergeSlashes("")); // empty | ||
EXPECT_EQ("a/b/c", mergeSlashes("a//b/c")); // relative | ||
EXPECT_EQ("/a/b/c/", mergeSlashes("/a//b/c/")); // ends with slash | ||
EXPECT_EQ("a/b/c/", mergeSlashes("a//b/c/")); // relative ends with slash | ||
EXPECT_EQ("/a", mergeSlashes("/a")); // no-op | ||
EXPECT_EQ("/a/b/c", mergeSlashes("//a/b/c")); // double / start | ||
|
||
EXPECT_EQ("/a/b/c", mergeSlashes("/a//b/c")); // double / in the middle | ||
EXPECT_EQ("/a/b/c/", mergeSlashes("/a/b/c//")); // double / end | ||
EXPECT_EQ("/a/b/c", mergeSlashes("/a///b/c")); // triple / in the middle | ||
EXPECT_EQ("/a/b/c", mergeSlashes("/a////b/c")); // quadruple / in the middle | ||
EXPECT_EQ("/a/b?a=///c", mergeSlashes("/a//b?a=///c")); // slashes in the query are ignored | ||
EXPECT_EQ("/a/b?", mergeSlashes("/a//b?")); // empty query | ||
} | ||
Include a test method that also ensures that no slash-merging takes place when the option is off.
It's covered by
I see, ok, that makes sense. I think handling of 3 or more slashes should be covered here, though, as this is the functional unit test.
A test for handling of 3 or more slashes was already added in the latest revision. |
||
|
||
} // namespace Http | ||
} // namespace Envoy |
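The review concern about fully qualified URLs can be made concrete with a small guard. This is a rough sketch, not something from the PR: `looksLikeAbsoluteUrl` is a hypothetical name, and the check is deliberately looser than the scheme grammar in RFC 3986. It only asks whether the value starts with something followed by `://` before any `/` or `?` appears. A caller could assert that it returns false before merging, since collapsing slashes in `http://foo.com//bar` would also collapse the `//` after the scheme.

```cpp
#include <string_view>

// Hypothetical helper (an assumption for illustration, not an Envoy API):
// returns true when the value begins with "<scheme>://".
bool looksLikeAbsoluteUrl(std::string_view value) {
  const std::string_view::size_type scheme_end = value.find("://");
  if (scheme_end == std::string_view::npos || scheme_end == 0) {
    return false;
  }
  // Whatever precedes "://" must be the scheme, so it cannot contain a
  // path or query delimiter. This keeps "/a?u=http://x" classified as a path.
  return value.substr(0, scheme_end).find_first_of("/?") == std::string_view::npos;
}
```

A relative path such as `/a//b`, or a path whose query carries a URL, is not flagged, while `http://foo.com//bar` is.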
It's probably worth noting here that this canonicalization is not mentioned in any HTTP spec, but is offered as an opt-in convenience to upstreams.