httparse should skip invalid headers #61
Comments
Hm, I'm surprised at the results, but then again, I shouldn't be, the HTTP1 spec has been ignored at every turn :D Curious, do you know what other "libraries" do, like curl or golang? |
I haven't tested this, but it looks like I also have confirmed that
I took a quick look but its code is very dense. The header parsing logic is around It was a good idea to check these libraries, I didn't realize that they would handle things differently. I guess there are 3 behaviors to choose from:
I'll need to think a bit more about which would be better. What do you think? Edit: After thinking about it a bit more, I think that especially since this project is used by apps like |
@seanmonstar Have you had a chance to look into this? I'll start working on a pull request, I just wanted to check if you agree that this is the correct behavior. |
Oh my, I completely forgot about this issue, sorry! I suppose it makes sense to do what browsers do. I have a nagging worry, though: if we ignore invalid headers, is there a point where we're parsing things that really aren't HTTP and saying they are? |
Well if the status line is intact, including HTTP version and all, I think it's definitely supposed to be HTTP. The headers are only parsed if the status line is valid. I'll start making the PR. |
FWIW this behaviour was completely killed from Squid last year. |
@Coder-256 Did you ever make a patch for this? Found another website affected by this.
The hyphens aren't ASCII in the first |
@nox I completely forgot! Also incredible catch, U+2010 strikes again. Anyway, after looking at this further, I am convinced that simply skipping invalid headers is the right option. In this case, Chromium and Firefox both ignore the invalid header while it looks like Safari actually treats the header name as a valid Latin 1-encoded header for whatever reason; none of them fail parsing entirely, as httparse does. I'll open a PR. |
#114 will let you ignore the invalid header. |
@nox Thank you very much, lgtm! Sorry I didn’t get to this sooner. |
@Coder-256 No worries. |
I am trying to use `reqwest` to parse a response from a server I don't control. `reqwest` uses `hyper`, which uses `httparse` for parsing HTTP/1.x headers. Anyway, this server has a weird bug where it consistently returns a single corrupted header line in an otherwise completely valid response (the header contains unescaped non-token characters). Specifically, for some reason it tries to send the `DOCTYPE` as a header. The bug is unlikely to be fixed (this is old software), but it isn't really a problem because the page displays fine in all major browsers.
It seems that all major browsers simply ignore invalid header lines. However, `httparse` returns an error that aborts the entire parsing process. IMO this is a problem and should be fixed.
Here's a screenshot from Chrome that shows the invalid header being ignored:
In fact, Chrome's behavior is commented as: "skip malformed header".
Although this change is technically breaking, I can't imagine that any code relies on response parsing failing in this particular case.
Here are the relevant lines:
httparse/src/lib.rs, lines 594 to 595 in 6f696f5
httparse/src/lib.rs, lines 613 to 614 in 6f696f5
httparse/src/lib.rs, line 672 in 6f696f5
I think all of these would be fixed by consuming `b` until the next newline, then `continue 'headers`. I would open a PR, but I just want to check that you agree that this change should be made.
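To make the proposed behavior concrete, here is a minimal standalone sketch of "skip the malformed line and continue". It is not httparse's actual code: `is_token` and `parse_headers_lenient` are hypothetical names, the function only handles a header block (the bytes after the status line), and it assumes that a header name must consist solely of RFC 7230 token characters.

```rust
/// Returns true if `b` is an HTTP token character (`tchar` in RFC 7230).
fn is_token(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b"!#$%&'*+-.^_`|~".contains(&b)
}

/// Hypothetical lenient parser: walks a header block line by line and skips
/// any malformed line instead of returning an error for the whole response.
fn parse_headers_lenient(mut buf: &[u8]) -> Vec<(&str, &str)> {
    let mut headers = Vec::new();
    'headers: while !buf.is_empty() {
        // Consume bytes up to and including the next newline.
        let end = buf
            .iter()
            .position(|&c| c == b'\n')
            .map_or(buf.len(), |i| i + 1);
        let (line, rest) = buf.split_at(end);
        buf = rest;

        // Drop the line terminator (LF, optionally preceded by CR).
        let line = line.strip_suffix(b"\n").unwrap_or(line);
        let line = line.strip_suffix(b"\r").unwrap_or(line);
        if line.is_empty() {
            break; // a blank line ends the header block
        }

        // A line without a colon cannot be a header: skip it.
        let colon = match line.iter().position(|&c| c == b':') {
            Some(i) => i,
            None => continue 'headers,
        };
        let (name, value) = (&line[..colon], &line[colon + 1..]);

        // A name containing non-token bytes is invalid: skip the line.
        if name.is_empty() || !name.iter().all(|&c| is_token(c)) {
            continue 'headers;
        }

        // For simplicity, this sketch also skips values that are not UTF-8.
        match (std::str::from_utf8(name), std::str::from_utf8(value)) {
            (Ok(n), Ok(v)) => headers.push((n, v.trim())),
            _ => continue 'headers,
        }
    }
    headers
}
```

With this approach, a stray `<!DOCTYPE html>` line, or a header whose name contains a non-ASCII hyphen such as U+2010, is dropped while the valid headers around it are still returned.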