-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle multiple IP addresses in X-Forwarded-For header in Apache logger #218
Conversation
…oth X-Real-IP and X-Forwarded-For
…r logging The X-Forwarded-For header has a "standard" format: https://en.wikipedia.org/wiki/X-Forwarded-For Here's an example value you may see: ``` X-Forwarded-For: 203.0.113.195, 70.41.3.18, 150.172.238.178 ``` This would be a request that has passed through 3 proxies, where the client's IP address is 203.0.113.195. This commit modifies the Apache logger so that only the first IP address from this X-Forwarded-For list is written to the Apache log.
-- | Pull out the first IP in a comma-separated list of X-Forwarded-For IPs. | ||
-- | ||
-- >>> firstIpInXFF "1.2.3.4, 5.6.7.8, 10.11.12.13" | ||
-- "1.2.3.4" | ||
-- | ||
-- If there are no commas, just return the whole input ByteString: | ||
-- | ||
-- >>> firstIpInXFF "5.6.7.8" | ||
-- "5.6.7.8" | ||
-- | ||
-- Note that this function doesn't make sure the input is actually an IP address: | ||
-- | ||
-- >>> firstIpInXFF "hello, world" | ||
-- "hello" | ||
firstIpInXFF :: ByteString -> ByteString | ||
firstIpInXFF = BS.takeWhile (/= ',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change in how the X-Forwarded-For
header is handled is likely a breaking change, and probably deserves a major version bump. I can do the version bump in this PR if preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments with more examples!
getSourceFromHeaderCustom :: [HeaderName] -> Request -> Maybe ByteString | ||
getSourceFromHeaderCustom hs = getSourceFromHeaders hs | ||
getSourceFromHeaderCustom hs = getSourceFromHeaders (fmap (,id) hs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the firstIpInXFF
here as well? The case would be when you need "x-forwarded-for" but you cannot use getSource because it would read "x-real-ip" first when both are present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
We could use firstIpInXFF
in getSourceFromHeaderCustom
. However, my thought is that it would be somewhat surprising for a user, so I didn't end up adding it.
Really what would be nice is a new data constructor like:
data IPAddrSource
= ...
| FromRequest (Request -> ByteString)
so that the end-user could be in full control of exactly how to figure out the IP address. I didn't choose to add that in this PR, but I could if there is a desire to have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, though ideally I'd prefer a plain function in the spirit of setApacheRequestFilter
and setApacheUserGetter
but this FromRequest would help with many edge cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about both using firstIpInXFF
in getSourceFromHeaderCustom
AND adding FromRequest
above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bad idea, this FromRequest supersedes all the other options, but it may be nice to have a choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cdepillabout I you agree, please push a commit to implement the two things above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kazu-yamamoto I pushed a couple commits, with the important ones being:
- e349e59: Add the
FromRequest (Request -> ByteString)
data constructor. - 1d519c6: Use
firstIpInXFF
ingetSourceFromHeaderCustom
. I made the choice to applyfirstIpInXFF
to all the headers passed in togetSourceFromHeaderCustom
, since it should effectively be a no-op if the header value doesn't have a comma in it. - 2d25632: Bump wai-logger to 2.5.0. (Feel free to drop this commit if you'd prefer to do this yourself in a separate PR or something)
This allows the user full control over exactly how to pull out the IP source address for the Apache logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now LGTM
Merged. |
This PR slightly changes the parsing for the
X-Forwarded-For
header within the Apache logger. In this PR, the list of IPs in the X-Forwarded-For header is parsed and only the left-most IP (most likely to be the actual client IP) is written to the Apache log.Here's an explanation of how the
X-Forwarded-For
header generally works: https://en.wikipedia.org/wiki/X-Forwarded-ForFor instance, if a request goes through 3 proxies, the final Haskell backend might get a request where the
X-Forwarded-For
header looks like the following:In this situation, the actual client IP would be the left-most IP, 203.0.113.195.
Currently in
wai-logger
inmaster
, the value of theX-Forwarded-For
header isn't parsed, so you end up with Apache log lines that look like the following:Because there are multiple comma-separated IP addrs, this format of log line often can't be parsed correctly. (For instance, this exact log line isn't able to be parsed by Vector's
parse_apache_log
function. It also doesn't appear to match the format as specified in the Apache documentation itself.)With the change in this PR, the above Apache log line will now look like the following:
This matches the format most would be expecting.
Here's a more in-depth article about the pit-falls around the
X-Forwarded-For
header: https://adam-p.ca/blog/2022/03/x-forwarded-for/