-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LG-14334 | Don't log redacted values #11158
Conversation
changelog: Internal, Logging, Don't log "[REDACTED]" fields in TMX response since they are useless
There are actually 185 attributes allowed. It's possible that other analytics parameters might still push us to a bit over 200. We could probably pare this list down further, but I would advocate that any such work be done in a followup PR. |
@@ -197,12 +197,7 @@ class ResponseRedacter | |||
def self.redact(hash) | |||
return { error: 'TMx response body was empty' } if hash.nil? | |||
return { error: 'TMx response body was malformed' } unless hash.is_a? Hash | |||
filtered_response_h = hash.slice(*ALLOWED_RESPONSE_FIELDS) | |||
unfiltered_keys = hash.keys - filtered_response_h.keys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I briefly had the thought that we could keep this line, and then do something like:
filtered_response_h['redacted_keys'] = unfiltered_keys.join(", ")
I don't see a need for this, though.
filtered_response_h[key] = '[redacted]' | ||
end | ||
filtered_response_h | ||
hash.slice(*ALLOWED_RESPONSE_FIELDS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can read this PR as lines 201-204 being deleted, and then me removing the variable assignment on line 200.
🎫 Ticket
Link to the relevant ticket:
LG-14334
🛠 Summary of changes
Note: This is a discussion about the literal string
[redacted]
being logged. Its use in the following paragraphs is about the literal word "redacted" in brackets, not a representation that information has been redacted.We log an analytics event with the ThreatMetrix response as a hash, which includes several hundred fields—so many that CloudWatch stops parsing them. (It has a limit of 200 fields that it will parse out.) This means that sometimes, important attributes cannot be queried through CloudWatch because they were the >200th field.
A large number of these logged parameters have the literal value "[redacted]", so they are effectively junk. These come from the ResponseRedacter class, which replaces any non-allowlisted attribute with "[redacted]" to guard against accidentally logging PII.
Because the sheer volume of unusable attributes is breaking things, this PR changes the ResponseRedacter to only return the non-redacted attributes. Fields which previously would have had their value replaced with "[redacted]" are now not logged at all. This means that:
ALLOWED_RESPONSE_FIELDS
in the redacter class)