Implement a utility for checking secret leaks in logs, use it as a CI stage at PR level #6191
Comments
We discussed that this should probably be done as a separate CI stage, depending on how long it takes. Making it as a separate app adds some simplicity and some difficulties:
Though it would be nice to detect such issues locally as well, just while running tests.
Another thought: we can even generate some fake tokens in tests, we don't always need to use real secrets.
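A minimal sketch of the fake-token idea, assuming tests only need a unique, recognizable string rather than a credential that actually works. The helper name `newFakeToken` and the `fake-secret-` prefix are hypothetical and not part of status-go.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newFakeToken (hypothetical helper) returns prefix followed by nBytes of
// random data encoded as hex. The prefix makes the value easy to spot in logs,
// and the random part makes accidental matches with ordinary text unlikely.
func newFakeToken(prefix string, nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + hex.EncodeToString(buf), nil
}

func main() {
	token, err := newFakeToken("fake-secret-", 16)
	if err != nil {
		panic(err)
	}
	// The token changes on every run; only its shape is stable.
	fmt.Println("generated a fake token of length", len(token))
}
```

A test could pass such a token wherever a real secret would go and later search the logs for exactly that value.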
Also, this utility could be used by developers (or users) who wish to share their logs on GitHub. This is a bit of a rare case, but still a benefit 🙂
Slightly off topic, but we also need to eventually define what is considered private data circulating in logs. A wallet account address I consider private data, as I wouldn't want that to be exposed in plain text logs. One idea is to leverage the Go type system as you and I talked about @igor-sirotin. For instance, the
Once we identify what is considered private data we can analyse which data types in status-go should be wrapped/modified to be auto-redacted in logs (either always or above a certain log level like the previous example about
@ilmotta 💯 status-go/internal/security/sensitive_string.go Lines 9 to 16 in ef177c1
And we can surely make more specialized types later as well 🙌
Well, I just realised that the "utility" is: cat geth.log | grep <my-secret> 😂
I've been exploring this a bit. It is very simple to implement, but we need to agree on the requirements.
Goal
Use cases
What to look for?
What exactly to report?
Having as much info as possible is best for faster reaction. At the same time, if we run this on CI (whose logs are public), we might want to be careful about what this utility will report. I see several levels of redacting data:
I feel that I am overthinking this. Because if the secret was leaked, it doesn't really matter if we print it one more time with the utility, right? 😄 It will only make it a bit harder for a potential attacker to find. And of course the real solution is not to use any real secrets on CI. A note about implementation. If we implement our own utility (not just shell script around
Idea: redact secrets before writing the logs
One thing we could do: intercept the zap logger and simply redact any known secrets before even writing them to the log. This simplifies things, as at runtime we have access to all of the given secrets, we just need to register certain values as secret ones in the logger. If this is a separate utility, we'd need to provide This would be very robust, but might badly affect the performance. wdyt? cc @status-im/status-go-guild @ilmotta
I lean more towards this idea @igor-sirotin because it seems to be the most reliable one. Have we considered implementing the
This should be done for sure 👍 But it's quite easy to make a mistake here: forget to implement the method, accidentally use the On the other hand, processing all strings before writing to file should be more robust. The only thing we must not forget is to tell the logger about the new secret string.
We already have a custom core, so I think we just need to benchmark and see how bad it is. Might be very bad indeed, but I won't try to estimate without measuring 😄
Replying to your comment #6191 (comment) @igor-sirotin:
Depends on the data and our guidelines. For example, if we establish that all address types shouldn't be primitive strings and that they should implement
For the critical data (e.g. API keys), I'd imagine we don't have that many and it would be reasonable to wrap them in custom types and keep them in check in reviews since they don't change very often. Probably a combination of both approaches can be the best long-term, as you already pointed out.
I have to agree with you, applying some heuristics to redact on the final strings would be easier. Not so sure about the safety aspect because I imagine it would be more reliable for us to audit the code to find places not implementing the expected types than it is to hunt strings from the final output (which depend on runtime analysis and vary based on data). But the more I think about the idea of tiny types to get compile-time protection the more I think it's too much for us in status-go to handle, at least for now 🤷🏼
On the topic of benchmarks, for mobile it's complicated because there's a wide range of slow devices that could be using the Status app and they are sensitive to things we take for granted as harmless. That's why we assume on mobile, based on real experience, that the app is already quite slow for some devices. On the desktop client it's more likely to be fine whatever we do here.
I was expecting the recent audit to point out a direction about redaction of secret and private values. I see now it's up to us to identify the criticality of different types of data and how/if they should be logged/reported.
Functional tests provide some secrets to status-backend through the API (e.g. #6144). When tearing down the test, it could automatically scan the geth.log for any leaks of such data, as it knows the values that should not have leaked. This includes:
We can also expose this approach to desktop and mobile tests.