-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#[zero_stack] attribute #1853
Comments
See also #1496 |
I wrote something to zero the stack at https://crates.io/crates/clear_on_drop. It is a bit manual (you have to specify how much of the stack should be zeroed), but it should work correctly with panics (the zeroing is done within a |
Nice. Zeroing once might be faster, depending upon your call graph, even if you must zero more, except it'll get slow as soon as you pass the existing cache lines. I wonder if there is any way to measure the memory usage and try not to zero too much? I suppose the LLVM situation has not improved since last year, so all the issues with #1496 still apply here. :( Aside from your clear_on_drop trick with closures, we're likely stuck fixing code to only use the heap. |
One thing that should be noted is this change will have to ensure the zero'ing function doesn't skip the redzone that x64 adds onto the stack for leaf functions. That 128byte area could hold the majority of key in some scenarios. |
There was a talk at RWC about this idea: http://www.cl.cam.ac.uk/~lmrs2/publications/erasing_ram_rwc17.pdf. The main takeaway is that it has to be implemented carefully in order to avoid hurting performance by a considerable amount. Although the RFC that attempted to specify |
We can close this in favor of the original An interesting middle ground might be a type I ran into trouble with my attempts to handle clearable owned data in @cesarb's clear_on_drop crate. Just using |
#1808 does not specify what happens to the data on drop, and I'm positive that LLVM doesn't zero the result of |
There are too many situations where data gets moved around on the stack. In particular, you cannot make a by-value wrapper type that correctly zeros data because Instead, you must manipulate everything behind a pointer, so In principle, immovable types #1858 might fix this by disallowing by-value calls or coercing them into by reference calls, but consider this :
We can coerce According to https://www.cl.cam.ac.uk/~lmrs2/publications/erasing_ram_rwc17.pdf there are much more efficient options than zeroing after every function call anyways. Analyzing the call graph can provide zeroing with minimal cost, but breaks too many things. Appears the best option would be instrumenting functions to track their stack usage. |
It would be bad to rely on immovable types. It's not idiomatic. We need zero-on-move and zero-on-drop types. |
Why? You're both wasting time moving data around and later zeroing it, why not keep it boxed? Besides, without some kind of unbound (i.e.
|
Because you may not have a heap (e.g.
It's up to the compiler to minimize the zeroing overhead during dead store elimination, given knowledge about values that need to be zeroed at least once.
|
This is probably a good motive for using a non-libc-based libstd, like steed. |
I believe most dynamically allocating data structures need a complete redesign anyways, well unless you use them with a second layer of allocation like say
Yes, I'm worried about this for return-by-value, which provides numerous advantages, especially in Rust. It's also so strongly idiomatic that even we avoid it in crypto libraries then users will still pass key material around that way. Right now, the clear_on_drop crate zeros stack pages, but I'd imagine this causes a stack of cache misses due to zeroing memory the program never touched. In the short term, we could probably instrument crypto libraries so that they generated and saved stack usage information when benchmarks were run, and then called some explicit zeroing function at appropriate points. You'd need to run the benchmarks twice before they'd give the right answer, but that's fine. In the longer term, I could imagine a hybrid approach between call graph analysis and instrumentation. Any function that used a dynamically sized
|
People here seem to think zeroing is a major performance hit. It's not. Even if you had a lot of data, you could just encrypt it and zero the key instead. |
Yes, encrypting provides an easy redesign option for allocating data structures, but it's not useful for pass-by-value and return-by-value situations. |
True that. |
Performance is always hinging on the workload. If you have a lot of stack churn, the zeroing may well become noticeable. By that point it's no longer a zero-cost-abstraction: Everyone pays for the privilege of zeroing their data, sensitive or not. |
As I understand Laurent Simon's work, zeroing the secret data needed in cryptography can be cheap compared with the cost of doing the cryptography that uses it, but there are two bad situations one must avoid :
I'll close up this issue now because it takes the wrong approach. We do not want stack types that zero themselves per se. And boxing the secret data ala Instead, we should store intermediate sensitive data in a stack-like way so that the same memory can easily be reused for different mathematical objects, but we need to zero exactly the used stack space when done with sensitive code, and take care with how else this stack gets used. For this, we need instrumentation that tracks exactly how much stack these sensitive functions use, statically if possible, but dynamically when necessary, along with a |
There are various crates that attempt to provide secure memory storage for secret data with perhaps the most advanced being stouset/secrets. A minimal approach is simply to zero any data when dropping it, but mostly these tools employ heap allocation so that rustc and llvm cannot move it, possibly leaking secret data.
Any routines that process secret data must use such an allocator as well, which mildly complicates usage with
#![no_std]
and starts grows suspicious. Are there any initialized parts of the chacha stream cipher left on the stack by a call like this?An alternative would be a
#[zero_stack]
attribute forfn
s that ensured thefn
zeroed any stack space it knowingly touched with program data when thefn
finished. A few details :Any stack allocated with
alloca
#1808 should be zeroed on exit as well.An
fn
marked#[zero_stack]
should never return data on the stack, only in registers. Accepting and using a mutable reference is fine of course, even if that reference points to the stack. In principle, this rule against returning data on the stack could be relaxed if the calling routine is also#[zero_stack]
. An attribute#[no_zero_stack]
might exempt specific data from zeroing, including the value to be returned.An
fn
marked#[zero_stack]
should inline anyfn
s it calls, unless thatfn
is also marked#[zero_stack]
. It should throw a warning anytime it calls a non-#[zero_stack]
fn
that cannot be inlined, although this warning could be overridden with a call site attribute#[no_zero_stack]
.Any nested
fn
or closures should implicitly be made#[zero_stack]
as well, unless marked#[no_zero_stack]
. Any coroutines #1823 marked#[zero_stack]
should zero their own stack when yielding, but some behavior might require the caller by#[zero_stack]
as well.An
fn
marked#[zero_stack]
might zero additional stack space beyond its own usage if there is any OS specific reason to do so. Ideally, a panic might still zero any stack allocated data, but this could be omitted if problematic for whatever reasons.Afaik, there is no reason for attributes like
#[zero_stack]
to try to provide all the functionality of secret data wrapper types, which might even encrypt the data in memory. I think an attribute for zeroing stack allocated data would make using rust to process secure data far more like using rust to process regular data.cc @stouset @tarcieri @valarauca @DaGenix @briansmith @Manishearth @vks @Philipp91
The text was updated successfully, but these errors were encountered: