Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spin-rs no longer maintained (dependency) #921

Closed
bluejekyll opened this issue Dec 17, 2019 · 33 comments
Closed

spin-rs no longer maintained (dependency) #921

bluejekyll opened this issue Dec 17, 2019 · 33 comments

Comments

@bluejekyll
Copy link

https://rustsec.org/advisories/RUSTSEC-2019-0031

The spin dependency is no longer maintained it appears. This causes cargo audit to fail on downstream projects. I haven't had a chance to look into potential fixes.

@briansmith
Copy link
Owner

Thanks. I don't think the spinlock design in spin-rs was optimal for what we were using it for anyway. In particular, we really just need a pure spin lock that doesn't ever yield.

@briansmith
Copy link
Owner

In fact, we don't even need a spinlock, really. We just use spin::Once and so we just need a variant of that that is no_std compatible, that is as optimistic as possible that the initialization has already been done.

In particular, we really just need a pure spin lock that doesn't ever yield.

Actually, I just checked my notes and we intentionally don't use a pure spinlock to gracefully handle very edgy edge cases, e.g. the process that grabbed the lock has been suspended while (or just before) the handful of instructions that cache the CPUID results. IIRC, spin-rs was way too eager to yield, but never yielding also isn't right.

@briansmith
Copy link
Owner

I wrote:

we just need a variant of that that is no_std compatible, that is as optimistic as possible that the initialization has already been done.

IIRC, spin-rs was way too eager to yield, but never yielding also isn't right.

If we really want to be optimistic that the initialization has already been done, then spin-rs's behavior of yielding right away makes a lot of sense, actually. So, I think we should just find the simplest thing that will work.

@bluejekyll
Copy link
Author

If we were happen with the Once in spin-rs perhaps we should just fork that and maybe add it to the lock-api project (referenced in the CVE).

@briansmith
Copy link
Owner

If we were happen with the Once in spin-rs perhaps we should just fork that and maybe add it to the lock-api project (referenced in the CVE).

Perhaps. I see lock_api depends on scopeguard only (its other depencencies are optional, it seems).

In general the problem I'm running into now is finding an alternative that doesn't have heavy dependencies. (It looks like rust-lang/rust#56410 and similar long-stalled work has had a chilling effect on development of projects solving these problems.)

@briansmith
Copy link
Owner

I filled rust-lang-nursery/lazy-static.rs#163 against lazy-static.

@briansmith
Copy link
Owner

I put up PR #924 to see if switching to conquer-once would work (pass CI). If so, we could consider switching to that after reviewing the code.

There are two parts to this: cpu.rs and rand.rs.

In cpu.rs, we need to support no_std environments and thus we need a direct replacement to spin-rs, which conquer-once does provide. Notable, the Rust RFC for standardizing a replacement for lazy-static at rust-lang/rfcs#2788 as currently drafted does NOT propose to create something that we could use for cpu.rs.

In rand.rs, we could use once_cell and the variant that is proposed in the RFC mentioned in the previous paragraph.

It would be interesting to know what @oliver-giersch is planning to do with conquer-once if/when the aforementioned RFC is mentioned. I would hope that conquer-once would shrink in scope. But maybe conquer-once does more than the RFC proposed that I'm overlooking.

@oliver-giersch
Copy link

oliver-giersch commented Dec 24, 2019

It would be interesting to know what @oliver-giersch is planning to do with conquer-once if/when the aforementioned RFC is mentioned. I would hope that conquer-once would shrink in scope. But maybe conquer-once does more than the RFC proposed that I'm overlooking.

I primarily intended for conquer-once to be no-std compatible in a way that plays well with the commonly accepted way of structuring cargo features to enable no-std compatibility (which lazy_static does not) and also allow mixing std and no-std crates using conquer-once in the same dependency graph without forcing all of them to use the same blocking strategy (by using separate types for spin-locks and the OS reliant blocking strategy).
Secondly I set out to provide a non-blocking API in addition to the more common blocking one, although this may have been somewhat misguided, but with all the recent async hype it may turn out to be useful to someone someday.

What exactly do you find too extensive about the crate's scope?

@briansmith
Copy link
Owner

Thanks for commenting.

What exactly do you find too extensive about the crate's scope?

As things are now, nothing! But, if/when the lazy_static functionality is in libstd, would that change what conquer-once does or how it works? For example, if libstd were available, it might be nice for conquer-once to defer to the libstd implementation.

@oliver-giersch
Copy link

Thanks for commenting.

What exactly do you find too extensive about the crate's scope?

As things are now, nothing! But, if/when the lazy_static functionality is in libstd, would that change what conquer-once does or how it works? For example, if libstd were available, it might be nice for conquer-once to defer to the libstd implementation.

I might reassess whether the additional non-blocking capabilities are worth maintaining further. But as it is now, the part of the crate that would overlap with the proposed RFC (the OS reliant blocking API) is only compiled if the std feature is enabled, so I might also just downgrade that from a default feature to a purely optional one.
BTW, your comment has prompted me to finally publish version 0.2.0 which contains some soundness fixes as well as some breaking API changes.

@matklad
Copy link

matklad commented Jan 1, 2020

Question 1: is blocking (via spinning) the most theoretically appropriate solution here?

It seems like in both cases:

  • the state we want to initialize fits into an usize
  • the initializing function is idempotent
  • the initializing function is not horribly expensive

So, a non blocking safe implementation is possible, roughly like this:

fn get_state() -> usize {
    static CACHE: AtomicUsize = AtomicUsize::new(0);
    let mut res = CACHE.load(Relaxed);
    if res == 0 { 
        res = init();
        CACHE.store(res, Relaxed);
    }
    res
}

The down side is that the init function might be run more than once.
The up side is that there are no edgy edge cases, when an unfortunate thread gets rescheduled in the middle of the critical section. Additionally, this removes all the cross core synchronization:

  • with Once, if many cores tries to access the data concurrently, they need to agree that only one of them runs initialization, which necessitates cross-core acquire/release memory traffic
  • with the snippet above, each core can run initialization independently, as we only optimistically cache the result with Relaxed (which still guarantees that initilization is run at most number of cores times in the worst case)

Question 2: is a worse edge case possible with interrupts?

Say, we are in single core bare metal context. A hardware thread runs an initialization routine, and is preempted, by the interrupt, in the middle of the critical section. What if the interrupt handler tries to run this initialization again? It will observed that initialization is in progress, will enter a spin wait, and will cause a deadlock.

The proposal from this comment doesn't suffer this problem, the interrupt handler will just run the initialization routine itself.

Question 3:

Actually, I just checked my notes and we intentionally don't use a pure spinlock to gracefully handle very edgy edge cases, e.g. the process that grabbed the lock has been suspended while (or just before) the handful of instructions that cache the CPUID results

Does using Once over spinlock actually helps here? Once internally still basically does a spin lock.

Context: I don't do much no_std development myself, and I'd love to understand the tradeoff related to spin locks better, for the standard lazy types RFC.

@matklad
Copy link

matklad commented Jan 2, 2020

BTW, I've managed to get a reliable reproduction of the edge case for the getrandom crate:

Repository owner deleted a comment Jan 2, 2020
@briansmith
Copy link
Owner

  • the state we want to initialize fits into an usize

This isn't true. We actually rely on the "It is also guaranteed that any memory writes performed by the executed closure can be reliably observed by other threads at this point (there is a happens-before relation between the closure and code executing after the return)." semantics provided by std::sync::Once to ensure all the state is initialized.

And, perhaps this is another bug with using spin-rs: spin-rs's documentation doesn't have the same guarantee! See https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html#method.call_once. We should check whether this is an omission in the documentation of spin-rs or if the spin-rs's Once really only intends to provide a weaker guarantee.

  • the initializing function is idempotent

We don't currently assume idempotence, but something slightly weaker: the winning thread will cache a state that is usable from that point forward.

  • the initializing function is not horribly expensive

The CPUID instruction has severely negative performance consequences when executed. It isn't just slow, but it is also a serializing instruction.

@briansmith
Copy link
Owner

Additionally, this removes all the cross core synchronization

I think that you are overlooking that these stores are executed and require protection on x86 and x86-64 by GFp_cpuid_setup:

  GFp_ia32cap_P[0] = edx;
  GFp_ia32cap_P[1] = ecx;
  GFp_ia32cap_P[2] = extended_features[0];
  GFp_ia32cap_P[3] = extended_features[1];

@briansmith
Copy link
Owner

Context: I don't do much no_std development myself, and I'd love to understand the tradeoff related to spin locks better, for the standard lazy types RFC.

lazy_static! also doesn't guarantee "that any memory writes performed by the executed closure can be reliably observed by other threads at this point" so the standard lazy types RFC probably doesn't want to guarantee that either. The RFC should document this difference from std::sync::Once.

@briansmith
Copy link
Owner

It looks like lazy_static! may also need to be considered unmaintained, even though nobody has declared so yet.

@briansmith
Copy link
Owner

briansmith commented Jan 2, 2020

And, perhaps this is another bug with using spin-rs: spin-rs's documentation doesn't have the same guarantee!

That bug is now tracked as ring issue #931.

@matklad
Copy link

matklad commented Jan 6, 2020

It's instructive to see how they handle this problem in std_detect (runtime feature detection library):

https://github.com/rust-lang/stdarch/blob/master/crates/std_detect/src/detect/cache.rs

Good news: they manage to do it without spinlocks and using only Relaxed operations.
Bad news: the code is wrong (racy) on 32 bit system.

Luckily, the bug is easy to fix (rust-lang/stdarch#837), and yields an interesting insight for ring's case: although the state we want to synchronize is larger than usize and we can't set it atomically, we don't need to! The sate is a bitmask, so it's OK to initialize is word-wise. From what I know so-far, this approach might be the best for ring's use case. However, it does require refactoring the C code to return the mask with flags, instead of globally setting it, and I don't know if that is feasible.

@briansmith
Copy link
Owner

However, it does require refactoring the C code to return the mask with flags, instead of globally setting it, and I don't know if that is feasible.

Yes, we can refactor that code as needed.

@briansmith
Copy link
Owner

Good news: they manage to do it without spinlocks and using only Relaxed operations.

Yes, that is good. Be aware that the readers of the GFp_ia32cap_P and GFp_armcap_P are mostly assembly code that use normal loads, not any atomic loads.

However, it does require refactoring the C code to return the mask with flags, instead of globally setting it, and I don't know if that is feasible.

I think this is because you are hoping to change the definition of GFp_ia32cap_P and GFp_armcap_P to be a Lazy or atomic type, right?

For GFp_ia32cap_P I think we can make them AtomicU32 and rely on the fact that an access from assembly language code, which is just a normal load, and load(Ordering::Relaxed) is a normal load on x86 and x86_64, then I think we could do something here.

For GFp_armcap_P, again it is read by assembly language code, again using normal loads. OTOH, on ARM we already assume a Linux(-like) system so we know libstd is available, so we could use OnceCell unless/until we start supporting a #[no_std] ARM environment and/or do what's needed to the ARM assembly.

WDYT?

@matklad
Copy link

matklad commented Jan 7, 2020

Yes, that is good. Be aware that the readers of the GFp_ia32cap_P and GFp_armcap_P are mostly assembly code that use normal loads, not any atomic loads.

Aha, I haven't realized this, and it indeed makes the problem more complicated, as i don't fully understand interactions between language-level memory model and hardware level memory model.

I think this is because you are hoping to change the definition of GFp_ia32cap_P and GFp_armcap_P to be a Lazy or atomic type, right?

Right. Specifically, I want to change those to be an array of atomic integers. And the only only property of atomics we need is that loads and stores at a single location are atomic, we don't need any additional synchronization requirements. So, it does seem that the trick from stdarch will work on x86 (although I don't think this is officially formally specified to work, atomics + asm are a grey area IIRC).

OTOH, on ARM we already assume a Linux(-like) system so we know libstd is available, so we could use OnceCell unless/until we start supporting a #[no_std] ARM environment and/or do what's needed to the ARM assembly.

I think just using std::sync::Once might be better in this case? As we only want to sidefectfully set a global variable, once_cell ability to hold a value doesn't really help?

As a super-proper and cross-platfom solution, I think we can move all loads of those global variable to Rust, and pass a loaded value in registers to asm. That is, I imagine that asm currently is wrapped in extern "C" functions, and we can add an extra param to those functions, which is a value of GFp_armcap_P. That way, all atomics are visible to the compiler, so it should be guaranteed to do a right thing. Changing asm seems painful though.

@briansmith
Copy link
Owner

Right now x86/x86-64 seems like the only supported platform where we effectively support #[no_std]. I guess that's mostly the UEFI target now. @josephlr What do you use in place of std::sync::Once in the UEFI target?

I'm thinking to release a new major version bump of ring that goes back to using std::sync::Once for this, dropping #[no_std] support until there's a clear alternative.

One thing I've considered: Require no_std environments to explicitly initialize ring, and panic in cpu.rs if the explicit initialization isn't done.

@Razican
Copy link

Razican commented Apr 7, 2020

Hi @briansmith, is there any news on this?

@briansmith
Copy link
Owner

I would love to replace spin with something else that does the same thing, but I do not know of anything that is acceptable and backward-compatible with no-std environments. rust-secure-code/safety-dance#18 and my own personal review of the code indicates that spin-rs's Once code is correct (and, in particular, safe), as far as we can tell.

When we stop maintaining 0.16.x (soon) and start 0.17.x, we'll reconsider doing something different.

In (private) projects that use ring, I'm currently doing this as a workaround:

cargo audit --deny-warnings --ignore=RUSTSEC-2019-0031 

Obviously that's not ideal. That said, it's a "warning" for a reason: the warning about unmaintained crates really doesn't make much sense when you consider all the edge cases.

@CryZe
Copy link

CryZe commented May 4, 2020

Would https://github.com/rust-osdev/spinning_top be a sufficient replacement for spin?

@Ericson2314
Copy link

As someone that contributed lots to spin-rs but doesn't have time to maintain it either, getting everything to use lock_api with type parameters would be a nice way to be no-std compatible without committing to any specific implementation.

@briansmith
Copy link
Owner

@Ericson2314 Thanks. Do you think lock_api has a significant advantage over spinning_top or alternatives?

@CryZe
Copy link

CryZe commented May 21, 2020

Just to clarify, spinning_top is just lock_api with a RawMutex implementation that spins (which I think is close to a straight port of spin-rs) + some type definitions.

@Ericson2314
Copy link

@briansmith Well I like that lock_api is just the interface without the implementation. That vast majority of packages just depend on the locking interface, and are completely agnostic to the implementation, so lock_api is a great fit.

[Do be clear this principle applies to dependencies in general: we could turn every usage of a type into a parameter with a trait, ending up with something like ML's "fully functorized" idiom. But this would be incredibly verbose.]

@Ericson2314
Copy link

The real killer thing would be if we could somehow connect optional dependencies with type parameter defaults, so just about every function in many libraries would have some e.g. allocator and lock implementation type parameters, but in the common std case there would be no need to manually ::<GlobalAlloc, OsLock> wherever those type parameters are ambiguous, and no breaking change from previous releases.

@zesterer
Copy link

Hello! spin-rs is now maintained again.

@ErichDonGubler
Copy link

I filed a PR some time ago updating ring to use conquer-once -- but I can see why it would be obviated by spin-rs becoming maintained again. Feel free to close #1053 if it doesn't make sense anymore.

@briansmith
Copy link
Owner

Thanks everybody. It seems like this is all resolved now. I added a cargo deny step in CI/CD to verify that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants