Support for Linux kernel memory model #348
I don't think I agree. Or more specifically, to my knowledge, there does not exist something that I would call a Linux kernel memory model -- as in, some sort of definition that takes as input a program source code and produces as output a portable set of allowed behaviors of that program (including whether it has UB). The LKMM that I have seen is a bunch of rules that you have to follow to "get what you want", and is basically defined by "whatever the target hardware does". That works a lot better than I would ever have expected, but I don't see a way to extract from that any kind of operational or even axiomatic semantics. It should be possible in principle to actually prove that a Rust compiler is correct, and that is AFAIK not possible with the LKMM since there is no suitable spec to prove anything against.
I strongly disagree with this one. All those atomic accesses were introduced for a reason. Linux gets away with this only because it pre-dates the C++ model and now it can force GCC to keep supporting whatever it is that it relies on. But I don't even know of a precise definition that one could use to judge whether GCC is a correct compiler for the LKMM or not, other than "ask Paul McKenney". To be clear, what Linux and specifically Paul are doing is quite amazing -- but we have memory models now that can actually be defined formally and provably have properties we want (such as enabling a whole lot of optimizations on non-atomic accesses and allowing efficient compilation to real hardware). The LKMM does not meet these criteria, as far as I know. It is probably broken under various optimizations that the C++ memory model lets us do, and that Rust hence should do for the sake of code that works properly against that model. (For example, reordering a release fence followed by a non-atomic write.)
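For illustration of that last point, here is a rough sketch (my own example, not kernel code): under the C++/Rust rules a release fence only gains force through a later atomic store, so a plain store that follows the fence is not pinned by it, while smp_wmb()-style reasoning would expect it to stay put.

    use core::sync::atomic::{fence, AtomicBool, Ordering};

    static READY: AtomicBool = AtomicBool::new(false);

    // The release fence together with the later relaxed store to READY
    // releases `*prev = 1` to any thread that acquires READY. But the plain
    // store `*data = 2` sits after the fence and is not constrained by it,
    // so the compiler may hoist it above the fence -- and then also reorder
    // it with `*prev = 1`. smp_wmb()-style reasoning would not expect that.
    unsafe fn publish(prev: *mut u64, data: *mut u64) {
        *prev = 1;
        fence(Ordering::Release);
        *data = 2;
        READY.store(true, Ordering::Relaxed);
    }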
That is a valid point, but I still think there is some value in what the LKMM provides. Specifically, it should be possible to translate code that uses the LKMM to code that follows Rust’s memory model, and to do so without a performance penalty. Right now, there is no way to do so in at least some cases, such as
Now, given that Rust is on its way into the Linux kernel, clearly something has to happen. But I feel strongly that that something is not "officially support the LKMM". I think it should be more along the lines of what happens in C: begrudgingly accept that Linux is Too Big To Break, and provide best-effort support. This may entail a special flag to change compiler behavior so that we don't have to pessimize programs that use the official model. Or maybe there are other ways. But I think we want to do whatever we can to ensure that Linux remains the only program using this memory model.
I will absolutely accept that the C++ memory model has its flaws. :) And I am not terribly surprised to hear that it has gaps, where some idioms you want to express cannot be expressed. I am all for better understanding those gaps and working towards fixing them.
Hi, my name is Boqun Feng, and I'm one of the maintainers of the Linux Kernel Memory Model. And since I'm also working on the Rust-for-Linux project, I'm so happy to see efforts on making Rust's memory model work with Linux kernel development. Yes, I agree with @RalfJung, "officially supporting the LKMM" is not something I want from Rust. I'm more interested in whether our existing idioms can be expressed correctly and efficiently (and in the best case, elegantly ;-)) in Rust, and I expect that in the process something will happen to Rust (introducing new APIs, e.g.). Currently, one of the biggest concerns I hear from other kernel developers is this note in the Rust reference: "Rust does not yet have a defined memory model."
Because they are afraid that Rust still doesn't have a memory (ordering) model... So besides saying hi, I also want to ask what I can show to the kernel developers about the status and progress on Rust memory model formalization? My guess is that it will eventually happen in MiniRust, if I understand the purpose of that project correctly. Btw, I like that project, and I'm going to keep an eye on it ;-) A few comments below:
Not sure it fulfils your expectation, but we do support checking small pieces of code with herd; you can find the checking rules and samples in tools/memory-model of the Linux kernel source.
We do have "AN OPERATIONAL MODEL" section in our tools/memory-model/Documentation/explanation.txt, but I guess it's just Alan's brief idea on the operational model. But given we define the formal check rules in herd, so our operational model should be an extension (or specification) for the operational model used by herd in the paper
Below is what immediately comes to my mind:
Anyway, I'm sure there are a lot of "interesting" places left to discover. Paul also wrote a blog series about "cases and opportunities", and it's worth reading. Just saying hi, and if there is any question about the LKMM or the Linux kernel overall, feel free to ask!
@fbq So a few questions:
To elaborate on the last one: A pattern that (IIUC) is common in kernel code is a bunch of ordinary stores followed by a barrier and a final store that publishes everything. The compiler should be free to reorder stores before the barrier, but not to reorder across the barrier.
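As a rough illustration (a hedged sketch with made-up DATA/READY names, using relaxed atomics in place of the kernel's plain stores), the same publication pattern expressed against the C++/Rust model uses a release store as the combined "barrier plus final store":

    use core::sync::atomic::{AtomicBool, AtomicU64, Ordering};

    static DATA1: AtomicU64 = AtomicU64::new(0);
    static DATA2: AtomicU64 = AtomicU64::new(0);
    static READY: AtomicBool = AtomicBool::new(false);

    fn publish() {
        // Relaxed initialisation stores; the compiler may reorder these
        // among themselves.
        DATA1.store(1, Ordering::Relaxed);
        DATA2.store(2, Ordering::Relaxed);
        // The release store is the publication point: none of the stores
        // above may be moved past it.
        READY.store(true, Ordering::Release);
    }

    fn consume() -> Option<(u64, u64)> {
        if READY.load(Ordering::Acquire) {
            // The acquire load pairs with the release store, so the
            // published values are visible here.
            Some((DATA1.load(Ordering::Relaxed), DATA2.load(Ordering::Relaxed)))
        } else {
            None
        }
    }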
Also there is always the read-ahead-lock "optimization" as follows:
for … However, in Rust-for-Linux, Rust code needs to work with C code. For example, a struct may be defined in C and one of its fields is defined as
but all accesses to
Could you provide an example? I'm not sure how a data dependency can be turned into a control dependency.
If the code fits the acquire+release pattern, then it's better to use acquire+release. And "ordinary stores followed by a publish store" falls into this category. An interesting case is "published by load":
The Linux memory model guarantees that if r0 is 0, then r1 == 1 and r2 == 2, because if … There is also a paper describing how to map C/C++ atomics and fences to the Linux kernel's atomics and fences. Hope that is helpful.
Note I'm not that familiar with MMIO either, but here is what I know: the Linux kernel has two sets of APIs for MMIO:
There are also four barriers that work with the
None of the above are modeled by the LKMM, so their formal semantics are unclear (they are only informally described in code). IIUC,
A few points: First, keep in mind that you could simply transliterate the C versions of … As you say, there may be room to improve things on both the C and Rust sides, but I think it's important to establish this as a baseline possibility.
You can't. LLVM and GCC are unable to guarantee that address dependencies will be respected in the face of arbitrary compiler optimizations. Hence, for C11/C++11 atomics, they fall back to implementing … The only guaranteed way to do a dependent read without a barrier would be to write the entire sequence of read followed by dependent read in assembly (they must be together in a single asm block).

This is all true in both Rust and C. But Linux's C code relies on preservation of address dependencies despite the lack of guarantee, taking advantage of the fact that in practice, GCC and LLVM will only break them in rather pathological cases. Linux's Rust code could do the same, and again, it'll work or not work to the same extent as the C code.

A lack of guarantee may be a concern when it comes to safety. Rust normally attempts to guarantee memory safety for safe code even if that code is pathological. And many use cases for atomics rely on correct ordering for memory safety. So you could imagine a fully-safe function calling another function (in another module) which is safe but with unsafe code inside, where the callee is expected to be inlined; the caller function then includes pathological code designed to trick the optimizer into dropping address dependencies in the inlined copy of the callee. But then, the same concern may apply to Rust code that calls C functions, especially if cross-language LTO is enabled (in which case C code can be inlined into Rust). It doesn't seem to me to be worth worrying that much about, but YMMV.
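To make the inline-assembly route concrete, here is a hedged sketch (assuming AArch64 and a made-up helper name) of keeping a load and its dependent load together in one asm block, so the compiler never gets a chance to break the address dependency:

    use core::arch::asm;

    // Hypothetical helper: read a pointer, then read through it, with both
    // loads in a single asm block. On AArch64 the address dependency between
    // the two ldr instructions orders them without an explicit barrier.
    unsafe fn dependent_read(head: *const *const u64) -> u64 {
        let val: u64;
        asm!(
            "ldr {ptr}, [{head}]",   // load the pointer
            "ldr {val}, [{ptr}]",    // dependent load through that pointer
            head = in(reg) head,
            ptr = out(reg) _,
            val = out(reg) val,
            options(nostack, readonly),
        );
        val
    }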
This seems like something that should probably be solved at the bindgen level, either with annotations in the C code that are understood by bindgen (say, a
Has there been progress on this on the LLVM/GCC side? There is a reason that Linux behaves as it does, and that is that doing so is critical for best performance on modern hardware. And no, doing the whole thing in assembler is not an option. It is worth noting that for Spectre v1 mitigation, it is actually necessary for the compiler to promote control dependencies to address dependencies in many cases.
The Linux kernel memory model predates C++11. C++11's memory model was designed with compiler optimization in mind, while the Linux kernel memory model was designed under the assumption that it is possible to disable certain optimizations in compilers, even though neither GCC nor LLVM actually supports this. And even then the Linux kernel memory model isn't as efficient as possible on all architectures. Alpha doesn't respect "read-to-read address dependencies" (aka consume atomics still need a fence), while Itanium doesn't guarantee consistent "read-to-read ordering for normal loads from the same variable", according to https://lwn.net/Articles/718628/.
That happens after all optimizations that break address dependencies are already performed.
That is fair, but I am still not aware of a way to get similar performance without exploiting address dependencies.
Both of these architectures are pretty much dead.
Let's see if I understand your point ;-) So your point is "whatever you ask from the Rust memory model, it must be something supported by LLVM/GCC, because currently you get the same guarantees for C"? If so, I must say I expect more: the LKMM describes the conceptual model that kernel developers have for parallel programming, and of course I understand there are a lot of fairy tales behind it, and no real-world compiler can guarantee everything required by the model. We are lucky because Linux is too big to break (or compilers have mercy on us), so the model can be used ;-) But it's a model describing programmers' ideas, and there are tons of examples using them inside the kernel source, so why not look into them and see how to support the idioms (not the model itself) without any fairy tale?

My (maybe juvenile) idea is that while Rust is refining its memory model, some idioms of the LKMM can be taken into consideration for support, and sometimes that may mean changing the backend (LLVM/GCC). IIUC the LLVM memory model design today is mostly a result of the influence of the C++11 memory model, so given that the Linux kernel is probably one of the biggest projects involved in parallel programming, maybe we get a chance to affect the LLVM memory model too ;-)

To be clear, by "supporting idioms", I don't mean changing C code line by line into Rust; if Rust already has a pattern, we should fit into it. There must be a way that two groups of smart people (language/compiler people and OS kernel developers) can work together instead of one saying "UB" and the other saying "volatile".
Yeah, that also works. And for the Linux kernel, we should probably fix those areas where a
That's true, we use a "stronger than a volatile read" implementation for … To be clear, kernel developers (including me) love optimization; in any case, we want to enable the optimization instead of disabling it. I think it's somewhat unfair to say that the Linux kernel memory model is designed under the assumption that we can disable certain optimizations of the compilers, because most of the idioms the LKMM supports come from abstracting patterns in assembly code: we know x86 provides address dependencies, we know ARM64 provides address dependencies, we know PPC provides address dependencies, etc. So we abstract the idea of address dependencies to a higher level. It's simply "if we are going to write similar patterns of assembly code for all architectures, why don't we abstract this pattern into a high-level language?". And abstracting patterns of assembly code is, IIUC, one of the reasons that C was invented and widely used. Disabling the optimization (assuming we can disable it) is unfortunately a side effect of this kind of abstraction. If a language memory model can cover the abstraction for these patterns, I think I'm happy to switch to it ;-)
Hi @fbq, nice to hear from you! I am super excited about Rust coming to the Linux kernel, so I very much hope we can find something here that works for both sides. :)
Note that this does not refer to a concurrency memory model, this refers to the sequential part of the memory model. Arguably, C does not have a defined memory model in that sense either -- there is an implicit model hiding in the C spec, but it has not been properly described yet. Some people have tried, but there are still a lot of open questions. Rust has very high standards for what it calls a defined memory model (basically it has to be mathematically unambiguous), which is why we haven't committed to anything official yet. We are working on it though, and we do hope to form an official "operational semantics team" soon that has this as one of its core objectives. :) But for the purpose of this thread, we are talking about the concurrency part of the memory model, and there Rust (for now) just does exactly what C and C++ do. So if you consider the C/C++ concurrency memory models defined, then so is the Rust model.
Thanks for pointing that out, I was not aware of this! This file does look like the heart of a graph-based memory model indeed. Is there any systematic work on comparing this with the C++ model?
Yeah, consume is basically unworkable in any high-level language. To my knowledge nobody has managed to come up with a satisfying definition in any of C, C++, LLVM, or Rust. The very notion of a "data dependency" is meaningless when the compiler is allowed to replace "X * 0" by "0", and thus remove dependencies (to give just a simple example). So, consume-like reasoning can pretty much only be done in inline assembly, and by using the hardware notion of a data dependency.
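A tiny illustrative sketch of that point: the source-level dependency on idx below disappears as soon as the compiler folds the multiplication, so there is nothing left for consume-style ordering to attach to.

    // `idx * 0` looks like a data dependency on `idx`, but it folds to 0,
    // so the load address becomes a constant and the dependency vanishes.
    fn read_slot(table: &[u64; 4], idx: usize) -> u64 {
        table[idx * 0]
    }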
So the requirement here would be that whatever the Rust side does is ABI compatible with what the C side does. Honestly you are probably best off by defining your own
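For concreteness, here is a rough sketch of what such a dedicated, layout-compatible type could look like (hypothetical names, with volatile accessors in the spirit of READ_ONCE/WRITE_ONCE; note that, as discussed elsewhere in this thread, racy volatile accesses are still not blessed by the Rust memory model):

    use core::cell::UnsafeCell;

    // Hypothetical wrapper with the same layout as a plain C int, so it can
    // appear inside bindgen-generated structs.
    #[repr(transparent)]
    pub struct KernelInt(UnsafeCell<i32>);

    // All concurrent accesses go through the accessors below.
    unsafe impl Sync for KernelInt {}

    impl KernelInt {
        /// READ_ONCE()-style accessor: a single read the compiler will not
        /// tear, duplicate, or elide.
        pub fn read_once(&self) -> i32 {
            unsafe { core::ptr::read_volatile(self.0.get()) }
        }

        /// WRITE_ONCE()-style accessor.
        pub fn write_once(&self, val: i32) {
            unsafe { core::ptr::write_volatile(self.0.get(), val) }
        }
    }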
I surely hope there is such a way. :) But Rust is currently quite far away from attempting to define its own concurrency memory model. We don't have the expertise for that, and we are busy specifying other parts of the language (and we put a higher standard on what we consider a 'specification' than C/C++ do). And specifically consume has some very fundamental problems where there haven't even been proposals for how to overcome them. As a start, someone would need to figure out how to specify 'consume' without uttering the words 'data dependency'. That seems hard. ;) (I think it will involve some explicit annotations, where you give names to statements and say "ensure that that statement is suitably ordered with that other statement".) So I am curious now what the goal of this thread here is:
Sadly, the C of today is unsuited for this task, and so is Rust. Neither of these languages is a 'macro assembler', even if C was intended to start out as one. The moment the C specification started to talk about an 'abstract machine' and the 'as-if rule', it stopped being a language to abstract over assembly, and started being a high-level language with its own abstract notion of memory. The goals of having strong compiler optimizations and abstracting over assembly patterns are fundamentally at odds with each other. It is impossible to have both.

Therefore what we do instead is we aim to provide operations that can be described entirely in the abstract, but that lead to the kind of assembly code that is needed for maximal performance. This leads to a lot of complexity such as pointer provenance, a very funny notion of "uninitialized memory" -- and a concurrency memory model that needs to be considered in its own terms, and where thinking of it like "x86 does this, ARM does that, Power does that" quickly leads to incorrect results. Any time a programmer comes and says "but here I have this assembly pattern that I want to generate and I cannot write well-defined C or Rust code that does that", that's a gap in the language that, ideally, should be fixed by extending the abstract language such that the programmer can get what they want and need. Sadly, for consume and data dependencies, nobody has figured out how to do that yet...

There has been some interesting work on an alternative approach to relaxed memory; one of my favorites is this paper, and there is a talk available here. This calculus is defined in a more abstract way than the C++ memory model, leaving it to the compiler to ensure the required ordering in the most efficient way -- so if the compiler can generate assembly code with a certain data dependency, it can use that and doesn't have to also generate fences or anything like that.
That is what 'relaxed' accesses are designed to allow.
There is a way, it is to use inline assembly. :)
Indeed I think that is probably the best approach. It avoids accidentally limiting how we can evolve the Rust memory model (that underlies our official …). Moving to the official
That would work for one or two uses. As you can probably imagine, it will not work for every single use of
Without knowing the precise details of that function in C, usually in Rust you would write that function as an … You can even make it a macro_rules! that expands to an asm block if you want each specific use to perform its own dedicated register allocation.
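A hedged sketch of that macro_rules idea (hypothetical name, reusing the AArch64 dependent-load sequence sketched earlier), so each call site expands to its own asm block with its own register allocation:

    // Hypothetical: expands to an inline-asm dependent load at each use site.
    macro_rules! read_dependent {
        ($head:expr) => {{
            let head: *const *const u64 = $head;
            let val: u64;
            unsafe {
                core::arch::asm!(
                    "ldr {ptr}, [{head}]",
                    "ldr {val}, [{ptr}]",
                    head = in(reg) head,
                    ptr = out(reg) _,
                    val = out(reg) val,
                    options(nostack, readonly),
                );
            }
            val
        }};
    }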
Likewise ;-) Thanks for the reply. I will leave my reply to the "goal of the thread" to a separate comment because that may need to get more people's attention and opinions.
Thanks for the explanation. I'm all for models with high standards; that's one of the things I love about the Rust community.
The academic effort on the Linux Kernel Memory Model can be found here, and there is a "Comparison with the C11 model" section. However, the comparison is more about different results of litmus tests for the two models, rather than the models themselves. Plus, the LKMM models RCU ordering, which was not part of C++ until recently.
This is one proposal to improve the situation. But I don't know whether the committee loves it or not ;-)
I will leave this to the "goal of the thread" comment.
Understood. And good things are worth waiting for ;-) I'm simply here to say: if Rust is going to define its own concurrency memory model or refine LLVM's model, please do take Linux kernel parallel idioms into consideration, and if answers from kernel developers are needed, I'm always here ;-)
Totally agreed! The abstract language should be designed to help programmers rather than trick them ;-)
Thanks for the pointer, I will certainly take a look.
Yes, we should absolutely do that! And thanks for the offer. :) I am also excited about the work on the C++ side to support some of the LKMM idioms, and curious what will come out of that. Rust people are closely watching what C++ does in this space.
I understand that the memory model for the second option is indeed challenging (although I'm happy to see we finally have one), but that requires a lot of smart brains and many years. So I would say let's not block experimenting with Rust for Linux kernel development because of that. However, if we choose the first option, I think I do have some concerns as a developer of the Rust-for-Linux project, so I'm adding other maintainers and reviewers to the discussion: @ojeda @wedsonaf @alex @bjorn3 @nbdd0121

The first option basically means that if Rust-side code wants to sync with C-side code, there will be UB, and if I understand correctly we actually try very hard to prevent UB in the Rust-for-Linux project. I'm not sure how others feel about this... If we are going to allow UB in the Rust-for-Linux project, there is more work to do: we will have to maintain each instance, so that we can remove them once we can. One idea that comes to my mind is an unsound! macro (a rough sketch of a possible definition follows at the end of this comment):

    unsound! {
        // Assuming a volatile-read data race is safe.
        some_ptr.read_volatile()
    }

or we can define … Also, if we use the
and
, not only will we have to implement them, but we will also need documentation or guidelines describing their usage, for example:
Anything else we should do?
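For illustration, here is a minimal sketch of what such an unsound! marker macro could look like; this is only a hypothetical shape, not an agreed-upon design:

    // Hypothetical marker macro: semantically a no-op, but it makes every
    // documented-unsound region easy to grep for and audit.
    macro_rules! unsound {
        ($($body:tt)*) => {{
            $($body)*
        }};
    }

    // Example use, mirroring the snippet above:
    // let v = unsafe { unsound! { some_ptr.read_volatile() } };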
Rust should support the Linux kernel memory model in addition to the C/C++ memory model. This means that all of the primitives provided and used by Linux should be able to be implemented efficiently, and that practices used by the Linux kernel (such as using non-atomic reads/writes + barriers instead of atomic operations) should not be UB.