Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use rustc directly instead of cargo #52

Merged
merged 1 commit into from
Jan 21, 2021
Merged

Use rustc directly instead of cargo #52

merged 1 commit into from
Jan 21, 2021

Conversation

ojeda
Copy link
Member

@ojeda ojeda commented Dec 9, 2020

This is a big PR, but most of it is interdependent to the rest.

  • Shared Rust infrastructure: libkernel, libmodule, libcore, liballoc, libcompiler_builtins.

    • The Rust modules are now much smaller since they do not contain several copies of those libraries. Our example .ko on release is just 12 KiB, down from 1.3 MiB. For reference (i.e. the bulk is now shared):
      • vmlinux on release with Rust is 23 MiB (compressed: 2.1 MiB)
      • vmlinux on release without Rust is 22 MiB (compressed: 1.9 MiB)
    • Multiple builtin modules are now supported since their symbols do not collide against each other (fixes Fix building several modules as built-in #9).
    • Faster compilation (less crates to compile & less repetition).
    • We achieve this by compiling all the shared code to .rlibs (and the .so for the proc macro). For loadable modules, we need to rely on the upcoming v0 Rust mangling scheme, plus we need to export the Rust symbols needed by the .kos.
  • Simpler, flat file structure: now a small driver may only need a single file like drivers/char/rust_example.rs, like in C.

    • All the rust/* and driver/char/rust_example/* files moved to fit in the new structure: less files around.
  • Only rust-lang/{rust,rust-bindgen,compiler-builtins} as dependencies.

    • Also helps with the faster compilation.
  • Dependency handling integration with Kbuild/fixdep.

    • Changes to the Rust standard library, kernel headers (bindings), rust/ source files, .rs changes, command-line changes, flag changes, etc. all trigger recompilation as needed.
    • Works as expected with parallel support (-j).
  • Support for custom --sysroot via KRUSTCFLAGS.

  • Proper make clean support.

  • Offline builds by default (there is no "online compilation" anymore; fixes Create a way for offline building #17).

  • No interleaved Cargo output (fixes Makefile ui #29).

  • No nightly dependency on Cargo's build-std; since now we manage the cross-compilation ourselves (should fix Compilation error: can't find crate for std; the x86_64-linux-kernel target may not be installed #27).

  • Since now a kernel can be "Rust-enabled", a new CONFIG_RUST option is added to enable/disable it manually, regardless of whether one has rustc available or not (CONFIG_HAS_RUST).

  • Improved handling of rustc flags (opt-level, debuginfo, etc.), following what the user selected for C (no Cargo profiles).

  • Added Kconfig menu for tweaking relevant rustc options, like overflow checks, debug assertions, optimization level, etc.

  • This rewrite of the Kbuild support is cleaner, i.e. less hacks in general handling paths (e.g. no more shell readlink for O=).

  • Duplicated the example driver 3 times so that we can test in the CI that 2 builtins and 2 loadables work, all at the same time.

  • Updated the quick start guide.

  • Updated CI .configs:

    • Add the new options and test with 2 builtins and 2 loadables. At the same time, remove the matrix test for builtin/loadable.
    • Debug: more things enabled (debuginfo, kgdb, unit testing, etc.) that mimic more what a developer would have. Running the CI will be slightly slower, but should be OK.
    • Release: disabled EXPERT and changed a few things to make it look more like a normal configuration.
    • Also update both configs to v5.9 while I was at it.

    (I could have split a few of these ones off into another PR, but anyway it is for the CI only and I had already done it).

  • Less extern crates needed since we pass it via rustc (closer to idiomatic 2018 edition Rust code).

Things to note:

  • There is two more nightly features used:

    • The new Rust mangling scheme: we know it will be stable (and the default on, later on).
    • The binary dep-info output: if we remove all other nightly features, this one can easily go too.
  • The hack at exports.c to export symbols to loadable modules.

  • The hack at allocator.rs to get the __rust_*() functions.

Signed-off-by: Miguel Ojeda [email protected]

@alex
Copy link
Member

alex commented Dec 9, 2020

Looks like this got hit by some merge conflicts :-(

@joshtriplett I'm curious what your view on this is.

(I'll try to get a real review later today)

@ojeda
Copy link
Member Author

ojeda commented Dec 9, 2020

Looks like this got hit by some merge conflicts :-(

You merged #44 right on time! ;-)

Let me solve it, and see if I got all the CI changes correct.

(I'll try to get a real review later today)

No rush! I tried to ease the pain with the description -- (see the "Things to note" section for Rust-related bits).

@ojeda ojeda force-pushed the rust-rustc-kbuild branch 3 times, most recently from 16f6e04 to ac19154 Compare December 9, 2020 17:50
@ojeda
Copy link
Member Author

ojeda commented Dec 9, 2020

Done -- rebased and CI passing.

@ojeda
Copy link
Member Author

ojeda commented Dec 9, 2020

@kloenk After your exams, you may want to take a look to the Kbuild changes for O= (simpler now, shell readlink is gone, etc.), the new output (#29) and whether this offline compilation works for you in Nix (#17).

@kloenk
Copy link
Member

kloenk commented Dec 9, 2020

@kloenk After your exams, you may want to take a look to the Kbuild changes for O= (simpler now, shell readlink is gone, etc.), the new output (#29) and whether this offline compilation works for you in Nix (#17).

Yes, I would like to take a look, but should first concentrate on my exams ;-)

Hopefully have time to check on the weekend.

.github/workflows/ci.yaml Outdated Show resolved Hide resolved
rust/exports.c Outdated Show resolved Hide resolved
rust/kernel/file_operations.rs Show resolved Hide resolved
@wedsonaf
Copy link

wedsonaf commented Dec 9, 2020

I really like the idea of using rustc directly.

I'm getting the following error when building this (without any changes on my side):

  DESCEND  objtool
  CALL    scripts/atomic/check-atomics.sh
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  RUSTC     drivers/char/rust_example.o
  AR      drivers/char/built-in.a
  AR      drivers/built-in.a
  GEN     .version
  CHK     include/generated/compile.h
  UPD     include/generated/compile.h
  CC      init/version.o
  AR      init/built-in.a
  LD      vmlinux.o
  MODPOST vmlinux.symvers
  MODINFO modules.builtin.modinfo
  GEN     modules.builtin
  LD      .tmp_vmlinux.kallsyms1
ld.lld: error: undefined symbol: __rust_alloc_error_handler
>>> referenced by alloc.3a1fbbbh-cgu.0
>>>               ./rust/alloc.o:(_RNvNtCsbDqzXfLQacH_5alloc5alloc18handle_alloc_error)
make: *** [Makefile:1207: vmlinux] Error 1

@ojeda
Copy link
Member Author

ojeda commented Dec 9, 2020

I really like the idea of using rustc directly.

Thanks, and for trying it out too!

I'm getting the following error when building this (without any changes on my side):

Hmm... that's the other magic symbol like __rust_alloc and friends -- for some reason rustc is generating a reference to it. It'd be nice to know from where it comes from (what is your LLVM/rustc/.config?). In any case, it is a good idea to define it (and likely to export it for loadable modules too, plus the rest of the __rust_* ones).

Could you please try to define it in allocator.rs to call our oom() (or just simply panic! directly)? (according to [1] it says it is supposed to call __rg_oom, which I assume then calls the defined handler, which would be our oom()). If that works, I can go ahead and update the PR.

[1] https://github.com/rust-lang/rust/blob/f0f68778f798d6d34649745b41770829b17ba5b8/library/alloc/src/alloc.rs#L340

@wedsonaf
Copy link

wedsonaf commented Dec 9, 2020

Hmm... that's the other magic symbol like __rust_alloc and friends -- for some reason rustc is generating a reference to it. It'd be nice to know from where it comes from (what is your LLVM/rustc/.config?). In any case, it is a good idea to define it (and likely to export it for loadable modules too, plus the rest of the __rust_* ones).

I'm using clang+llvm-11.0.0-x86_64, rustc 1.50.0-nightly (1c389ffef 2020-11-24), and .config is defconfig plus CONFIG_RUST and CONFIG_RUST_EXAMPLE=y.

Could you please try to define it in allocator.rs to call our oom() (or perhaps __rg_oom)? (according to [1] I think that is what it is supposed to do). If that works, I can go ahead and update the PR.

It works if I define it as a C function (extern "C", no mangle). Any idea why you don't get the same?

While experimenting with the function definition here I noticed something else: make is not detecting that it needs to rebuild libkernel.rlib when I modify allocator.rs or lib.rs (I haven't tried other files); I have to delete libkernel.rlib for these files to be rebuilt. Can you try to reproduce this on your side too?

@ojeda
Copy link
Member Author

ojeda commented Dec 10, 2020

It works if I define it as a C function (extern "C", no mangle).

Thanks for testing (although it should work without extern "C" too).

Any idea why you don't get the same?

We have a different rustc version (and both nightlies, and mine is much older) and I assume different sources for the standard library too, so my best bet is that something changed in the standard library that now calls the OOM handler. It should have been the case before anyway since rust_example does a heap allocation (unless the stdlib does SSO or something like that).

While experimenting with the function definition here I noticed something else: make is not detecting that it needs to rebuild libkernel.rlib when I modify allocator.rs or lib.rs (I haven't tried other files); I have to delete libkernel.rlib for these files to be rebuilt. Can you try to reproduce this on your side too?

Sorry, I should have warned you. Yeah, that is expected at the moment, since the *.d files are unused (it is in the TODOs list I wrote above).

@wedsonaf
Copy link

Sorry, I should have warned you. Yeah, that is expected at the moment, since the *.d files are unused (it is in the TODOs list I wrote above).

How hard is it to fix this before merging? I'm afraid not detecting the need to recompile a file will make development on top of this too challenging.

@ojeda
Copy link
Member Author

ojeda commented Dec 11, 2020

How hard is it to fix this before merging? I'm afraid not detecting the need to recompile a file will make development on top of this too challenging.

Let me give it a go and let you know :-)

@ojeda
Copy link
Member Author

ojeda commented Dec 12, 2020

Pushed v2:

  • Dependency handling: everything should trigger a recompilation of the proper pieces (and works in parallel mode, too), in all the following cases:

    • If you change Rust driver code (e.g. drivers/char/rust_example.rs).
    • If you change Rust lib code (e.g. rust/kernel/allocator.rs).
    • If you change a file in the Rust standard library (e.g. you fast-forward that repo or install a new toolchain).
    • If you change a kernel header used by the Rust bindings (e.g. include/linux/sysctl.h).
    • If you change your kernel configuration on something used by the Rust code (e.g. CONFIG_SYSCTL). [*]
  • All steps are now echoed (BINDGEN and RUSTC for the rust/ bits).

  • make clean clears everything now.

  • Symbols are now demangled in exports.c.

  • Improved quick start guide.

[*] This last one relies on the same mechanism as the C side, which means it isn't perfect; but it works well enough (and, in our case, most people working in the Rust side shouldn't be touching too much of the C side at the same time).

Things are more Kbuild-like now, formatting is better, etc. which is also good.

An example of how it looks now:

$ touch .../rust/library/alloc/src/collections/btree/mod.rs
$ make LLVM=1 -j3
  DESCEND  objtool
  CALL    scripts/atomic/check-atomics.sh
  CALL    scripts/checksyscalls.sh
  RUSTC     rust/liballoc.rlib               <-- due to the change (no need to build core, bindings, etc.)
  RUSTC     rust/libkernel.rlib              <-- due to liballoc, serialized
  CHK     include/generated/compile.h
  RUSTC     drivers/char/rust_example.o      <-- due to libkernel
  RUSTC     drivers/char/rust_example_2.o    <-- ditto, parallel
  RUSTC [M] drivers/char/rust_example_3.o    <-- ditto, parallel
  RUSTC [M] drivers/char/rust_example_4.o    <-- ditto, parallel
  AR      drivers/char/built-in.a
...
  MODPOST Module.symvers
  LD [M]  drivers/char/rust_example_3.ko     <-- due to drivers/char/rust_example_3.o
  LD [M]  drivers/char/rust_example_4.ko     <-- due to drivers/char/rust_example_4.o, parallel
  CC      arch/x86/boot/version.o
...

@wedsonaf
Copy link

While trying to use this, I ran into some unexpected cases when libs were not being rebuilt even though I did make changes to them. This is one way to reproduce it:

  1. Build everything without any changes, it should succeed;
  2. Make a change to file_operations.rs that causes it to fail compilation;
  3. Build. It fails as it should.
  4. Build again. It should fail but it succeeds.

Touching other files like miscdev.rs and error.rs have no effect (i.e., the build continues to succeed), but if I touch lib.rs then it fails again (as it should). And now it keeps failing. To get back to the funky state, I need to make the build succeed, then fail. So it looks likes after a successful build, a compilation error in a file other than lib.rs messes up the dependency tracking until lib.rs changes again.

For reference, here is the output of running step 3 for me:

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  RUSTC     rust/libkernel.rlib
error[E0412]: cannot find type `file1` in module `bindings`
     --> rust/kernel/file_operations.rs:15:27
      |
15    |     ptr: *const bindings::file1,
      |                           ^^^^^ help: a struct with a similar name exists: `file`
      |
     ::: /usr/local/google/home/wedsonaf/src/linux-rust/rust/bindings_generated.rs:77235:1
      |
77235 | pub struct file {
      | --------------- similarly named struct `file` defined here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0412`.
fixdep: error opening file: rust/.libkernel.rlib.d: No such file or directory
make[1]: *** [scripts/Makefile.build:403: rust/libkernel.rlib] Error 2
make: *** [Makefile:1244: prepare0] Error 2

And here's the output of step 4, with no changes to file_operations.rs, i.e., it should fail with the same error:

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#11)

@ojeda
Copy link
Member Author

ojeda commented Dec 14, 2020

While trying to use this, I ran into some unexpected cases when libs were not being rebuilt even though I did make changes to them.

My bad, I never tried breaking changes; I was testing the different scenarios with touch. Thanks for catching that, I'll fix it.

Build again. It should fail but it succeeds.

Yeah... fixdep overwrites the deps file with a "there are no deps" file when the input is missing; which makes make believe there are no dependencies to check (and the target wasn't deleted, so no need to redo it either).

@wedsonaf
Copy link

My bad, I never tried breaking changes; I was testing the different scenarios with touch. Thanks for catching that, I'll fix it.

Thanks!

Another question. I see that you've added debugging symbols, but it doesn't seem to be working when I set CONFIG_DEBUG_INFO=y, I get the following:

(gdb) i fun KernelAllocator
All functions matching regular expression "KernelAllocator":

Non-debugging symbols:
0xffffffff81c72720  _RNvXNtCsbDqzXfLQacH_6kernel9allocatorNtB2_15KernelAllocatorNtNtNtCsbDqzXfLQacH_4core5alloc6global11GlobalAlloc5alloc
0xffffffff81c72740  _RNvXNtCsbDqzXfLQacH_6kernel9allocatorNtB2_15KernelAllocatorNtNtNtCsbDqzXfLQacH_4core5alloc6global11GlobalAlloc7dealloc

Do I need to do anything else to get the right symbols?

When I was experimenting with shoving a staticlib into the kernel build, I was getting proper symbols for rust code. For example, I was getting the following (which is also what I was hoping for this):

(gdb) i fun KernelAllocator
All functions matching regular expression "KernelAllocator":

File src/lib.rs:
49:     static fn <kmodule::KernelAllocator as core::alloc::global::GlobalAlloc>::alloc(*mut kmodule::KernelAllocator, core::alloc::layout::Layout) -> *mut u8;
56:     static fn <kmodule::KernelAllocator as core::alloc::global::GlobalAlloc>::dealloc(*mut kmodule::KernelAllocator, *mut u8, core::alloc::layout::Layout);

@ojeda
Copy link
Member Author

ojeda commented Dec 14, 2020

That's because your GDB doesn't know how to demangle the new mangling scheme (which is used for the exports.c trick). However, don't try just yet to build the latest GDB -- the new scheme support is in GCC but they haven't sync'd them to binutils yet.

@ojeda
Copy link
Member Author

ojeda commented Dec 14, 2020

I guess I can put together a GDB fork with support for it that we can use meanwhile. I will also ask the GDB maintainers if they are planning to import the changes soon (i.e. a week or two).

@wedsonaf
Copy link

That's because your GDB doesn't know how to demangle the new mangling scheme (which is used for the exports.c trick). However, don't try just yet to build the latest GDB -- the new scheme support is in GCC but they haven't sync'd them to binutils yet.

Indeed, without -Z symbol-mangling-version=v0 gdb recognises the symbols. Any chance we could make this optional? For example, with a config option for loadable rust modules -- this way we can choose to get symbols instead of rust loadable modules.

Don't worry about it if it's too much work, as it's not really a regression introduced by this PR.

@ojeda
Copy link
Member Author

ojeda commented Dec 14, 2020

Indeed, without -Z symbol-mangling-version=v0 gdb recognises the symbols. Any chance we could make this optional? For example, with a config option for loadable rust modules -- this way we can choose to get symbols instead of rust loadable modules.

Yeah, I think it is a good idea to add an option to the Rust hacking menu for that. I will add a mention to this to the guide, too.

Don't worry about it if it's too much work, as it's not really a regression introduced by this PR.

AFAIU I only need to sync a few files, so it shouldn't be a big deal.

@ojeda
Copy link
Member Author

ojeda commented Dec 14, 2020

Actually, the debug info comes with demangled symbols. At least, my vanilla 10.1 gdb is able to show the demangled one from DWARF even if it doesn't know to demangle them itself. It seems in your case the first snippet you posted was for the CONFIG_DEBUG_INFO disabled, rather than enabled? (it shows Non-debugging symbols).

Anyway, if you want a version with proper demangling for all cases (e.g. when debug info is disabled) and/or or you want the support in binutils etc., I backported the commit on top of the latest releases:

e.g.

No debug info, vanilla GDB 10.1:

Non-debugging symbols:
0xffffffff81222490  _RNvXNtCsbDqzXfLQacH_4core3fmtQNtNtCsbDqzXfLQacH_6kernel6printk13LogLineWriter...

No debug info, patched GDB 10.1:

Non-debugging symbols:
0xffffffff81222490  <&mut kernel::printk::LogLineWriter as core::fmt::Write>::write_char

Vanilla binutils 2.35.1 (either with or without debug info):

$ nm -C vmlinux | grep LogLine | tail -1
ffffffff81222f90 t _RNvYNtNtCsbDqzXfLQacH_6kernel6printk13LogLineWriterNtNtCsbDqzXfLQacH_...

Patched binutils 2.35.1 (either with or without debug info):

$ nm -C vmlinux | grep LogLine | tail -1
ffffffff81222f90 t <kernel::printk::LogLineWriter as core::fmt::Write>::write_fmt

@ojeda
Copy link
Member Author

ojeda commented Dec 16, 2020

Pushed v3:

  • Improve Kbuild integration:

    • Solve unexpected successful build after failure.
    • Changes in the command line and flags are detected.
    • Removal of generated & target files is detected too (except rust/*.os: I prefer to avoid too-fancy tricks until the kernel bumps the minimum GNU Make version).
  • Add Kconfig option for customizing opt-level (by default, it still follows C's; but now that behavior is explained, plus one can pick any other level for Rust from a list).

  • Add demangling case to the guide, and its solutions.

  • Fix example clone commands in the guide.

@ojeda
Copy link
Member Author

ojeda commented Dec 16, 2020

Rebased after the v5.10 sync merge.

@alex alex removed this from the initial-upstream-required milestone Jan 13, 2021
@ojeda
Copy link
Member Author

ojeda commented Jan 13, 2021

@bjorn3 By the way, in case it is not clear (written technical language can be rough at times), all your comments are welcome and I am enjoying the discussion a lot!

We setup from time to time an informal group call to discuss technical bits "face to face", you are welcome to join us! (also consider subscribing to the ML too).

@bjorn3
Copy link
Member

bjorn3 commented Jan 13, 2021

@bjorn3 By the way, in case it is not clear (written technical language can be rough at times), all your comments are welcome and I am enjoying the discussion a lot!

Thanks for letting me know!

(also consider subscribing to the ML too).

Protonmail doesn't seem to have a way to use plain-text emails instead of MIME without using the paid bridge. I prefer to not use my personal mail though.

@ojeda
Copy link
Member Author

ojeda commented Jan 13, 2021

Protonmail doesn't seem to have a way to use plain-text emails instead of MIME without using the paid bridge. I prefer to not use my personal mail though.

I don't know if those MLs are supposed to work with MIME or not. We could ask to see what can be done; I'd assume you are not the first user hitting that problem. A partial solution is polling the archive (it is public).

@bjorn3
Copy link
Member

bjorn3 commented Jan 13, 2021

I don't know if those MLs are supposed to work with MIME or not.

The subscribe bot interpreted every line of the MIME message as a separate command, so it doesn't work.

We could ask to see what can be done; I'd assume you are not the first user hitting that problem.

From what I gathered, they really don't want you to use MIME messages.

A partial solution is polling the archive (it is public).

👍

@bjorn3
Copy link
Member

bjorn3 commented Jan 13, 2021

A quick question, but is --gc-sections passed to the linker? This removes functions that are not ever called. Especially for the standard library only a subset of all available functions is often used.

@ojeda
Copy link
Member Author

ojeda commented Jan 13, 2021

A quick question, but is --gc-sections passed to the linker? This removes functions that are not ever called. Especially for the standard library only a subset of all available functions is often used.

It is an experimental option so far.

@ojeda
Copy link
Member Author

ojeda commented Jan 19, 2021

Pushed v7:

  • CI

    • The matrix learned 2 more dimensions: rustup/standalone installers and custom --sysroot. This should cover the different alternatives we have in the quick start guide, and ensure things keep working smoothly for NixOS users etc. They are tested separately in a few extra combinations.
    • Subdivide Setup phase into sub-ones for clarity -- a few more steps are skipped now if not required for a given combination.
  • Kbuild

  • rustlib

    • Use the proper prefix for the couple rust_helper_copy* this PR adds.
  • Docs

    • Add "Arch Support" page -- this follows the equivalent table I suggested for LLVM, see https://lore.kernel.org/lkml/[email protected]/
    • Document KRUSTCFLAGS.
    • Add minimum versions to the changes.rst table.
    • Add --recurse-submodules to the guide -- not everything is required, but this is the simplest (the CI optimizes this to save time).
    • Move binutils-gdb URLs to Rust-for-Linux.
    • Change -C option to -Coption etc. to make them look closer to C ones.
    • A handful of other small changes to the guide and a couple sentences in Kconfig.

@kloenk kloenk mentioned this pull request Jan 19, 2021
@ojeda
Copy link
Member Author

ojeda commented Jan 19, 2021

With v7 things should be pretty good now that even more people gave it a look since the last call + the CI is now testing even more variations.

Since already at least a couple people are basing their work on top of this PR, I think I will merge this in a couple days or so to simplify life for everyone and go back to small PRs. There are a handful of things still to improve around this PR, but those can wait.

This is a big PR, but most of it is interdependent to the rest.

  - Shared Rust infrastructure: `libkernel`, `libmodule`, `libcore`,
    `liballoc`, `libcompiler_builtins`.

      + The Rust modules are now much smaller since they do not contain
        several copies of those libraries. Our example `.ko` on release
        is just 12 KiB, down from 1.3 MiB. For reference:

            `vmlinux` on release w/  Rust is 23 MiB (compressed: 2.1 MiB)
            `vmlinux` on release w/o Rust is 22 MiB (compressed: 1.9 MiB)

        i.e. the bulk is now shared.

      + Multiple builtin modules are now supported since their symbols
        do not collide against each other (fixes #9).

      + Faster compilation (less crates to compile & less repetition).

      + We achieve this by compiling all the shared code to `.rlib`s
        (and the `.so` for the proc macro). For loadable modules,
        we need to rely on the upcoming v0 Rust mangling scheme,
        plus we need to export the Rust symbols needed by the `.ko`s.

  - Simpler, flat file structure: now a small driver may only need
    a single file like `drivers/char/rust_example.rs`, like in C.

      + All the `rust/*` and `driver/char/rust_example/*` files moved
        to fit in the new structure: less files around.

  - Only `rust-lang/{rust,rust-bindgen,compiler-builtins}` as dependencies.

      + Also helps with the faster compilation.

  - Dependency handling integration with `Kbuild`/`fixdep`.

      + Changes to the Rust standard library, kernel headers (bindings),
        `rust/` source files, `.rs` changes, command-line changes,
        flag changes, etc. all trigger recompilation as needed.

      + Works as expected with parallel support (`-j`).

  - Automatic generation of the `exports.c` list:

      + Instead of manually handling the list, all non-local functions
        available in `core`, `alloc` and `kernel` are exported, so all
        modules should work, regardless of what they need, and without
        failing linking due to symbols in the manual list not existing
        (e.g. due to differences in config options).

      + They are a lot, though:

          * ~6k Rust symbols vs. ~4k C symbols in release.

          * However, 4k of those are `bindings_raw` (i.e. duplicated C
            ones), which shouldn't be exported. Thus we should look
            into making `bindings_raw` private to the crate (at the
            moment, the (first) Rust example requires
            `<kernel::bindings...::miscdevice as Default>::default`).

      + Licensing:

          * `kernel`'s symbols are exported as GPL.

          * `core`'s and `alloc`'s symbols are exported as non-GPL so
            that third-parties can build Rust modules as long as they
            write their own kernel support infrastructure, i.e. without
            taking advantage of `kernel`. This seemed to make the most
            sense compared to other exports from the kernel, plus it
            follows more closely the original licence of the crates.

  - Support for GCC-compiled kernels:

    + The generated bindings do not have meaningful differences in our
      release config, between GCC 10.1 and Clang 11.

    + Other configs (e.g. our debug one) may add/remove types and functions.
      That is fine unless we use them form our bindings.

    + However, there are config options that may not work (e.g.
      the randstruct GCC plugin if we use one of those structs).

  - Support for `arm64` architecture:

    + Added to the CI: BusyBox is cross-compiled on the fly (increased
      timeout a bit to match).

    + Requires weakening of a few compiler builtins and adding
      `copy_{from,to}_user` helpers.

  - Support for custom `--sysroot` via `KRUSTCFLAGS`.

  - Proper `make clean` support.

  - Offline builds by default (there is no "online compilation" anymore;
    fixes #17).

  - No interleaved Cargo output (fixes #29).

  - No nightly dependency on Cargo's `build-std`; since now we manage
    the cross-compilation ourselves (should fix #27).

  - "Big" kallsyms symbol support:

    + I already raised ksym names from 128 to 256 back when I wrote the first
      integration. However, Rust symbols can be huge in debug/non-optimized,
      so I increased it again to 512; plus the module name from 56 to 248.

    + In turn, this required tuning the table format to support 2-byte lengths
      for ksyms. Compression at generation and kernel decompression is covered,
      although it may be the case that some script/tool also requires changes
      to understand the new table format.

  - Since now a kernel can be "Rust-enabled", a new `CONFIG_RUST` option
    is added to enable/disable it manually, regardless of whether one has
    `rustc` available or not (`CONFIG_HAS_RUST`).

  - Improved handling of `rustc` flags (`opt-level`, `debuginfo`, etc.),
    by default following what the user selected for C, but customizable
    through a Kconfig menu. As well as options for tweaking overflow
    checks and debug assertions.

  - This rewrite of the Kbuild support is cleaner, i.e. less hacks
    in general handling paths (e.g. no more `shell readlink` for `O=`).

  - Duplicated the example driver 3 times so that we can test in the CI
    that 2 builtins and 2 loadables work, all at the same time.

  - Do not export any helpers' symbols.

  - Updated the quick start guide.

  - Updated CI:

      + Now we always test with 2 builtins and 2 loadables Rust example
        drivers, removing the matrix test for builtin/loadable.

      + Added `toolchain` to matrix: now we test building with GCC,
        Clang or a full LLVM toolchain.

      + Added `arch` to matrix: now we test both arm64 and x86_64.

      + Added `rustc` to matrix: now we test with a very recent nightly
        as well.

      + Only build `output == build` once to reduce the number
        of combinations.

      + Debug x86_64 config: more things enabled (debuginfo, kgdb,
        unit testing, etc.) that mimic more what a developer would have.
        Running the CI will be slightly slower, but should be OK.
        Also enable `-C opt-level=0` to test that such an extreme works
        and also to see how much bloated everything becomes.

      + Release x86_64 config: disabled `EXPERT` and changed a few things
        to make it look more like a normal desktop configuration,
        although it is still pretty minimal.

      + The configs for arm64 are `EXPERT` and `EMBEDDED` ones,
        very minimal, for the particular CPU we are simulating.

      + Update configs to v5.10.

      + Use `$GITHUB_ENV` to simplify.

  - Less `extern crate`s needed since we pass it via `rustc`
    (closer to idiomatic 2018 edition Rust code).

Things to note:

  - There is two more nightly features used:

      + The new Rust mangling scheme: we know it will be stable
        (and the default on, later on).

      + The binary dep-info output: if we remove all other nightly
        features, this one can easily go too.

  - The hack at `exports.c` to export symbols to loadable modules.

  - The hack at `allocator.rs` to get the `__rust_*()` functions.

  - The hack to get the proper flags for bindgen on GCC builds.

Signed-off-by: Miguel Ojeda <[email protected]>
@ojeda
Copy link
Member Author

ojeda commented Jan 19, 2021

Pushed a small rewording of the "Arch Support" docs page I forgot to do to avoid someone having to review it twice later.

@ojeda ojeda merged commit 780faf2 into rust Jan 21, 2021
@ojeda ojeda deleted the rust-rustc-kbuild branch January 21, 2021 17:36
ojeda added a commit that referenced this pull request Jan 21, 2021
It allows us to save a bit of space, ignore the duplicate object files,
and the archiving steps; e.g.:

      643544 libcompiler_builtins.rlib
    64171752 libcore.rlib

vs.

      530004 libcompiler_builtins.rmeta
    63679866 libcore.rmeta

We couldn't do it right away in [1] because `rustc` required a fix [2,3].
The fix is now in [4] and available since the 2021-01-21 nightly, so now
we can go ahead and make the change.

Fixes #75.

[1] #52
[2] rust-lang/rust#81117
[3] rust-lang/rust#81118
[4] rust-lang/rust@f9275e1

Suggested-by: bjorn3
Signed-off-by: Miguel Ojeda <[email protected]>
ojeda added a commit that referenced this pull request Jan 21, 2021
It allows us to save a bit of space, ignore the duplicate object files,
and the archiving steps; e.g.:

      643544 libcompiler_builtins.rlib
    64171752 libcore.rlib

vs.

      530004 libcompiler_builtins.rmeta
    63679866 libcore.rmeta

We couldn't do it right away in [1] because `rustc` required a fix [2,3].
The fix is now in [4] and available since the 2021-01-21 nightly, so now
we can go ahead and make the change.

Fixes #75.

[1] #52
[2] rust-lang/rust#81117
[3] rust-lang/rust#81118
[4] rust-lang/rust@f9275e1

Suggested-by: bjorn3
Signed-off-by: Miguel Ojeda <[email protected]>
@ojeda ojeda mentioned this pull request Jan 21, 2021
//
// This requires the Rust's new/future `v0` mangling scheme because the default
// one ("legacy") 1) uses a hash suffix which cannot be predicted across
// compiler versions and 2) uses invalid characters for C identifiers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this, but the v0 mangling scheme is also unpredictable across compiler versions. The symbol hash at the end has been replaced with a crate id near the start of the symbol. For example in _RNvCs21hi0yVfW1J_8rust_out4main the crate id is Cs21hi0yVfW1J.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will remove the first part of the sentence. It is actually a bit misleading, because the "hard" requirement is the second one: mixing rustc compiler versions is not something we will support. IIRC I was thinking about possible patch level fixes to rustc stable versions when I wrote this.

ojeda added a commit to ojeda/linux that referenced this pull request Mar 21, 2021
@bjorn3 helpfully noted that symbols do not stay the same across
`rustc` compiler versions.

The overall sentence is anyway a bit misleading, because the "hard"
requirement is the second one: mixing `rustc` compiler versions
is not something we will support. IIRC I was thinking about possible
patch level fixes to `rustc` stable versions when I wrote that.

Link: Rust-for-Linux#52 (comment)
Signed-off-by: Miguel Ojeda <[email protected]>
@nbdd0121
Copy link
Member

Faster compilation (less crates to compile & less repetition).

This sounds weird to me, as splitting into multiple crates make compilation more parallel. If the amount of Rust code grows I think each driver would probably have to be in their own crate, otherwise compilation time would suffer.

@ojeda
Copy link
Member Author

ojeda commented Apr 22, 2021

That comment was not about parallelization but about having less total code (previously, each driver had the entirety of core, alloc, etc. inside.).

Also, please note it is already the case that each module is its own crate (and they are built in parallel, as usual).

ojeda pushed a commit that referenced this pull request Jan 16, 2023
The inline assembly for arm64's cmpxchg_double*() implementations use a
+Q constraint to hazard against other accesses to the memory location
being exchanged. However, the pointer passed to the constraint is a
pointer to unsigned long, and thus the hazard only applies to the first
8 bytes of the location.

GCC can take advantage of this, assuming that other portions of the
location are unchanged, leading to a number of potential problems.

This is similar to what we fixed back in commit:

  fee960b ("arm64: xchg: hazard against entire exchange variable")

... but we forgot to adjust cmpxchg_double*() similarly at the same
time.

The same problem applies, as demonstrated with the following test:

| struct big {
|         u64 lo, hi;
| } __aligned(128);
|
| unsigned long foo(struct big *b)
| {
|         u64 hi_old, hi_new;
|
|         hi_old = b->hi;
|         cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
|         hi_new = b->hi;
|
|         return hi_old ^ hi_new;
| }

... which GCC 12.1.0 compiles as:

| 0000000000000000 <foo>:
|    0:   d503233f        paciasp
|    4:   aa0003e4        mov     x4, x0
|    8:   1400000e        b       40 <foo+0x40>
|    c:   d2800240        mov     x0, #0x12                       // #18
|   10:   d2800681        mov     x1, #0x34                       // #52
|   14:   aa0003e5        mov     x5, x0
|   18:   aa0103e6        mov     x6, x1
|   1c:   d2800ac2        mov     x2, #0x56                       // #86
|   20:   d2800f03        mov     x3, #0x78                       // #120
|   24:   48207c82        casp    x0, x1, x2, x3, [x4]
|   28:   ca050000        eor     x0, x0, x5
|   2c:   ca060021        eor     x1, x1, x6
|   30:   aa010000        orr     x0, x0, x1
|   34:   d2800000        mov     x0, #0x0                        // #0    <--- BANG
|   38:   d50323bf        autiasp
|   3c:   d65f03c0        ret
|   40:   d2800240        mov     x0, #0x12                       // #18
|   44:   d2800681        mov     x1, #0x34                       // #52
|   48:   d2800ac2        mov     x2, #0x56                       // #86
|   4c:   d2800f03        mov     x3, #0x78                       // #120
|   50:   f9800091        prfm    pstl1strm, [x4]
|   54:   c87f1885        ldxp    x5, x6, [x4]
|   58:   ca0000a5        eor     x5, x5, x0
|   5c:   ca0100c6        eor     x6, x6, x1
|   60:   aa0600a6        orr     x6, x5, x6
|   64:   b5000066        cbnz    x6, 70 <foo+0x70>
|   68:   c8250c82        stxp    w5, x2, x3, [x4]
|   6c:   35ffff45        cbnz    w5, 54 <foo+0x54>
|   70:   d2800000        mov     x0, #0x0                        // #0     <--- BANG
|   74:   d50323bf        autiasp
|   78:   d65f03c0        ret

Notice that at the lines with "BANG" comments, GCC has assumed that the
higher 8 bytes are unchanged by the cmpxchg_double() call, and that
`hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and
LL/SC versions of cmpxchg_double().

This patch fixes the issue by passing a pointer to __uint128_t into the
+Q constraint, ensuring that the compiler hazards against the entire 16
bytes being modified.

With this change, GCC 12.1.0 compiles the above test as:

| 0000000000000000 <foo>:
|    0:   f9400407        ldr     x7, [x0, #8]
|    4:   d503233f        paciasp
|    8:   aa0003e4        mov     x4, x0
|    c:   1400000f        b       48 <foo+0x48>
|   10:   d2800240        mov     x0, #0x12                       // #18
|   14:   d2800681        mov     x1, #0x34                       // #52
|   18:   aa0003e5        mov     x5, x0
|   1c:   aa0103e6        mov     x6, x1
|   20:   d2800ac2        mov     x2, #0x56                       // #86
|   24:   d2800f03        mov     x3, #0x78                       // #120
|   28:   48207c82        casp    x0, x1, x2, x3, [x4]
|   2c:   ca050000        eor     x0, x0, x5
|   30:   ca060021        eor     x1, x1, x6
|   34:   aa010000        orr     x0, x0, x1
|   38:   f9400480        ldr     x0, [x4, #8]
|   3c:   d50323bf        autiasp
|   40:   ca0000e0        eor     x0, x7, x0
|   44:   d65f03c0        ret
|   48:   d2800240        mov     x0, #0x12                       // #18
|   4c:   d2800681        mov     x1, #0x34                       // #52
|   50:   d2800ac2        mov     x2, #0x56                       // #86
|   54:   d2800f03        mov     x3, #0x78                       // #120
|   58:   f9800091        prfm    pstl1strm, [x4]
|   5c:   c87f1885        ldxp    x5, x6, [x4]
|   60:   ca0000a5        eor     x5, x5, x0
|   64:   ca0100c6        eor     x6, x6, x1
|   68:   aa0600a6        orr     x6, x5, x6
|   6c:   b5000066        cbnz    x6, 78 <foo+0x78>
|   70:   c8250c82        stxp    w5, x2, x3, [x4]
|   74:   35ffff45        cbnz    w5, 5c <foo+0x5c>
|   78:   f9400480        ldr     x0, [x4, #8]
|   7c:   d50323bf        autiasp
|   80:   ca0000e0        eor     x0, x7, x0
|   84:   d65f03c0        ret

... sampling the high 8 bytes before and after the cmpxchg, and
performing an EOR, as we'd expect.

For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note
that linux-4.9.y is oldest currently supported stable release, and
mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run
on my machines due to library incompatibilities.

I've also used a standalone test to check that we can use a __uint128_t
pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM
3.9.1.

Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double")
Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
Reported-by: Boqun Feng <[email protected]>
Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/
Reported-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: Arnd Bergmann <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
7 participants