asm: add initial infrastructure for an external assembler
This change adds some initial logic implementing an external assembler
for Cranelift's x64 backend, as proposed in RFC [bytecodealliance#41].

This adds two crates:
- the `cranelift/assembler/meta` crate defines the instructions; to
  print out the defined instructions use `cargo run -p
  cranelift-assembler-meta`
- the `cranelift/assembler` crate exposes the generated Rust code for
  those instructions; to see the path to the generated code use `cargo
  run -p cranelift-assembler`

The assembler itself is straightforward enough (modulo the code
generation, of course); its integration into `cranelift-codegen` is the
trickiest part of this change. Instructions that we will emit in the
new assembler are contained in the `Inst::External` variant. This
unfortunately increases the memory size of `Inst`, but only temporarily
if we end up removing the extra `enum` indirection by adopting the new
assembler wholesale. Another integration point is ISLE: we generate ISLE
definitions and a Rust helper macro to make the external assembler
instructions accessible to ISLE lowering.
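
To illustrate the size concern, here is a minimal sketch using hypothetical
stand-in types (`OldInst`, `ExternalInst` and `NewInst` are assumptions for
illustration, not Cranelift's real definitions). An `enum` is at least as large
as its largest variant, so wrapping a larger external instruction grows every
value of the outer type:

```rust
use std::mem::size_of;

/// Hypothetical payload standing in for an instruction from the new assembler.
#[allow(dead_code)]
pub struct ExternalInst {
    bytes: [u8; 32],
}

/// Hypothetical instruction enum before the change.
#[allow(dead_code)]
pub enum OldInst {
    Nop,
    AluRmiR { op: u8, src: u32, dst: u32 },
}

/// The same enum with an extra variant wrapping the external instruction.
#[allow(dead_code)]
pub enum NewInst {
    Nop,
    AluRmiR { op: u8, src: u32, dst: u32 },
    External { inst: ExternalInst },
}

/// Returns `(size of OldInst, size of NewInst)` in bytes.
pub fn inst_sizes() -> (usize, usize) {
    (size_of::<OldInst>(), size_of::<NewInst>())
}
```

Calling `inst_sizes()` and comparing the two values shows the growth; the exact
numbers depend on the compiler's layout choices.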

This change introduces some duplication: the encoding logic (e.g. for
REX instructions) currently lives both in `cranelift-codegen` and the
new assembler crate. The `Formatter` logic for the assembler `meta`
crate is quite similar to the other `meta` crate. This minimal
duplication felt worth the additional safety provided by the new
assembler.

The `cranelift-assembler` crate is fuzzable (see the `README.md`). It
will generate instructions with randomized operands and compare their
encoding and pretty-printed string to a known-good disassembler,
currently `capstone`. This gives us confidence we previously didn't have
regarding emission. In the future, we may want to think through how to
fuzz (or otherwise check) the integration between `cranelift-codegen`
and this new assembler level.
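
The round-trip idea can be sketched as follows. This is a simplified
illustration, not the crate's actual `fuzz::roundtrip`: `AndbI`,
`reference_disassemble` and `check_all` are hypothetical names, the reference
disassembler is a hand-written stand-in for `capstone`, and the operands are
enumerated exhaustively rather than generated by a fuzzer. The only encoding
used is `AND AL, imm8` (opcode `0x24` followed by the immediate byte).

```rust
/// Hypothetical, simplified instruction: `AND AL, imm8`.
#[derive(Clone, Copy, Debug)]
pub struct AndbI {
    pub imm8: u8,
}

impl AndbI {
    /// Encode the instruction as machine code: opcode 0x24, then the immediate.
    pub fn encode(&self) -> Vec<u8> {
        vec![0x24, self.imm8]
    }

    /// Pretty-print the instruction in an AT&T-like syntax.
    pub fn pretty(&self) -> String {
        format!("andb $0x{:x}, %al", self.imm8)
    }
}

/// Stand-in for a known-good disassembler (an assumption for this sketch;
/// the commit uses `capstone` for this role).
pub fn reference_disassemble(bytes: &[u8]) -> Option<String> {
    match bytes {
        [0x24, imm] => Some(format!("andb $0x{imm:x}, %al")),
        _ => None,
    }
}

/// Encode `inst`, decode the bytes with the reference, and compare strings.
pub fn roundtrip(inst: &AndbI) -> Result<(), String> {
    let bytes = inst.encode();
    let expected = reference_disassemble(&bytes)
        .ok_or_else(|| format!("reference could not decode {bytes:x?}"))?;
    let actual = inst.pretty();
    if actual == expected {
        Ok(())
    } else {
        Err(format!("mismatch: {actual} != {expected}"))
    }
}

/// Check every possible immediate instead of sampling randomly.
pub fn check_all() -> Result<(), String> {
    for imm8 in 0..=u8::MAX {
        roundtrip(&AndbI { imm8 })?;
    }
    Ok(())
}
```

A real fuzz target would derive the instruction from arbitrary input bytes and
delegate the decoding to an independent disassembler, so that a bug in the
encoder cannot be mirrored in the checker.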

[bytecodealliance#41]: bytecodealliance/rfcs#41
abrown committed Jan 24, 2025
1 parent 887e5c9 commit 912eca6
Showing 50 changed files with 3,932 additions and 28 deletions.
26 changes: 26 additions & 0 deletions Cargo.lock

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -139,6 +139,8 @@ opt-level = 0
resolver = '2'
members = [
  "cranelift",
  "cranelift/assembler",
  "cranelift/assembler/meta",
  "cranelift/isle/fuzz",
  "cranelift/isle/islec",
  "cranelift/isle/veri/veri_engine",
1 change: 1 addition & 0 deletions cranelift/assembler/.gitignore
@@ -0,0 +1 @@
generated.rs
22 changes: 22 additions & 0 deletions cranelift/assembler/Cargo.toml
@@ -0,0 +1,22 @@
[package]
name = "cranelift-assembler"
description = "A Cranelift-specific x64 assembler"
version = "0.1.0"
edition = "2021"

[dependencies]
arbitrary = { version = "1.3.2", features = ["derive"] }
capstone = "0.12.0"

[dev-dependencies]
arbtest = "0.3.1"

[build-dependencies]
cranelift-assembler-meta = { path = "meta" }

[lints.clippy]
all = "deny"
pedantic = "warn"
module_name_repetitions = { level = "allow", priority = 1 }
similar_names = { level = "allow", priority = 1 }
wildcard_imports = { level = "allow", priority = 1 }
36 changes: 36 additions & 0 deletions cranelift/assembler/README.md
@@ -0,0 +1,36 @@
# `cranelift-assembler`

A Cranelift-specific x64 assembler. Unlike the existing `cranelift-codegen`
assembler, this assembler uses instructions, not instruction classes, as the
core abstraction.

### Use

Like `cranelift-codegen`, using this assembler starts with `enum Inst`. For
convenience, a `main.rs` script prints the path to this generated code:

```console
$ cat $(cargo run)
#[derive(arbitrary::Arbitrary, Debug)]
pub enum Inst {
    andb_i(andb_i),
    andw_i(andw_i),
    andl_i(andl_i),
    ...
```

### Test

In order to check that this assembler emits correct machine code, we fuzz it
against a known-good disassembler. We can run a quick, one-second check:

```console
$ cargo test -- --nocapture
```

Or we can run the fuzzer indefinitely:

```console
$ cargo +nightly fuzz run -s none roundtrip -j16
```

24 changes: 24 additions & 0 deletions cranelift/assembler/build.rs
@@ -0,0 +1,24 @@
use cranelift_assembler_meta as meta;
use std::env;
use std::path::Path;

fn main() {
    println!("cargo:rerun-if-changed=build.rs");

    let out_dir = env::var("OUT_DIR").expect("The OUT_DIR environment variable must be set");
    let out_dir = Path::new(&out_dir);
    let built_files = [
        meta::generate_rust_assembler(out_dir.join("assembler.rs")),
        meta::generate_isle_macro(out_dir.join("assembler-isle-macro.rs")),
        meta::generate_isle_definitions(out_dir.join("assembler-definitions.isle")),
    ];

    println!(
        "cargo:rustc-env=ASSEMBLER_BUILT_FILES={}",
        built_files
            .iter()
            .map(|p| p.to_string_lossy().to_string())
            .collect::<Vec<_>>()
            .join(":")
    );
}
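
The build script above packs several paths into one environment variable by
joining them with `:`, so any consumer has to split the value again. As a
hedged sketch of that join-and-split round trip, the standard library's
`std::env::join_paths` and `std::env::split_paths` do the same job with the
platform's separator (`:` on Unix, `;` on Windows). This is an alternative
shown for illustration, not what the commit does, and the helper names
`join_built_files` and `split_built_files` are hypothetical.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::PathBuf;

/// Join paths into one value using the platform's path-list separator.
/// Fails if a path itself contains the separator.
pub fn join_built_files(paths: &[PathBuf]) -> Result<OsString, env::JoinPathsError> {
    env::join_paths(paths.iter())
}

/// Split a joined value back into the individual paths.
pub fn split_built_files(joined: &OsStr) -> Vec<PathBuf> {
    env::split_paths(joined).collect()
}
```

A build script would still need to turn the joined `OsString` into UTF-8 text
before printing it in a `cargo:rustc-env=...` line.
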
4 changes: 4 additions & 0 deletions cranelift/assembler/fuzz/.gitignore
@@ -0,0 +1,4 @@
target
corpus
artifacts
coverage
148 changes: 148 additions & 0 deletions cranelift/assembler/fuzz/Cargo.lock

23 changes: 23 additions & 0 deletions cranelift/assembler/fuzz/Cargo.toml
@@ -0,0 +1,23 @@
[package]
name = "cranelift-assembler-fuzz"
version = "0.0.0"
publish = false
edition = "2021"

[package.metadata]
cargo-fuzz = true

[dependencies]
libfuzzer-sys = "0.4"

[dependencies.cranelift-assembler]
path = ".."

[[bin]]
name = "roundtrip"
path = "fuzz_targets/roundtrip.rs"
test = false
doc = false
bench = false

[workspace]
8 changes: 8 additions & 0 deletions cranelift/assembler/fuzz/fuzz_targets/roundtrip.rs
@@ -0,0 +1,8 @@
#![no_main]

use cranelift_assembler::{fuzz, Inst};
use libfuzzer_sys::fuzz_target;

fuzz_target!(|inst: Inst| {
    fuzz::roundtrip(&inst);
});
5 changes: 5 additions & 0 deletions cranelift/assembler/meta/.rustfmt.toml
@@ -0,0 +1,5 @@
# This extra configuration allows defining extra-long lines in
# `src/instructions`.
fn_call_width = 100
max_width = 110
struct_lit_width = 50
14 changes: 14 additions & 0 deletions cranelift/assembler/meta/Cargo.toml
@@ -0,0 +1,14 @@
[package]
name = "cranelift-assembler-meta"
description = "Generate a Cranelift-specific assembler for x64 instructions"
version = "0.1.0"
edition = "2021"

[dependencies]

[lints.clippy]
all = "deny"
pedantic = "warn"
enum_glob_use = { level = "allow", priority = 1 }
just_underscores_and_digits = { level = "allow", priority = 1 }
wildcard_imports = { level = "allow", priority = 1 }