mtl-rs

A playground for experimenting with Apple silicon GPUs and the metal-rs bindings.

Available compute kernels

dotprod (ushort vs. half implementations)
matmul (naive vs. tiled implementations)
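
For readers unfamiliar with the naive/tiled distinction, here is a minimal CPU-side sketch in plain Rust. It is not the repository's Metal kernel: the function names, the square row-major layout, and the `tile` parameter are all assumptions made for illustration. It only shows that tiling regroups the same multiply-adds into blocks, so both variants should produce the same result.

```rust
/// Naive row-major matmul for square n x n matrices: C = A * B.
/// (Hypothetical helper, not taken from the repository.)
fn matmul_naive(a: &[f32], b: &[f32], n: usize) -> Vec<f32> {
    let mut c = vec![0.0; n * n];
    for i in 0..n {
        for j in 0..n {
            let mut acc = 0.0;
            for k in 0..n {
                acc += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    c
}

/// Tiled variant: the same arithmetic, visited in `tile` x `tile` blocks
/// so that each block of A and B is reused while it is still "hot".
/// (Hypothetical helper, not taken from the repository.)
fn matmul_tiled(a: &[f32], b: &[f32], n: usize, tile: usize) -> Vec<f32> {
    let tile = tile.max(1); // guard against a zero step
    let mut c = vec![0.0; n * n];
    for i0 in (0..n).step_by(tile) {
        for j0 in (0..n).step_by(tile) {
            for k0 in (0..n).step_by(tile) {
                for i in i0..(i0 + tile).min(n) {
                    for j in j0..(j0 + tile).min(n) {
                        // Continue the partial sum left by earlier k-blocks.
                        let mut acc = c[i * n + j];
                        for k in k0..(k0 + tile).min(n) {
                            acc += a[i * n + k] * b[k * n + j];
                        }
                        c[i * n + j] = acc;
                    }
                }
            }
        }
    }
    c
}

fn main() {
    let n = 4;
    let a: Vec<f32> = (0..n * n).map(|x| x as f32).collect();
    let b: Vec<f32> = (0..n * n).map(|x| (x % 3) as f32).collect();
    assert_eq!(matmul_naive(&a, &b, n), matmul_tiled(&a, &b, n, 2));
    println!("naive and tiled results match");
}
```

On a GPU the same idea is usually expressed per threadgroup rather than with nested loops; that part is not shown here.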

Usage

Launching a Metal compute kernel takes two build steps (shader source to .air, then .air to .metallib), followed by running the Rust binary.

# Every kernel is self-contained, i.e. it is its own crate. `cd` into a kernel's directory and run
# the following, which compiles the shader to an intermediate representation (.air) using the metal utility
xcrun -sdk macosx metal -c ./src/shaders/dotprod.metal -o ./src/shaders/dotprod.air

# next, package the .air file into a .metallib file - the library that the Metal API loads at runtime
# (the .air file is the intermediate representation; the .metallib bundles it)
xcrun -sdk macosx metallib ./src/shaders/dotprod.air -o ./src/shaders/dotprod.metallib

# lastly, run the rust binary to launch the kernel and examine its output.
cargo run
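
If you would rather not type the two `xcrun` steps by hand, they could be driven from Rust. The following is a minimal sketch, not part of the repository: `shader_build_steps` and `owned` are hypothetical helpers, and the directory and shader name are simply the ones used in the dotprod example above. It only uses `std::process::Command` and actually invokes `xcrun` on macOS only.

```rust
use std::process::Command;

/// Convert borrowed arguments into owned strings.
fn owned(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Build the `xcrun` argument lists for the two shader build steps.
/// `dir` and `stem` are hypothetical parameters (shader directory and
/// file name without extension).
fn shader_build_steps(dir: &str, stem: &str) -> Vec<Vec<String>> {
    let src = format!("{dir}/{stem}.metal");
    let air = format!("{dir}/{stem}.air");
    let lib = format!("{dir}/{stem}.metallib");
    vec![
        // Step 1: shader source -> .air
        owned(&["-sdk", "macosx", "metal", "-c", src.as_str(), "-o", air.as_str()]),
        // Step 2: .air -> .metallib
        owned(&["-sdk", "macosx", "metallib", air.as_str(), "-o", lib.as_str()]),
    ]
}

fn main() {
    for args in shader_build_steps("./src/shaders", "dotprod") {
        println!("xcrun {}", args.join(" "));
        // xcrun ships with Apple's developer tools, so only run it on macOS.
        if cfg!(target_os = "macos") {
            let status = Command::new("xcrun")
                .args(&args)
                .status()
                .expect("failed to launch xcrun");
            assert!(status.success(), "xcrun step failed: {status}");
        }
    }
}
```

The same function could be called from a `build.rs` so that `cargo run` rebuilds the shader first, though that wiring is left out here.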

Example output for dotprod

cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/mtl`

____*** vector dotprod of 1,000,000 elements of type `ushort` ***___

Dotprod on CPU
      Done in 15.48ms
Dotprod on CPU - parallel
      Done in 3.38ms
Dotprod on GPU
      Actual time spent performing dotprod on GPU
          Done in 1.00ms
      Total time taken - 24.50ms (includes kernel launch and result retreival)

*** verify that all 3 ops produce the same result ***
cpu:     [391, 0, 35, 116, 810], [48, 155, 126, 48, 0]
cpu_par: [391, 0, 35, 116, 810], [48, 155, 126, 48, 0]
gpu:     [391, 0, 35, 116, 810], [48, 155, 126, 48, 0]

____*** vector dotprod of 1,000,000 elements of type `f16` ***___

Dotprod on CPU
      Done in 18.75ms
Dotprod on CPU - parallel
      Done in 3.37ms
Dotprod on GPU
      Actual time spent performing dotprod on GPU
          Done in 869.25µs
      Total time taken - 2.25ms (includes kernel launch and result retreival) # interesting: Metal appears to reuse most objects and resources created for the previous dispatch (device, queue, etc.), so this second launch is much cheaper than the first (24.50ms).

*** verify that all 3 ops produce the same result ***
cpu:     [0.72802734, 0.0037574768, 0.0947876, 0.16601563, 0.26367188], [0.31933594, 0.2919922, 0.12042236, 0.20458984, 0.30151367]
cpu_par: [0.72802734, 0.0037574768, 0.0947876, 0.16601563, 0.26367188], [0.31933594, 0.2919922, 0.12042236, 0.20458984, 0.30151367]
gpu:     [0.72802734, 0.0037574768, 0.0947876, 0.16601563, 0.26367188], [0.31933594, 0.2919922, 0.12042236, 0.20458984, 0.30151367]
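
The CPU side of this comparison (sequential pass, parallel pass, then an equality check) can be sketched with the standard library alone. This is an illustration, not the repository's code: it assumes the operation is an elementwise product, which is what the printed vectors suggest, and `mul_seq`, `mul_par`, the input data, and the thread count are all made up for the example. It uses `u16` because `f16` is not available in stable `std`.

```rust
use std::thread;
use std::time::Instant;

/// Sequential elementwise product (wrapping on overflow).
fn mul_seq(a: &[u16], b: &[u16]) -> Vec<u16> {
    a.iter().zip(b).map(|(x, y)| x.wrapping_mul(*y)).collect()
}

/// The same computation split into chunks, one scoped thread per chunk.
fn mul_par(a: &[u16], b: &[u16], n_threads: usize) -> Vec<u16> {
    let n_threads = n_threads.max(1);
    let mut out = vec![0u16; a.len()];
    // Ceiling division; at least 1 so that `chunks` never gets a zero size.
    let chunk = ((a.len() + n_threads - 1) / n_threads).max(1);
    thread::scope(|s| {
        let pieces = out.chunks_mut(chunk).zip(a.chunks(chunk)).zip(b.chunks(chunk));
        for ((o, x), y) in pieces {
            s.spawn(move || {
                for ((o, x), y) in o.iter_mut().zip(x).zip(y) {
                    *o = x.wrapping_mul(*y);
                }
            });
        }
    });
    out
}

fn main() {
    let n = 1_000_000;
    // Arbitrary deterministic inputs, chosen only for the demonstration.
    let a: Vec<u16> = (0..n).map(|i| (i % 251) as u16).collect();
    let b: Vec<u16> = (0..n).map(|i| (i % 127) as u16).collect();

    let t = Instant::now();
    let seq = mul_seq(&a, &b);
    println!("sequential: {:?}", t.elapsed());

    let t = Instant::now();
    let par = mul_par(&a, &b, 4);
    println!("parallel:   {:?}", t.elapsed());

    // Verify both paths agree, then print the first and last five elements.
    assert_eq!(seq, par);
    println!("head: {:?}, tail: {:?}", &seq[..5], &seq[n - 5..]);
}
```

Timings from a sketch like this will differ from the figures above, since they depend on the machine, the build profile, and the thread count.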
