Add gpu-kernel calling convention #135047

Flakebi · 2025-01-02T22:17:40Z

The amdgpu-kernel calling convention was reverted in commit f6b21e9 (#120495 and rust-lang/rust-analyzer#16463) due to inactivity in the amdgpu target.

Introduce a gpu-kernel calling convention that translates to ptx_kernel or amdgpu_kernel, depending on the target that rustc is compiling for.

Tracking issue: #135467
amdgpu target tracking issue: #135024
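
As a rough illustration of the translation described above, the target-dependent choice can be modelled on the host as a plain lookup. This is a minimal sketch and not the compiler's implementation: the `Target` enum and the `gpu_kernel_llvm_cc` function are hypothetical names, and treating every other target as unsupported is an assumption of the sketch.

```rust
/// Targets considered by this sketch (hypothetical enum, not a rustc type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    Nvptx64,
    Amdgcn,
    /// Any non-GPU target, for example x86_64.
    Other,
}

/// Returns the LLVM calling convention name that a gpu-kernel function
/// would be lowered to on `target`, or `None` when the target has no
/// kernel calling convention (an assumption of this sketch).
pub fn gpu_kernel_llvm_cc(target: Target) -> Option<&'static str> {
    match target {
        Target::Nvptx64 => Some("ptx_kernel"),
        Target::Amdgcn => Some("amdgpu_kernel"),
        Target::Other => None,
    }
}
```

The same source-level ABI string therefore selects a different LLVM calling convention per target, much like extern "C" does.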

rustbot · 2025-01-02T22:17:48Z

r? @SparrowLii

rustbot has assigned @SparrowLii.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-01-02T22:17:50Z

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

These commits modify compiler targets.
(See the Target Tier Policy.)

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

This PR changes Stable MIR

cc @oli-obk, @celinval, @ouz-a

workingjubilee · 2025-01-03T02:15:08Z

What is the purpose of this calling convention, @Flakebi? What does it represent?

bjorn3 · 2025-01-03T07:18:27Z

This seems to be missing the actual implementation of the call conv adjustments. As is, it is using the extern "C" adjustments for amdgpu instead: https://github.com/rust-lang/rust/blob/master/compiler/rustc_target/src/callconv/amdgpu.rs

Flakebi · 2025-01-03T14:16:29Z

> What is the purpose of this calling convention? What does it represent?

It maps to the amdgpu_kernel calling convention in LLVM.

To run compute kernels on AMD GPUs, one needs to compile a shared ELF object (like a .so). It can contain one or more “kernels” that are functions with the amdgpu_kernel calling convention.
For each amdgpu_kernel function, the amdgpu LLVM backend puts metadata into the ELF and creates a <function name>.kd object in the ELF (I guess kd stands for Kernel Descriptor)⁰.

For running, one can use ROCR-Runtime through the HSA interface or HIP (just learned this yesterday). The first step is to load the ELF (as executable for HSA or module for HIP). Then, query a kernel on the loaded object by symbol name (<function name>).
Last but not least, the kernel can be launched on the GPU with a certain number of executions. When launching, the arguments passed to the kernel can be specified.

So, on a high level, amdgpu-kernel functions are entry-points, like fn main(), but there can be multiple, and the CPU-side can decide which kernel to launch.

⁰ Rust always uses a linker script when linking, and that hides the .kd symbol from the linked ELF. I plan to fix that in a later PR by adding a .kd symbol to the linker script for all amdgpu-kernel functions.
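
The load, query and launch steps described above can be pictured with a small host-only model. This is a sketch under stated assumptions: `Module`, `Kernel`, `register` and `launch` are hypothetical names and do not correspond to the HSA or HIP APIs, and the work items run sequentially here instead of in parallel on a GPU.

```rust
use std::collections::HashMap;

/// A kernel in this model: called once per work item with its id and a buffer.
pub type Kernel = fn(usize, &mut [u32]);

/// Stand-in for a loaded code object that maps symbol names to kernels.
pub struct Module {
    kernels: HashMap<&'static str, Kernel>,
}

impl Module {
    pub fn new() -> Self {
        Module { kernels: HashMap::new() }
    }

    /// Exposes `kernel` under `symbol`, like an unmangled kernel symbol in the ELF.
    pub fn register(&mut self, symbol: &'static str, kernel: Kernel) {
        self.kernels.insert(symbol, kernel);
    }

    /// Looks the kernel up by symbol name and runs it for ids `0..size`.
    pub fn launch(&self, symbol: &str, size: usize, buf: &mut [u32]) -> Result<(), String> {
        let kernel = self
            .kernels
            .get(symbol)
            .copied()
            .ok_or_else(|| format!("no kernel named `{symbol}`"))?;
        for id in 0..size {
            kernel(id, buf);
        }
        Ok(())
    }
}

/// Two example entry points; the host decides which one to launch.
pub fn fill_ids(id: usize, buf: &mut [u32]) {
    buf[id] = id as u32;
}

pub fn double(id: usize, buf: &mut [u32]) {
    buf[id] *= 2;
}
```

Registering both functions and launching "fill_ids" and then "double" with a size of 4 on a four-element buffer leaves it holding 0, 2, 4, 6. Launching an unknown symbol returns an error instead of running anything.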

workingjubilee · 2025-01-03T21:18:07Z

@Flakebi Then, er, what is the calling convention?

By saying "entry point" and comparing it to fn main(), it sounds a little like you are saying these are only meant to be called externally, but in Rust, recursively calling fn main is perfectly legal because the actual lang_start entrypoint precedes it: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=266f24506a382666ade681782cb1ec1c

workingjubilee · 2025-01-03T21:21:49Z

To be clear: I do not believe "there is something LLVM calls a 'calling convention', therefore we must add it as an extern "{abi}"" is sound thinking. There must be a reason beyond that.

I am not trying to suggest there is no other reason, I just am trying to determine "what does exposing this actually accomplish for programmers, and why is this the correct approach, instead of e.g. an attribute?"

Flakebi · 2025-01-03T22:13:09Z

> By saying "entry point" and comparing it to fn main(), it sounds a little like you are saying these are only meant to be called externally, but in Rust, recursively calling fn main is perfectly legal because the actual lang_start entrypoint precedes it

Yes, it is meant to be called externally.
It seems that the LLVM backend forbids calls to amdgpu_kernel functions:

define amdgpu_kernel void @kernel() {
  call amdgpu_kernel void @kernel()
  ret void
}

; LLVM ERROR: Unsupported calling convention for call
; Stack dump…

> I am not trying to suggest there is no other reason, I just am trying to determine "what does exposing this actually accomplish for programmers, and why is this the correct approach, instead of e.g. an attribute?"

I don’t have much of a reason to expose it as a calling convention, except that

- the programmer needs to mark kernels in some way, and
- LLVM uses a calling convention, so it’s easy to expose in Rust (and the calling convention existed before).

An attribute like #[kernel] should work fine as well.
(In LLVM it’s a calling convention because it receives arguments in registers that are initialized by the hardware, which are different from the registers used for “normal” function calls. So, kinda makes sense for LLVM, but it may be different for Rust, as it’s more user-facing.)

I think a wrapping entry point like lang_start wouldn’t work well because the kernel symbol needs to be exposed so that it can be found on the CPU later (actually, I think you always want #[no_mangle] as well).

An example copy kernel looks like this (using the calling convention):

// Compile with
// RUSTFLAGS='-Ctarget-cpu=gfx900' cargo +stage1 build --target amdgcn-amd-amdhsa -Zbuild-std=core

#![allow(internal_features)]
#![feature(link_llvm_intrinsics, abi_amdgpu_kernel)]
#![no_std]

unsafe extern "C" {
    #[link_name = "llvm.amdgcn.workitem.id.x"]
    safe fn workitem_x() -> u32;
    #[link_name = "llvm.amdgcn.s.sethalt"]
    safe fn halt(i: u32) -> !;
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    halt(1);
}

#[no_mangle]
pub extern "amdgpu-kernel" fn kernel(input: *const u8, output: *mut u8) {
    let id = workitem_x(); // This is the execution id from 0 to <size> (size is given on the CPU when launching the kernel)
    unsafe {
        *output.add(id as usize) = *input.add(id as usize);
    }
}
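
To make the per-work-item behaviour of the copy kernel concrete, here is a CPU-only emulation. It is a sketch, not device code: `copy_kernel` and `launch_copy` are hypothetical names, the `id` parameter stands in for `workitem_x()`, and the loop stands in for the parallel launch. Unlike the raw-pointer kernel above, it checks that the launch size fits both buffers, which the host has to guarantee in the real case.

```rust
/// CPU stand-in for the copy kernel: `id` plays the role of `workitem_x()`.
pub fn copy_kernel(id: usize, input: &[u8], output: &mut [u8]) {
    output[id] = input[id];
}

/// Emulates launching `size` work items; a GPU would run them in parallel.
pub fn launch_copy(size: usize, input: &[u8], output: &mut [u8]) {
    // The device kernel uses unchecked pointer arithmetic, so the host
    // must never launch more work items than the buffers can hold.
    assert!(
        size <= input.len() && size <= output.len(),
        "launch size exceeds a buffer"
    );
    for id in 0..size {
        copy_kernel(id, input, output);
    }
}
```

Launching with a size of 3 on four-byte buffers copies only the first three bytes and leaves the last output byte untouched.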

workingjubilee · 2025-01-03T22:43:18Z

cool!

are ptx-kernel functions also uncallable from the device, only the host?

Flakebi · 2025-01-03T23:40:23Z

Seems like ptx-kernel does not have that restriction and can call itself

The following compiles:

#[no_mangle]
pub unsafe extern "ptx-kernel" fn global_function(i: i32) {
    if i == 0 {
        global_function(i - 1);
    }
}

(finally managed to compile nvptx with --target nvptx64-nvidia-cuda -Ctarget-cpu=sm_86 -Zunstable-options -Clinker-flavor=llbc)

workingjubilee · 2025-01-04T00:31:17Z

weird!

workingjubilee · 2025-01-04T16:49:55Z

Looking at https://reviews.llvm.org/D140226#4004026 I'm not so sure that these aren't meant to behave identically and it's just lack of interest in implementing the followup steps that made them distinct.

workingjubilee · 2025-01-04T17:41:49Z

Opened llvm/llvm-project#121655

Flakebi · 2025-01-06T11:27:51Z

Interesting find!
I guess then it makes sense to implement a gpu-kernel calling convention that translates to ptx_kernel or amdgpu_kernel, depending on the target. ptx-kernel could then be deprecated.

cc @RDambrosio016, @kjetilkjeka

bjorn3 · 2025-01-06T11:31:56Z

Do nvptx and amdgpu allow the same kinds of arguments to the kernel function with the same restrictions?

kjetilkjeka · 2025-01-06T18:23:27Z

On the issue of whether ptx-kernels should be directly callable, I'm under the impression that they should not. I don't have edit rights, but prohibiting direct calls to kernels is part of my suggested update to the tracking issue for ptx-kernel.

Even if LLVM emits an error for this, we should have a dedicated error message so that users do not run into LLVM errors and ICEs. The error messages could at least be common between the different calling conventions.
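
A frontend check of the kind suggested here could look roughly like the following. This is a minimal sketch over a toy call graph, not rustc's implementation: `Abi`, `Function` and `check_kernel_calls` are hypothetical names, and the diagnostic wording is made up for illustration.

```rust
/// ABIs distinguished by this sketch (hypothetical, not rustc's ABI enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Abi {
    Rust,
    GpuKernel,
}

/// A function in a toy call graph: its name, ABI and the names it calls.
pub struct Function {
    pub name: &'static str,
    pub abi: Abi,
    pub calls: Vec<&'static str>,
}

/// Reports every direct call whose callee is a kernel, so that the user
/// sees a dedicated diagnostic instead of a backend error.
pub fn check_kernel_calls(functions: &[Function]) -> Vec<String> {
    let mut errors = Vec::new();
    for caller in functions {
        for callee_name in &caller.calls {
            let callee_is_kernel = functions
                .iter()
                .any(|f| f.name == *callee_name && f.abi == Abi::GpuKernel);
            if callee_is_kernel {
                errors.push(format!(
                    "error: `{}` calls kernel `{}` directly; kernels can only be launched from the host",
                    caller.name, callee_name
                ));
            }
        }
    }
    errors
}
```

Because the check only looks at the callee's ABI, the same message would apply to every kernel calling convention.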

> I guess then it makes sense to implement a gpu-kernel calling convention that translates to ptx_kernel or amdgpu_kernel, depending on the target. ptx-kernel could then be deprecated.

I'm not intimately familiar with the restrictions of amdgpu kernels so I don't really know if it can be modelled as "the same thing". I like the idea of the "unification" if this is without drawbacks.

I also think that there can be a case for having a "Rust kernel" calling convention where it's possible to pass enums, tuples, etc. from Rust host code to Rust device code. Perhaps that calling convention could also be common between the different formats for describing GPU code (ptx, amdgcn, spir-v)? But this is probably getting ahead of things.

If this is being looked into, then SPIR-V kernels should at least be looked into in the same context as well.

workingjubilee · 2025-01-06T19:30:59Z

@kjetilkjeka applied your recommended edits to the issue.

workingjubilee · 2025-01-06T19:58:01Z

It's fine if the ABI is implemented slightly differently, IMO, if it's conceptually the same.
I would like us to attempt unification of these.

@Flakebi I'm fine with adding this as extern "gpu-kernel" or any other name you feel appropriate (extern "device-kernel"? extern "host"? whatever).

@kjetilkjeka SPIRV's OpEntryPoint is indeed another form of this.

workingjubilee · 2025-01-06T23:31:51Z

I'm not as certain if SPIRV's OpEntryPoint has the exact same "calls from 'within' the device make no sense" semantics, but it is more abstract on purpose, and it covers many different possible actual usages of the code... a given SPIRV module can have an entry point for every single step in the shader pipeline.

And they can all have the same name.

So... not really something we can easily map, I think, nor should try.

Flakebi · 2025-01-07T11:35:22Z

Good, thanks for shedding light on nvptx :)

I agree that amdgpu and nvptx kernels conceptually mean the same thing. We don’t want to guarantee that they behave exactly the same; some properties of a gpu-kernel calling convention will likely depend on the target.
But I think that is fine; it’s the same for the C calling convention as well.

For SPIR-V, I’m not sure how well it can map to the same calling convention.

The part that’s the same is that it can (only?) be called from the host and it represents the entry-point on the GPU.
The part that’s different is how arguments can be passed. IIRC in the emitted SPIR-V, there are no arguments, everything of value is a global variable. Some things may be represented as arguments in the high-level language (e.g. HLSL uses arguments for some things), but these arguments are still used differently compared to amdgpu/nvptx kernels.

(Side note: The amdgpu LLVM backend uses different calling conventions for the different graphics shader types like pixel/vertex/compute shaders.)
As SPIR-V is currently not in rustc, I think we can leave this for later.

I think extern "gpu-kernel" or extern "device-kernel" is good (with a slight preference for extern "gpu-kernel", who knows if non-GPUs will ever want a similar calling convention).

…iaskrgr Rollup of 7 pull requests

Successful merges:

 - rust-lang#134940 (Make sure to scrape region constraints from deeply normalizing type outlives assumptions in borrowck)
 - rust-lang#135047 (Add gpu-kernel calling convention)
 - rust-lang#135228 (Improve `DispatchFromDyn` and `CoerceUnsized` impl validation)
 - rust-lang#135264 (Consider more erroneous layouts as `LayoutError::ReferencesError` to suppress spurious errors)
 - rust-lang#135302 (for purely return-type based searches, deprioritize clone-like functions)
 - rust-lang#135380 (Make sure we can produce `ConstArgHasWrongType` errors for valtree consts)
 - rust-lang#135425 (Do not consider traits that have unsatisfied const conditions to be conditionally const)

r? `@ghost`
`@rustbot` modify labels: rollup

workingjubilee · 2025-01-15T03:42:49Z

tests/codegen/gpu-kernel-abi.rs

+//@ revisions: amdgpu nvptx
+//@ [amdgpu] compile-flags: --crate-type=rlib --target=amdgcn-amd-amdhsa -Ctarget-cpu=gfx900
+//@ [amdgpu] needs-llvm-components: amdgpu
+// amdgpu target is not yet merged
+//@ [amdgpu] should-fail


this doesn't really work:

thread 'main' panicked at src/tools/compiletest/src/header.rs:1559:17:
missing LLVM component amdgpu, and COMPILETEST_REQUIRE_ALL_LLVM_COMPONENTS is set
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Build completed unsuccessfully in 0:37:29
  local time: Wed Jan 15 03:17:37 UTC 2025
  network time: Wed, 15 Jan 2025 03:17:37 GMT

Oh, Rust’s testing is too advanced for this to go through 😃

workingjubilee · 2025-01-15T03:42:59Z

@bors r-

workingjubilee · 2025-01-15T03:54:22Z

@bors rollup=iffy

workingjubilee · 2025-01-15T10:43:00Z

( it might be simpler to just land the amdgpu target first )

Flakebi · 2025-01-15T23:37:01Z

I removed the amdgpu part of the test and squashed commits. Diff

workingjubilee · 2025-01-17T04:17:50Z

@bors r+

bors · 2025-01-17T04:17:53Z

📌 Commit e7e5202 has been approved by workingjubilee

It is now in the queue for this repository.

bors · 2025-01-17T04:36:13Z

⌛ Testing commit e7e5202 with merge 0c2c096...

bors · 2025-01-17T07:19:38Z

☀️ Test successful - checks-actions
Approved by: workingjubilee
Pushing 0c2c096 to master...

rust-log-analyzer · 2025-01-17T07:48:46Z

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

Prepare workflow directory
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v4' (SHA:11bd71901bbe5b1630ceea73d27597364c9af683)
Complete job name: DockerHub mirror
with:
  persist-credentials: false
  repository: rust-lang/rust
  token: ***
---
http.https://github.com/.extraheader
[command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader
[command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :"
##[endgroup]
##[group]Run echo "***" | docker login ghcr.io -u rust-lang --password-stdin
echo "***" | docker login ghcr.io -u rust-lang --password-stdin
shell: /usr/bin/bash -e {0}
WARNING! Your password will be stored unencrypted in /home/runner/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store


Login Succeeded
##[group]Run curl -sL "https://github.com/google/go-containerregistry/releases/download/${VERSION}/go-containerregistry_${OS}_${ARCH}.tar.gz" | tar -xzf -
curl -sL "https://github.com/google/go-containerregistry/releases/download/${VERSION}/go-containerregistry_${OS}_${ARCH}.tar.gz" | tar -xzf -
shell: /usr/bin/bash -e {0}
---
  VERSION: v0.20.2
  OS: Linux
  ARCH: x86_64
##[endgroup]
##[group]Run # DockerHub image we want to mirror
# DockerHub image we want to mirror
image="ubuntu:22.04"

# Mirror image from DockerHub to ghcr.io
./crane copy \
  docker.io/${image} \
  ghcr.io/rust-lang/${image}
shell: /usr/bin/bash -e {0}
##[endgroup]
2025/01/17 07:48:18 Copying from docker.io/ubuntu:22.04 to ghcr.io/rust-lang/ubuntu:22.04
2025/01/17 07:48:21 pushed blob: sha256:6a0680948602c0463583859ad1c1af9ef783d231f95babcc7a45f7fc35ab6519
2025/01/17 07:48:21 pushed blob: sha256:91cfa7fda99ffc0881cf734cc88b92ba9fe8f587165ff167a097d265bc0bcb1c
2025/01/17 07:48:21 pushed blob: sha256:961b5577e6c5b1fdd517169f74499d1838c10e4f3718cecce954f693e47f36a4
2025/01/17 07:48:22 pushed blob: sha256:97271d29cb7956f0908cfb1449610a2cd9cb46b004ac8af25f0255663eb364ba
2025/01/17 07:48:22 pushed blob: sha256:3944851c58f2c886b94c4999cef08ab10b678b7195615ec2830f6c4fc42777ee
2025/01/17 07:48:22 pushed blob: sha256:78ba8700924815f6907dd8b2e50e43c8366ca6a0dd666d3f1ed67a2d399286a9
2025/01/17 07:48:22 ghcr.io/rust-lang/ubuntu@sha256:fbbbc3b83f7fb5d64c8ad86a44765c8eb4ceced4a43b00c58b2f625d2ed61676: digest: sha256:fbbbc3b83f7fb5d64c8ad86a44765c8eb4ceced4a43b00c58b2f625d2ed61676 size: 562
2025/01/17 07:48:22 ghcr.io/rust-lang/ubuntu@sha256:4e626f9c1a3cda3c405b8907ac8117f502446f2004a8688fcade23acbb31195b: digest: sha256:4e626f9c1a3cda3c405b8907ac8117f502446f2004a8688fcade23acbb31195b size: 562
2025/01/17 07:48:24 pushed blob: sha256:b477f0f37762a62f631ac4fbaed78c3b23c47db7ac1eaefe95bda0e85ce052a0
2025/01/17 07:48:24 pushed blob: sha256:6414378b647780fee8fd903ddb9541d134a1947ce092d08bdeb23a54cb3684ac
2025/01/17 07:48:24 pushed blob: sha256:9332f337e9df816ba995aae285128e8b121ec47bdb101d5de2c9544847d22758
2025/01/17 07:48:24 ghcr.io/rust-lang/ubuntu@sha256:17d75c44d078810c2264962f31c2c3c8a5889ec05edd4e2a00e273a5e1219ebf: digest: sha256:17d75c44d078810c2264962f31c2c3c8a5889ec05edd4e2a00e273a5e1219ebf size: 424
2025/01/17 07:48:24 retrying without mount: POST https://ghcr.io/v2/rust-lang/ubuntu/blobs/uploads/?from=library%2Fubuntu&mount=sha256%3Aa186900671ab62e1dea364788f4e84c156e1825939914cfb5a6770be2b58b4da&origin=REDACTED: DENIED: permission_denied: write_package
2025/01/17 07:48:24 ghcr.io/rust-lang/ubuntu@sha256:3d1556a8a18cf5307b121e0a98e93f1ddf1f3f8e092f1fddfd941254785b95d7: digest: sha256:3d1556a8a18cf5307b121e0a98e93f1ddf1f3f8e092f1fddfd941254785b95d7 size: 424
2025/01/17 07:48:25 pushed blob: sha256:981912c48e9a89e903c89b228be977e23eeba83d42e2c8e0593a781a2b251cba
2025/01/17 07:48:25 pushed blob: sha256:0d19c67405fda483fbcc49395ea5ad434d3e8eb869d31c1299343dbf3e16f137
2025/01/17 07:48:25 pushed blob: sha256:6ff7ea5b4e8ca182006855f3fca8147b1e0089f2e2d583daa5427ab9e105c226
2025/01/17 07:48:26 pushed blob: sha256:476310fe4fa16f14973ecd8988b22bda0e695dbd6a6d2ac4650987bd3898e779
2025/01/17 07:48:26 ghcr.io/rust-lang/ubuntu@sha256:e5759e2b46d96e97d8b55718bfa25ff0ab94afc675bcbdcc75ece806e75eb0ef: digest: sha256:e5759e2b46d96e97d8b55718bfa25ff0ab94afc675bcbdcc75ece806e75eb0ef size: 562
2025/01/17 07:48:26 pushed blob: sha256:21871807ede976118f4095fcc9242c96870d500b449901f1125412100b29a0a0
2025/01/17 07:48:27 ghcr.io/rust-lang/ubuntu@sha256:9727e060ae8adf67b5180ef47749720b6e88503e50167da84eab32faf47cc12f: digest: sha256:9727e060ae8adf67b5180ef47749720b6e88503e50167da84eab32faf47cc12f size: 562
2025/01/17 07:48:27 pushed blob: sha256:a186900671ab62e1dea364788f4e84c156e1825939914cfb5a6770be2b58b4da
2025/01/17 07:48:27 ghcr.io/rust-lang/ubuntu@sha256:7c75ab2b0567edbb9d4834a2c51e462ebd709740d1f2c40bcd23c56e974fe2a8: digest: sha256:7c75ab2b0567edbb9d4834a2c51e462ebd709740d1f2c40bcd23c56e974fe2a8 size: 424
2025/01/17 07:48:27 pushed blob: sha256:bd389594e541fc722f244791a495e1a62a526cb95daeea3d2304d9be4e2f0e2a
2025/01/17 07:48:28 pushed blob: sha256:4f832cd75adcc173feb21f93670766fa9caac4e5c652fee73f8ec5b72ce39ac0
2025/01/17 07:48:28 pushed blob: sha256:e091d008a3a95d995fcb7c609f2ba781d4de1ebaa917c98ab01b95bd0c5b551b
2025/01/17 07:48:29 ghcr.io/rust-lang/ubuntu@sha256:15abdf6cd20e250f2cd5796047ca8370c45c70a2a1279fe0c37c061116d9a525: digest: sha256:15abdf6cd20e250f2cd5796047ca8370c45c70a2a1279fe0c37c061116d9a525 size: 424
2025/01/17 07:48:29 pushed blob: sha256:6579073ec343e2eff1822992b7ff52b06a067be5c14999033789394aadca585d
2025/01/17 07:48:29 pushed blob: sha256:ed26e1294c3004f5c637a31c61c0c988fde3a03baf4a989669aecca6dcc1ee8b
2025/01/17 07:48:29 pushed blob: sha256:40b0d4008b4e81651b2cf6e89af8946d4d616c267fe863f9a43100a72b585038
2025/01/17 07:48:29 ghcr.io/rust-lang/ubuntu@sha256:e04b34e48954230514d2d2178f4eb0f9629caa21700c263680fc8cc1e0b87b71: digest: sha256:e04b34e48954230514d2d2178f4eb0f9629caa21700c263680fc8cc1e0b87b71 size: 424
2025/01/17 07:48:29 ghcr.io/rust-lang/ubuntu@sha256:b1a48753c24aeb7c76f5691efb6a38bd59870130eb06e57687181356545e975b: digest: sha256:b1a48753c24aeb7c76f5691efb6a38bd59870130eb06e57687181356545e975b size: 562
2025/01/17 07:48:30 pushed blob: sha256:41e9fbd89079d8e47609ae158236d59896fd2503db1ebdfef058864054170e01
2025/01/17 07:48:31 pushed blob: sha256:01a6e1f2fa08c55916aac7357ff9ee2edfa26a3c2d84d30ed307ab38682f5c57
2025/01/17 07:48:33 ghcr.io/rust-lang/ubuntu@sha256:af903bc2047b8dd70bb96aaf5a35df3b722966b7c6659cb37001535f0d643d9c: digest: sha256:af903bc2047b8dd70bb96aaf5a35df3b722966b7c6659cb37001535f0d643d9c size: 424
2025/01/17 07:48:33 pushed blob: sha256:33073d992c1ed5c3bb512bfbf0c7fad8c68b1ccf54e0c2d83321c40a42b43c0a
2025/01/17 07:48:33 ghcr.io/rust-lang/ubuntu@sha256:a5e64271c378f07afca9473a3cf82db16c7b4eabe8111835d48c9e165230f20d: digest: sha256:a5e64271c378f07afca9473a3cf82db16c7b4eabe8111835d48c9e165230f20d size: 562
Error: PUT https://ghcr.io/v2/rust-lang/ubuntu/manifests/22.04: DENIED: permission_denied: write_package
##[error]Process completed with exit code 1.

rust-timer · 2025-01-17T09:33:22Z

Finished benchmarking commit (0c2c096): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -0.4%, secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.9% | [0.9%, 0.9%] | 1 |
| Regressions ❌ (secondary) | 3.4% | [2.5%, 4.3%] | 2 |
| Improvements ✅ (primary) | -1.0% | [-1.5%, -0.5%] | 2 |
| Improvements ✅ (secondary) | -3.3% | [-5.4%, -1.2%] | 2 |
| All ❌✅ (primary) | -0.4% | [-1.5%, 0.9%] | 3 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 763.565s -> 764.62s (0.14%)
Artifact size: 326.08 MiB -> 326.02 MiB (-0.02%)

rustbot assigned SparrowLii Jan 2, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 2, 2025

Flakebi mentioned this pull request Jan 2, 2025

Tracking Issue for amdgpu target #135024

Open

12 tasks

workingjubilee assigned workingjubilee and unassigned SparrowLii Jan 3, 2025

workingjubilee added S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea? and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 4, 2025

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jan 14, 2025

workingjubilee removed the S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea? label Jan 14, 2025

matthiaskrgr mentioned this pull request Jan 14, 2025

Rollup of 8 pull requests #135506

Closed

matthiaskrgr mentioned this pull request Jan 14, 2025

Rollup of 7 pull requests #135509

Closed

workingjubilee mentioned this pull request Jan 14, 2025

Tracking issue for the "ptx-kernel" ABI #38788

Open

14 tasks

workingjubilee reviewed Jan 15, 2025

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 15, 2025

Flakebi force-pushed the amdgpu-kernel-cc branch from 68b2639 to e7e5202 Compare January 15, 2025 23:36

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 17, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 17, 2025

bors merged commit 0c2c096 into rust-lang:master Jan 17, 2025
7 checks passed

rustbot added this to the 1.86.0 milestone Jan 17, 2025

Flakebi deleted the amdgpu-kernel-cc branch January 17, 2025 09:17
