From 9453d4269f4697186bc36694c7d598dfeeae0100 Mon Sep 17 00:00:00 2001
From: Kito Cheng
Date: Fri, 22 Dec 2023 18:02:51 +0800
Subject: [PATCH 01/12] Standard Fixed-length Vector Calling Convention Variant

This proposal outlines a new variant of the calling convention specifically designed for fixed-length vectors. The primary aim of this variant is to facilitate the passing of fixed-length vectors through vector registers. This approach is derived from the standard vector calling convention: it uses the same register conventions and the same argument passing and return value rules.

A key aspect of this variant is the introduction of ABI_VLEN, which denotes the width of a vector register within this convention. The ABI_VLEN is constrained to be no wider than the ISA's VLEN (Vector Length), ensuring compatibility while allowing for flexibility in different implementations. This parameter can be configured via compiler command line options or through function attributes in source code.

The document recommends setting the default ABI_VLEN to 128 bits, acknowledging it as a common minimal requirement while allowing the flexibility for lower VLEN (32 or 64 bits) as permitted by the ISA. This flexibility is crucial for optimizing the utilization of longer VLENs in various cores.

The proposal specifies how fixed-length vector arguments are passed based on their size relative to ABI_VLEN. Vectors smaller than ABI_VLEN are passed in a single vector argument register, while larger vectors are passed in multiple registers, following the LMUL (Length Multiplier) pattern of 2, 4, or 8, depending on their size.

Additionally, the proposal addresses the handling of structs and unions containing fixed-length vectors. Structs with members that are all fixed-length vectors follow the vector tuple type rules if they conform to size constraints. In contrast, unions with fixed-length vectors adhere to the integer calling convention.
--- riscv-cc.adoc | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index f6ab1882..171e9224 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -428,6 +428,81 @@ NOTE: `setjmp`/`longjmp` follow the standard calling convention, which clobbers all vector registers. Hence, the standard vector calling convention variant won't disrupt the `jmp_buf` ABI. +=== Standard Fixed-length Vector Calling Convention Variant + +This section defines the calling convention variant for fixed-length vectors. +The intention of this variant is to pass fixed-length vectors via the vector +register. For the definition of a fixed-length vector, see +<>. + +This variant is based on the standard vector calling convention variant: +the register convention and the rules for passing arguments and return values +are the same. + +NOTE: The reason we define a separate calling convention variant is that we +would like to define a flexible convention to utilize the variable length +feature in the vector extension, also considering embedded vector extensions, +such as zve32x. + +ABI_VLEN refers to the width of a vector register in the calling convention +variant. + +The ABI_VLEN must be no wider than the ISA's VLEN, meaning that the ISA may +support wider vector registers than the ABI, but the ABI's VLEN cannot exceed +the ISA's VLEN. + +The ABI_VLEN is a parameter of this calling convention variant. It could be set +by the command line option for the compiler or specified by the function +attribute in the source code. + +NOTE: We suggest the toolchain implementation set the default value of ABI_VLEN +to 128, as it's the most common minimal requirement. However, it is not fixed +to 128, since the ISA allows the VLEN to be only 32 bits or 64 bits. This +also enables the utilization of the capacity of longer VLEN. 
Users can build +with an optimized library with larger ABI_VLEN for better utilization of those +cores with longer VLEN. + +A fixed-length vector argument is passed in a vector argument register if the +size of the vector is less than ABI_VLEN bit. + +A fixed-length vector argument is passed in two vector argument registers, +similar to vector data arguments with LMUL=2, if the size of the vector is +greater than ABI_VLEN bit and less than or equal to 2×ABI_VLEN bit. + +A fixed-length vector argument is passed in four vector argument registers, +similar to vector data arguments with LMUL=4, if the size of the vector is +greater than 2×ABI_VLEN bit and less than or equal to 4×ABI_VLEN bit. + +A fixed-length vector argument is passed in eight vector argument registers, +similar to vector data arguments with LMUL=8, if the size of the vector is +greater than 4×ABI_VLEN bit and less than or equal to 8×ABI_VLEN bit. + +A struct containing members with all fixed-length vectors will be passed in +vector argument registers like a vector tuple type if all members have the +same length, the length is less or equal to 8×ABI_VLEN bit, and the size of +the whole struct is less than 8×ABI_VLEN bit. Otherwise, it will use the rule +defined in the hardware floating-point calling convention. + +A struct containing just one fixed-length vector array is passed as though it +were a vector tuple type if the size of the base element for the array is less +or equal to 8×ABI_VLEN bit, and the size of the array is less than 8×ABI_VLEN +bit. Otherwise, it will use the rule defined in the hardware floating-point +calling convention. + +A fixed-length vector argument is passed by reference and is replaced in the +argument list with the address if it is larger than 8×ABI_VLEN bit or if +there is a shortage of vector argument registers. + +Unions with fixed-length vectors are always passed according to the integer +calling convention. 
+ +The details of vector argument register rules are the same as the standard +vector calling convention variant. + +NOTE: Functions that use the standard fixed-length vector calling convention +variant must be marked with STO_RISCV_VARIANT_CC. See <> +for the meaning of STO_RISCV_VARIANT_CC. + === ILP32E Calling Convention IMPORTANT: RV32E is not a ratified base ISA and so we cannot guarantee the From 2df47cd350aa7e673420cb35d8401499bf23f303 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 26 Jan 2024 17:59:07 +0800 Subject: [PATCH 02/12] Minor revision - Reorder rules. - Pass a struct as a tuple type in registers only when enough vector argument registers are available; otherwise pass it by reference. - Add a NOTE describing what happens when ABI_VLEN is smaller than VLEN, with an example. - Add a NOTE explaining that different functions may use different ABI_VLEN values. --- riscv-cc.adoc | 45 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 39 insertions(+), 6 deletions(-) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 171e9224..8b53fb2e 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -477,11 +477,18 @@ A fixed-length vector argument is passed in eight vector argument registers, similar to vector data arguments with LMUL=8, if the size of the vector is greater than 4×ABI_VLEN bit and less than or equal to 8×ABI_VLEN bit. +A fixed-length vector argument is passed by reference and is replaced in the +argument list with the address if it is larger than 8×ABI_VLEN bit or if +there is a shortage of vector argument registers. + A struct containing members with all fixed-length vectors will be passed in vector argument registers like a vector tuple type if all members have the same length, the length is less or equal to 8×ABI_VLEN bit, and the size of -the whole struct is less than 8×ABI_VLEN bit. Otherwise, it will use the rule -defined in the hardware floating-point calling convention. +the whole struct is less than 8×ABI_VLEN bit. 
+If there are not enough vector argument registers to pass the entire struct, +it will pass by reference and is replaced in the argument list with the address. +Otherwise, it will use the rule defined in the hardware floating-point calling +convention. A struct containing just one fixed-length vector array is passed as though it were a vector tuple type if the size of the base element for the array is less @@ -489,10 +496,6 @@ or equal to 8×ABI_VLEN bit, and the size of the array is less than 8×ABI_VLEN bit. Otherwise, it will use the rule defined in the hardware floating-point calling convention. -A fixed-length vector argument is passed by reference and is replaced in the -argument list with the address if it is larger than 8×ABI_VLEN bit or if -there is a shortage of vector argument registers. - Unions with fixed-length vectors are always passed according to the integer calling convention. @@ -503,6 +506,36 @@ NOTE: Functions that use the standard fixed-length vector calling convention variant must be marked with STO_RISCV_VARIANT_CC. See <> for the meaning of STO_RISCV_VARIANT_CC. +[NOTE] +==== +When ABI_VLEN is smaller than the VLEN, the number of vector argument +registers utilized remains unchanged. However, in such cases, values are only +placed in a portion of these vector argument registers, corresponding to the +size of ABI_VLEN. The remaining portion of the vector argument registers, which +extends beyond the ABI_VLEN, will remain idle. This means that while the full +capacity of the vector argument registers may not be used, the allocation of +these registers do not change, ensuring consistency in register usage regardless +of the ABI_VLEN to VLEN ratio. + +Example: With ABI_VLEN at 32 bits and VLEN at 128 bits, consider passing an +`int32x4_t` parameter (four 32-bit integers). + +Allocation: Four vector argument registers are allocated for +`int32x4_t`, based on LMUL=4. 
+ +Utilization: All four integers are placed in the first vector register, +utilizing its full 128-bit capacity (VLEN), despite ABI_VLEN being 32 bits. + +Remaining Registers: The other three allocated registers remain unused and idle. +==== + +NOTE: In a single compilation unit, different functions may use different +ABI_VLEN values. This means that ABI_VLEN is not uniform across the entire unit, +allowing for function-specific optimization. However, this necessitates that +users ensure consistency in ABI_VLEN between calling and called functions. It +is the user's responsibility to verify that the ABI_VLEN matches on both sides +of a function call to ensure correct operation and data handling. + === ILP32E Calling Convention IMPORTANT: RV32E is not a ratified base ISA and so we cannot guarantee the From 4da348a5d31d329335fb7e91b0784e8782d4b212 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Mon, 29 Jan 2024 16:42:35 +0800 Subject: [PATCH 03/12] Minor revision - Add a rule for a single fixed-length vector or a fixed-length vector array of size 1. - Add a rule for zero-length fixed-length arrays. - Add an explicit rule for fixed-length vector structs passed as a vector tuple type: pass by reference if there are not enough argument registers. --- riscv-cc.adoc | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 8b53fb2e..438e2928 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -490,10 +490,21 @@ it will pass by reference and is replaced in the argument list with the address. Otherwise, it will use the rule defined in the hardware floating-point calling convention. +A struct containing just one fixed-length vector or on fixed-length vector +array of length one, it will flattening as a single fixed-length vector argument +if the size of the vector is less than or equal to 8×ABI_VLEN bit. 
+ +Struct with zero-length fixed-length arrays use the rule defined in the hardware +floating-point calling convention, which means it won't consume vector argument +register eitehr in C or {Cpp}. + A struct containing just one fixed-length vector array is passed as though it were a vector tuple type if the size of the base element for the array is less or equal to 8×ABI_VLEN bit, and the size of the array is less than 8×ABI_VLEN -bit. Otherwise, it will use the rule defined in the hardware floating-point +bit. +If there are not enough vector argument registers to pass the entire struct, +it will pass by reference and is replaced in the argument list with the address. +Otherwise, it will use the rule defined in the hardware floating-point calling convention. Unions with fixed-length vectors are always passed according to the integer From 1c1903153e1e2eacd75b7c4b1b45f91ad511d2d1 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 5 Sep 2024 15:44:43 +0800 Subject: [PATCH 04/12] Apply suggestions from code review Co-authored-by: Brandon Wu Signed-off-by: Kito Cheng --- riscv-cc.adoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 438e2928..4c34d3ba 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -483,15 +483,15 @@ there is a shortage of vector argument registers. A struct containing members with all fixed-length vectors will be passed in vector argument registers like a vector tuple type if all members have the -same length, the length is less or equal to 8×ABI_VLEN bit, and the size of -the whole struct is less than 8×ABI_VLEN bit. +same length, the length is less than or equal to 4×ABI_VLEN bit, and the size of +the whole struct is less than or equal to 8×ABI_VLEN bit. If there are not enough vector argument registers to pass the entire struct, it will pass by reference and is replaced in the argument list with the address. 
Otherwise, it will use the rule defined in the hardware floating-point calling convention. -A struct containing just one fixed-length vector or on fixed-length vector -array of length one, it will flattening as a single fixed-length vector argument +A struct containing just one fixed-length vector or a fixed-length vector +array of length one, it will be flattened as a single fixed-length vector argument if the size of the vector is less than or equal to 8×ABI_VLEN bit. Struct with zero-length fixed-length arrays use the rule defined in the hardware @@ -499,7 +499,7 @@ floating-point calling convention, which means it won't consume vector argument register eitehr in C or {Cpp}. A struct containing just one fixed-length vector array is passed as though it -were a vector tuple type if the size of the base element for the array is less +were a vector tuple type if the size of the base element for the array is less than or equal to 8×ABI_VLEN bit, and the size of the array is less than 8×ABI_VLEN bit. If there are not enough vector argument registers to pass the entire struct, From 4902cef0d41621286dc7cd09f04505b9ee15fb0c Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 2 Feb 2024 21:37:13 +0800 Subject: [PATCH 05/12] Name Mangling for Standard Calling Convention Variant --- riscv-cc.adoc | 4 ++++ riscv-elf.adoc | 28 ++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 4c34d3ba..911732f2 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -428,6 +428,10 @@ NOTE: `setjmp`/`longjmp` follow the standard calling convention, which clobbers all vector registers. Hence, the standard vector calling convention variant won't disrupt the `jmp_buf` ABI. +NOTE: Functions that use the standard vector calling convention +variant follow an additional name mangling rule for {Cpp}. +For more details, see <>. 
+ === Standard Fixed-length Vector Calling Convention Variant This section defines the calling convention variant for fixed-length vectors. diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 08d948a5..f7cd7796 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -202,6 +202,34 @@ See the "Type encodings" section in _Itanium {Cpp} ABI_ for more detail on how to mangle types. Note that `__bf16` is mangled in the same way as `std::bfloat16_t`. +=== Name Mangling for Standard Calling Convention Variant + +Function use standard calling convention variant have to append extra ABI tag to +the function name mangling, the rule are same as the "ABI tags" section in +_Itanium {Cpp} ABI_. + +.ABI Tag name for calling convention variants +[cols="5,2"] +[width=80%] +|=== +| Name | ABI tag name + +| Standard vector calling convention variant | riscv_vector_cc +|=== + + +For example: +[,c] +---- + __attribute__((riscv_vector_cc)) void foo(); +---- + +is mangled as +[,c] +---- + _Z3fooB15riscv_vector_ccv +---- + === Name Mangling for Vector Data Types, Vector Mask Types and Vector Tuple Types. The vector data types and vector mask types, as defined in the section From 76c1816f3661257caeae893a8bbd7c28914636fc Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 2 Feb 2024 21:37:38 +0800 Subject: [PATCH 06/12] Name mangling for standard fixed-length vector calling convention --- riscv-cc.adoc | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 911732f2..cb40c8ff 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -521,6 +521,10 @@ NOTE: Functions that use the standard fixed-length vector calling convention variant must be marked with STO_RISCV_VARIANT_CC. See <> for the meaning of STO_RISCV_VARIANT_CC. +NOTE: Functions that use the standard fixed-length vector calling convention +variant follow an additional name mangling rule for {Cpp}. +For more details, see <>. 
+ +[NOTE] ==== When ABI_VLEN is smaller than the VLEN, the number of vector argument From 2140aa92d6f0a808cb17afdd582deaa91b109029 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 5 Sep 2024 15:45:46 +0800 Subject: [PATCH 07/12] Tweak wording per Luke's suggestion --- riscv-cc.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index cb40c8ff..26aa9b93 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -467,7 +467,7 @@ with an optimized library with larger ABI_VLEN for better utilization of those cores with longer VLEN. A fixed-length vector argument is passed in a vector argument register if the -size of the vector is less than ABI_VLEN bit. +size of the vector is no more than ABI_VLEN bits. A fixed-length vector argument is passed in two vector argument registers, similar to vector data arguments with LMUL=2, if the size of the vector is From c26a99b693f7183c39fb2d1f93e818ba0a342b1b Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 5 Sep 2024 17:40:42 +0800 Subject: [PATCH 08/12] Minor tweak --- riscv-cc.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 26aa9b93..3d1d5c2c 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -446,7 +446,7 @@ are the same. NOTE: The reason we define a separate calling convention variant is that we would like to define a flexible convention to utilize the variable length feature in the vector extension, also considering embedded vector extensions, -such as zve32x. +such as `Zve32x`. ABI_VLEN refers to the width of a vector register in the calling convention variant. 
From 7dd1c9ed5a447ecd4c045a2bca7c2716ef8eb40f Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 5 Sep 2024 17:41:04 +0800 Subject: [PATCH 09/12] More description for ABI_VLEN --- riscv-cc.adoc | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 3d1d5c2c..7c4b081f 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -455,6 +455,12 @@ The ABI_VLEN must be no wider than the ISA's VLEN, meaning that the ISA may support wider vector registers than the ABI, but the ABI's VLEN cannot exceed the ISA's VLEN. +ABI_VLEN represents the width (in bits) of the vector register available in the +calling convention for fixed-length vectors. ABI_VLEN can vary from 32 bits +(as in `Zve32x`) up to the maximum supported by the ISA. The flexibility of +ABI_VLEN enables the convention to adapt to both low-end embedded systems and +high-performance processors that utilize wider vector registers. + The ABI_VLEN is a parameter of this calling convention variant. It could be set by the command line option for the compiler or specified by the function attribute in the source code. From 11b47766208adc3c9ee053f8aa29b67b1d97d8bd Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 5 Sep 2024 17:41:53 +0800 Subject: [PATCH 10/12] Add note to vector type with unsupported element type --- riscv-cc.adoc | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 7c4b081f..b6dc36c7 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -475,6 +475,27 @@ cores with longer VLEN. A fixed-length vector argument is passed in a vector argument register if the size of the vector is no more than ABI_VLEN bits. +[NOTE] +=== +Even in the absence of specific vector extension support for certain element +types, such as `__bf16`, `_Float16`, `float`, or `double`, the standard +fixed-length vector calling convention rules still apply. 
For example, +even without the support of extensions like `Zvfbfmin`, `Zve32f`, or `Zve64d`, +these element types will be passed according to the calling convention rules +outlined here. + +Additionally, data types such as `__int128_t`, which currently do not +have direct support in any vector extension, will also follow these rules. +This design ensures that the calling convention remains forward-compatible, +minimizing the need for continuous adjustments as new extensions and data types +are introduced in the future. + +The consistency in applying these rules to unsupported element types guarantees +a smooth transition when future vector extensions become available, allowing for +seamless integration of new features without requiring significant changes to +the calling convention. +=== + A fixed-length vector argument is passed in two vector argument registers, similar to vector data arguments with LMUL=2, if the size of the vector is greater than ABI_VLEN bit and less than or equal to 2×ABI_VLEN bit. From 7e9d68c6aa3fe2a0857fd1cf22f61a46d2d9f28f Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 5 Sep 2024 17:42:32 +0800 Subject: [PATCH 11/12] Add rule for non-power-of-2 vector --- riscv-cc.adoc | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index b6dc36c7..9d70387e 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -508,6 +508,25 @@ A fixed-length vector argument is passed in eight vector argument registers, similar to vector data arguments with LMUL=8, if the size of the vector is greater than 4×ABI_VLEN bit and less than or equal to 8×ABI_VLEN bit. +[NOTE] +=== +Fixed-length vectors that are not a power-of-2 in size will be rounded up to +the next power-of-2 length for the purpose of register allocation and handling. +For instance, a vector type like `int32x3_t` (which contains three 32-bit +integers) will be treated as an `int32x4_t` (a 128-bit vector, as LMUL=1) in +the ABI, and passed accordingly. 
This ensures consistency in how vectors are +handled and simplifies the process of argument passing. + +Example: Consider an `int32x3_t` vector (three 32-bit integers): +- The vector's total size is 96 bits, which is not a power of 2. +- The ABI will round up the size to 128 bits (corresponding to `int32x4_t`), + meaning the vector will be passed using one vector argument register when + ABI_VLEN=128. + +This rule applies to all non-power-of-2 fixed-length vectors, ensuring they +are treated consistently across different ABI_VLEN settings. +=== + A fixed-length vector argument is passed by reference and is replaced in the argument list with the address if it is larger than 8×ABI_VLEN bit or if there is a shortage of vector argument registers. From 094be883f0ac15bf23478e69a52f2dfeb8a7c66f Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Mon, 2 Dec 2024 21:27:42 +0800 Subject: [PATCH 12/12] Update riscv-cc.adoc Signed-off-by: Kito Cheng --- riscv-cc.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/riscv-cc.adoc b/riscv-cc.adoc index 9d70387e..86f15346 100644 --- a/riscv-cc.adoc +++ b/riscv-cc.adoc @@ -473,7 +473,7 @@ with an optimized library with larger ABI_VLEN for better utilization of those cores with longer VLEN. A fixed-length vector argument is passed in a vector argument register if the -size of the vector is no more than ABI_VLEN bits. +size of the vector is less than or equal to ABI_VLEN bit. [NOTE] ===
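
As a rough illustration of the size rules this series ends up with (one vector argument register for sizes up to ABI_VLEN bits, then 2, 4, or 8 registers, and pass by reference beyond 8×ABI_VLEN), the following C sketch maps a vector size to a register count. The function name `vector_arg_regs` and its convention of returning 0 for "passed by reference" are assumptions made for this example and do not come from the psABI text. The sketch also assumes ABI_VLEN is at least 32 and does not model the case where the caller runs out of vector argument registers.

```c
#include <assert.h>

/* Hypothetical helper (not defined by the psABI): returns how many vector
 * argument registers a fixed-length vector of `size_bits` bits would occupy
 * for the given ABI_VLEN, or 0 if the vector is larger than 8*ABI_VLEN and
 * would therefore be passed by reference.
 * A shortage of vector argument registers is NOT modeled here. */
static unsigned vector_arg_regs(unsigned size_bits, unsigned abi_vlen)
{
    unsigned regs;

    /* Assumption for this sketch: ABI_VLEN starts at 32 bits (Zve32x). */
    assert(abi_vlen >= 32);

    /* Try the register group sizes 1, 2, 4, 8 in order and pick the first
     * group that is wide enough to hold the whole vector. */
    for (regs = 1; regs <= 8; regs *= 2) {
        if (size_bits <= regs * abi_vlen)
            return regs;
    }

    /* Larger than 8*ABI_VLEN: passed by reference. */
    return 0;
}
```

Under these assumptions, `vector_arg_regs(128, 32)` yields 4, which matches the `int32x4_t` example with ABI_VLEN at 32 bits from patch 02, and `vector_arg_regs(96, 128)` yields 1, which matches the `int32x3_t` example from patch 11. Rounding a non-power-of-2 size up to the next power of 2 does not change the result when ABI_VLEN is itself a power of 2, so the sketch leaves that step out.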