Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POD Specialization constants design update #2669

Merged
4 changes: 3 additions & 1 deletion sycl/doc/CompilerAndRuntimeDesign.md
Original file line number Diff line number Diff line change
Expand Up @@ -498,7 +498,9 @@ unit)
TBD

##### Specialization constants lowering
TBD

See [corresponding documentation](SpecializationConstants.md)


#### CUDA support

Expand Down
123 changes: 43 additions & 80 deletions sycl/doc/SpecializationConstants.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,18 +8,18 @@ with some restrictions. See this [document](https://github.com/intel/llvm/blob/s
- must work with separate compilation and linking
- must support AOT compilation

Implementaion is based on SPIR-V specialization constants. But there is one
important difference between SYCL and SPIR-V: in SYCL speciazation constants are
identified by a type ID which is mapped to a symbolic name, in SPIR-V - by an
ordinal number. This complicates the design, as the compiler
Implementation is based on SPIR-V specialization constants. But there is one
important difference between SYCL and SPIR-V: in SYCL specialization constants
are identified by a type ID which is mapped to a symbolic name, in SPIR-V - by
an ordinal number. This complicates the design, as the compiler
1) needs to propagate symbolic =\> numeric ID correspondence to the runtime
2) can assign numeric IDs only when linking due to the separate compilation

Simple source code example:

```
```c++
class MyInt32Const;
...
// ...
sycl::program p(q.get_context());
sycl::ONEAPI::experimental::spec_constant<int32_t, MyInt32Const> i32 =
p.set_spec_constant<MyInt32Const>(rt_val);
Expand All @@ -34,7 +34,7 @@ class MyInt32Const;
acc[0] = i32.get();
});
});
...
// ...
```

## Design
Expand All @@ -46,9 +46,9 @@ primitive numeric types. POD types support is described further in the document.

Key `spec_constant::get()` function implementation for the device code:

```
```c++
template <typename T, typename ID = T> class spec_constant {
...
// ...
public:
T get() const { // explicit access.
#ifdef __SYCL_DEVICE_ONLY__
Expand All @@ -66,7 +66,7 @@ recognized by a special LLVM pass later.

Compilation and subsequent linkage of device code results in a number of
`__sycl_getSpecConstantValue` calls whose arguments are symbolic spec constant
IDs. Before generating the a device binary, each linked device code LLVMIR
IDs. Before generating a device binary, each linked device code LLVMIR
module undergoes processing by the sycl-post-link tool which can run LLVMIR
passes before passing the module onto the llvm-spirv translator.

Expand All @@ -87,9 +87,9 @@ After this pass the sycl-post-link tool will output the
attaching this info to the device binary image via the offload wrapper tool as
a property set:

```
```c++
struct pi_device_binary_struct {
...
// ...
// Array of preperty sets; e.g. specialization constants symbol-int ID map is
// propagated to runtime with this mechanism.
pi_device_binary_property_set PropertySetsBegin;
Expand Down Expand Up @@ -152,7 +152,7 @@ unaware of the clang-specific built-ins.
Before JIT-ing a program, the runtime "flushes" the spec constants: it iterates
through the value map and invokes the

```
```c++
AlexeySachkov marked this conversation as resolved.
Show resolved Hide resolved
pi_result piextProgramSetSpecializationConstant(pi_program prog,
pi_uint32 spec_id,
size_t spec_size,
Expand All @@ -167,7 +167,7 @@ Plugin Interface function for each entry, taking the `spec_id` from the ID map.

Say, the POD type is

```
```c++
struct A {
int x;
float y;
Expand All @@ -181,7 +181,7 @@ struct POD {

and the user says

```
```c++
POD gold{
{
{ goldi, goldf },
Expand All @@ -200,93 +200,56 @@ and the user says
- The SpecConstants pass in the post-link will have the following IR as input (`sret` conversion is omitted for clarity):

```
%spec_const = call %struct.POD __sycl_getCompositeSpecConstantValue<POD type mangling> ("MyConst_mangled")
```
%struct.POD = type { [2 x %struct.A], i32 }
%struct.A = type { i32, float }

where `__sycl_getCompositeSpecConstantValue` is a new "intrinsic"
(in addition to `__sycl_getSpecConstantValue`) recognized by SpecConstants pass,
which creates a value of a composite (of non-primitive type) specialization constant.
It does not need a default value, because its default value consists of default
valued of its leaf specialization constants (see below).
%spec_const = call %struct.POD __sycl_getSpecConstantValue<POD type mangling> ("MyConst_mangled")
```

- after spec constant enumeration (symbolic -\> int ID translation), the SpecConstants pass
will handle the `__sycl_getCompositeSpecConstantValue`. Given the knowledge of the composite
specialization constant's type (`%struct.POD`), the pass will traverse its leaf
fields and generate 5 "primitive" spec constants using already existing SPIR-V intrinsic:
Based on the fact that `__sycl_getSpecConstantValue` returns `llvm::StructType`,
AlexeySachkov marked this conversation as resolved.
Show resolved Hide resolved
it will be replaced with a set of `__spirv_SpecConstant` calls for each member
of its return type plus one `__spirv_SpecConstantComposite` to gather members
back into a single composite. If any composite member is another composite, then
it will be also represented by number of `__spirv_SpecConstant` plus one
`__spirv_SpecConstantComposite`:

```
%gold_POD_a0x = call i32 __spirv_SpecConstant(i32 10, i32 0)
%gold_POD_a0y = call float __spirv_SpecConstant(i32 11, float 0)
%gold_POD_a1x = call i32 __spirv_SpecConstant(i32 12, i32 0)
%gold_POD_a1y = call float __spirv_SpecConstant(i32 13, float 0)
%gold_POD_A0_x = call i32 __spirv_SpecConstant(i32 10, i32 0)
%gold_POD_A0_y = call float __spirv_SpecConstant(i32 11, float 0)

%gold_POD_A0 = call %struct.A __spirv_SpecConstantComposite(i32 %gold_POD_A0_x, float %gold_POD_A0_y)

%gold_POD_A1_x = call i32 __spirv_SpecConstant(i32 12, i32 0)
%gold_POD_A1_y = call float __spirv_SpecConstant(i32 13, float 0)

%gold_POD_A1 = call %struct.A __spirv_SpecConstantComposite(i32 %gold_POD_A1_x, float %gold_POD_A1_y)

%gold_POD_A = call [2 x %struct.A] __spirv_SpecConstantComposite(%struct.A %gold_POD_A0, %struct.A %gold_POD_A1)
%gold_POD_b = call i32 __spirv_SpecConstant(i32 14, i32 0)
```

And 1 "composite"
%gold = call %struct.POD __spirv_SpecConstantComposite([2 x %struct.A] %gold_POD_A, i32 %gold_POD_b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only reason why __spirvCompositeSpecConstant accepted IDs instead of SSA values is a concern that there could be some non-trivia data flow (i.e. via store/load to alloca) of an operand, which might confuse the translator. If we can be sure that this will never happen, then using SSA with spec constant tree construction in IR seems fine.


```
%gold_POD = call %struct.POD __spirvCompositeSpecConstant<POD type mangling>(i32 10, i32 11, i32 12, i32 13, i32 14)
```

where `__spirvCompositeSpecConstant<POD type mangling>` is a new SPIR-V intrinsic which
represents creation of a composite specialization constant. Its arguments are spec
constant IDs corresponding to the leaf fields of the POD type of the constant.
Spec ID for the composite spec constant is not needed, as runtime will never use it - it will use IDs of the leaves instead.
Yet, the SPIR-V specification does not allow `SpecID` decoration for composite spec constants.
Spec ID for the composite spec constant is not needed, as runtime will never use
it - it will use IDs of the leaves instead.
Yet, the SPIR-V specification does not allow `SpecID` decoration for composite
AlexeySachkov marked this conversation as resolved.
Show resolved Hide resolved
spec constants, because its defined by its members instead.

##### The post-link tool changes

For composite specialization constants the post link tool will additionally
generate linearized list of \<leaf spec ID,type,offset,size\> tuples (descriptors),
generate linearized list of \<leaf spec ID,offset,size\> tuples (descriptors),
AlexeySachkov marked this conversation as resolved.
Show resolved Hide resolved
where each tuple describes a leaf field, and store it together with the
existing meta-information associated with the specialization constants and
passed to the runtime. Also, for a composite specialization constant there is
no ID map entry within the meta information, and the composite constant is
referenced by its symbolic ID. For example:

```
MyConst_mangled [10,int,0,4],[11,float,4,4],[12,int,8,4],[13,float,12,4],[14,int,16,4]
```

#### LLVMIR-\>SPIR-V translator
AlexeySachkov marked this conversation as resolved.
Show resolved Hide resolved

The translator aims to create the following code (pseudo-code)

```
%gold_POD_a0x = OpSpecConstant(0) [SpecId = 10]
%gold_POD_a0y = OpSpecConstant(0.0f) [SpecId = 11]
%gold_POD_a1x = OpSpecConstant(0) [SpecId = 12]
%gold_POD_a1y = OpSpecConstant(0.0f) [SpecId = 13]
%gold_POD_b = OpSpecConstant(0) [SpecId = 14]

%gold_POD_a0 = OpSpecConstantComposite(
%gold_POD_a0x // gold.a[0].x
%gold_POD_a0y // gold.a[0].y
)

%gold_POD_a1 = OpSpecConstantComposite(
%gold_POD_a1x // gold.a[1].x
%gold_POD_a1y // gold.a[1].y
)

%gold_POD = OpSpecConstantComposite(
%gold_POD_a0,
%gold_POD_a1,
%gold_POD_b // gold.b
}
MyConst_mangled [10,0,4],[11,4,4],[12,8,4],[13,12,4],[14,16,4]
```

- First, `OpSpecConstant` instructions are created using already existing mechanism for
primitive spec constants.
- Then the translator will handle `__spirvCompositeSpecConstant*` intrinsic.
It will recursively traverse the spec constant type structure in parallel with
the argument list - which is a list of primitive spec constant SpecIds.
When traversing, it will create all the intermediate OpSpecConstantComposite
instructions as well as the root one (`%gold_POD`) using simple depth-first tree
traversal with stack. This requires mapping from SpecId decoration number to
\<id\> of the corresponding OpSpecConstant instruction, but this should be pretty
straightforward.

#### SYCL runtime

First, when the runtime loads a binary it gets access to specialization
Expand Down