Expressions
CoreDSL 2 inherits the expression syntax and operator precedence from the C language. Below, we introduce new operators and adapt the type rules for the arbitrary-precision integer (APInt) types.
Please also note the discussion of the effects on the associativity of APInt operations.
We reuse the normal subscript operator `[ ]` to denote a single bit access to a scalar value.
For a value of width w, index 0 refers to the least-significant bit; index w-1 to the most-significant bit.
The index expression must be a constant expression producing an unsigned value strictly less than the width of the scalar value.
The operator always yields an lvalue/rvalue of type `unsigned<1>`.
Examples:
unsigned<5> x = 5'b10101;
x[1] = x[4]; // x == 5'b10111
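The single-bit access can be modelled outside CoreDSL with plain shifts and masks. The following Python sketch is only an illustration of the semantics described above; the helper names `get_bit` and `set_bit` are hypothetical and not part of CoreDSL.

```python
def get_bit(value: int, index: int, width: int) -> int:
    """Model of `value[index]` as an rvalue: yields 0 or 1 (unsigned<1>)."""
    if not 0 <= index < width:
        raise IndexError("bit index must be less than the width")
    return (value >> index) & 1


def set_bit(value: int, index: int, bit: int, width: int) -> int:
    """Model of `value[index] = bit`: returns the updated value."""
    if not 0 <= index < width:
        raise IndexError("bit index must be less than the width")
    cleared = value & ~(1 << index)
    return (cleared | ((bit & 1) << index)) & ((1 << width) - 1)


x = 0b10101                               # unsigned<5> x = 5'b10101;
x = set_bit(x, 1, get_bit(x, 4, 5), 5)    # x[1] = x[4];
print(format(x, "05b"))                   # 10111
```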
We introduce a new operator at the same precedence level as the subscript operator, and with the following syntax:
PostfixExpression ::= '[' Expression ':' Expression ']'
The operator represents a range denoted by colon-separated operands `from` and `to`.
The size of the range must be compile-time evaluable.
To that end, the operands must adhere to one of the following cases:
- Both operands are constant expressions, i.e. they are composed only of parameters and literals. Then size = `|from - to| + 1`.
- `from` is an identifier, and `to` is of the form `from ('+'|'-') offset`, where `offset` is a constant expression. Then size = `|offset| + 1`.
- `to` is an identifier, and `from` is of the form `to ('+'|'-') offset`, where `offset` is a constant expression. Then size = `|offset| + 1`.
The operator is applicable to address space references, and expressions evaluating to a scalar value. The semantics and return types for both variants are detailed below. Note that the syntax can also be used to produce lvalues for assignments.
Let `AS` denote an address space of elements with width w.
Then `AS[from : to]` retrieves and concatenates size-many consecutive elements from the address space, returning a value of type `unsigned<size * w>`.
The element at index `from` contributes the most significant bits to the result, whereas the element at index `to` contributes the least significant bits.
In other words:
- If `from` > `to`: Little-endian
- If `from` < `to`: Big-endian
- If `from` == `to`: Single-element access, equivalent to `AS[from]`
If the range contains out-of-bounds elements for the given address space, the result is undefined.
Examples:
// architectural state
extern unsigned<8> MEM[33'd1 << 32];
// instruction behavior
unsigned<16> load_halfword_LE = MEM[addr+1 : addr];
unsigned<32> load_word_BE = MEM[addr : addr+3];
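To make the endianness rule concrete, here is a small Python model of a range access on an address space. It is a sketch under assumptions: the address space is represented as a dictionary from index to element value, and `range_size` and `load_range` are hypothetical helper names.

```python
def range_size(frm: int, to: int) -> int:
    """Number of elements covered by the range [from : to]."""
    return abs(frm - to) + 1


def load_range(mem: dict, frm: int, to: int, elem_width: int = 8) -> int:
    """Model of `AS[from : to]`: the element at `from` ends up in the
    most significant bits, the element at `to` in the least significant."""
    step = 1 if to >= frm else -1
    result = 0
    for index in range(frm, to + step, step):
        result = (result << elem_width) | mem[index]
    return result


# Four consecutive bytes of a byte-addressed memory (made-up contents).
mem = {0x100: 0x11, 0x101: 0x22, 0x102: 0x33, 0x103: 0x44}
addr = 0x100

load_halfword_le = load_range(mem, addr + 1, addr)   # MEM[addr+1 : addr]
load_word_be = load_range(mem, addr, addr + 3)       # MEM[addr : addr+3]
print(hex(load_halfword_le), hex(load_word_be))      # 0x2211 0x11223344
```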
Let `val` be a scalar value with width w.
Then `val[from : to]` extracts size-many bits from `val` and returns an `unsigned<size>` value.
- If `from` > `to`: concatenation of bits `val[from]`, ..., `val[to]`
- If `from` < `to`: concatenation of bits `val[to]`, ..., `val[from]`, i.e. order is reversed
- If `from` == `to`: equivalent to `val[from]`
If the range contains out-of-bounds bits for the given value, the result is undefined.
Examples:
unsigned<5> x; unsigned<3> y; unsigned<4> z;
x = 5'b11000;
x[1:0] = x[4:3]; // x == 5'b11011
y = x[0:2]; // y == 3'b110
for (x = 31; x > 3; x -= 4)
z = 32'hDEADBEEF[x:x-3];
// z == 0xD, 0xE, 0xA, ...
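The bit-range extraction can be sketched in Python as follows. The model assumes, in line with the `y = x[0:2]` example above, that the bit at index `from` always becomes the most significant bit of the result; `bit_range` is a hypothetical helper name.

```python
def bit_range(value: int, frm: int, to: int) -> int:
    """Model of `value[from : to]` as an rvalue."""
    step = 1 if to >= frm else -1
    result = 0
    for index in range(frm, to + step, step):
        result = (result << 1) | ((value >> index) & 1)
    return result


x = 0b11011
print(format(bit_range(x, 0, 2), "03b"))   # 110, bit order reversed
print(format(bit_range(x, 2, 0), "03b"))   # 011, natural bit order

# Nibble-wise walk over a constant, as in the loop above (x = 31, 27, ..., 7).
nibbles = [bit_range(0xDEADBEEF, hi, hi - 3) for hi in range(31, 3, -4)]
print([hex(n) for n in nibbles])
```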
The new concatenation operator `::` fits in, precedence-wise, between the bitwise OR and the logical AND operators:
ConcatExpression ::= InclusiveOrExpression
| ConcatExpression `::` InclusiveOrExpression
LogicalAndExpression ::= ConcatExpression
| LogicalAndExpression `&&` ConcatExpression
Given an application of the operator `E1 :: E2`, with `E1` producing a value of w1 bits and `E2` producing a value of w2 bits, the result is a value of type `unsigned<w1 + w2>`, with the w1 most-significant bits corresponding to `E1`, and the w2 least-significant bits corresponding to `E2`.
Examples:
unsigned<32> ieee_minus_one = 1 :: 8'd127 :: 23'b0;
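As a cross-check of the example above, the following Python sketch models `::` on (value, width) pairs and reinterprets the 32-bit result as an IEEE 754 single-precision number. The `concat` helper is hypothetical, and the literal `1` is assumed to have type `unsigned<1>`.

```python
import struct


def concat(*parts):
    """Model of `E1 :: E2 :: ...` on (value, width) pairs.
    Returns the combined (value, width); earlier parts are more significant."""
    value, width = 0, 0
    for part_value, part_width in parts:
        value = (value << part_width) | (part_value & ((1 << part_width) - 1))
        width += part_width
    return value, width


# 1 :: 8'd127 :: 23'b0  (sign bit, biased exponent, mantissa)
ieee_minus_one, width = concat((1, 1), (127, 8), (0, 23))
as_float = struct.unpack(">f", ieee_minus_one.to_bytes(4, "big"))[0]
print(width, hex(ieee_minus_one), as_float)   # 32 0xbf800000 -1.0
```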
The basic idea is that the result type is chosen, based on the operand type(s), such that no precision or sign information is lost. The operands are converted to that result type first.
We use the following declarations when presenting the type rules:
unsigned<w1> u1; unsigned<w2> u2;
signed<w1> s1; signed<w2> s2;
Expression | Result type |
---|---|
`-u1` | `signed<wr>`, wr = w1 + 1 |
`-s1` | `signed<wr>`, wr = w1 + 1 |
`~u1` | `unsigned<wr>`, wr = w1 |
`~s1` | `signed<wr>`, wr = w1 |
`!u1` | `unsigned<1>` |
`!s1` | `unsigned<1>` |
Expression | Result type |
---|---|
`u1 + u2` | `unsigned<wr>`, wr = max(w1, w2) + 1 |
`s1 + s2` | `signed<wr>`, wr = max(w1, w2) + 1 |
`s1 + u2` | `signed<wr>`, wr = max(w1, w2 + 1) + 1 |
`u1 + s2` | `signed<wr>`, wr = max(w1 + 1, w2) + 1 |
Expression | Result type |
---|---|
`u1 - u2` | `signed<wr>`, wr = max(w1 + 1, w2 + 1) |
`s1 - s2` | `signed<wr>`, wr = max(w1 + 1, w2 + 1) |
`s1 - u2` | `signed<wr>`, wr = max(w1, w2 + 1) + 1 |
`u1 - s2` | `signed<wr>`, wr = max(w1 + 1, w2) + 1 |
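The claim that the addition and subtraction result types never lose information can be checked by brute force for small widths. The Python sketch below encodes the two tables above and tests every operand pair up to 4 bits; all function names are hypothetical, and the check is an illustration rather than a proof for arbitrary widths.

```python
from itertools import product


def value_range(signed: bool, width: int) -> range:
    """All values representable by signed<width> or unsigned<width>."""
    if signed:
        return range(-(1 << (width - 1)), 1 << (width - 1))
    return range(0, 1 << width)


def add_type(s1: bool, w1: int, s2: bool, w2: int):
    """(is_signed, width) of `e1 + e2`, following the addition table."""
    if not s1 and not s2:
        return False, max(w1, w2) + 1
    # An unsigned operand needs one extra bit once it is treated as signed.
    e1 = w1 if s1 else w1 + 1
    e2 = w2 if s2 else w2 + 1
    return True, max(e1, e2) + 1


def sub_type(s1: bool, w1: int, s2: bool, w2: int):
    """(is_signed, width) of `e1 - e2`, following the subtraction table."""
    if s1 == s2:
        return True, max(w1 + 1, w2 + 1)
    e1 = w1 if s1 else w1 + 1
    e2 = w2 if s2 else w2 + 1
    return True, max(e1, e2) + 1


def never_overflows(op, type_rule, max_width: int = 4) -> bool:
    """Exhaustively test every operand pair for all widths up to max_width."""
    for s1, s2 in product([False, True], repeat=2):
        for w1, w2 in product(range(1, max_width + 1), repeat=2):
            result_signed, result_width = type_rule(s1, w1, s2, w2)
            result_values = value_range(result_signed, result_width)
            for a, b in product(value_range(s1, w1), value_range(s2, w2)):
                if op(a, b) not in result_values:
                    return False
    return True


print(never_overflows(lambda a, b: a + b, add_type))   # True
print(never_overflows(lambda a, b: a - b, sub_type))   # True
```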
Expression | Result type |
---|---|
`u1 * u2` | `unsigned<wr>`, wr = w1 + w2 |
`s1 * s2` | `signed<wr>`, wr = w1 + w2 |
`s1 * u2` | `signed<wr>`, wr = w1 + w2 |
`u1 * s2` | `signed<wr>`, wr = w1 + w2 |
Expression | Result type |
---|---|
`u1 / u2` | `unsigned<wr>`, wr = w1 |
`s1 / s2` | `signed<wr>`, wr = w1 + 1 |
`s1 / u2` | `signed<wr>`, wr = w1 |
`u1 / s2` | `signed<wr>`, wr = w1 + 1 |
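The extra bit in the rows with a signed divisor is plausibly explained by division by -1. The Python sketch below shows the two corner cases for w1 = 4, assuming that `/` truncates toward zero as in C; `trunc_div` and `fits` are hypothetical helper names.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero (C semantics, assumed here)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def fits(value: int, signed: bool, width: int) -> bool:
    """True if `value` is representable in signed<width> / unsigned<width>."""
    if signed:
        return -(1 << (width - 1)) <= value < (1 << (width - 1))
    return 0 <= value < (1 << width)


w1 = 4
most_negative = -(1 << (w1 - 1))          # -8, smallest signed<4> value
largest_unsigned = (1 << w1) - 1          # 15, largest unsigned<4> value

q_ss = trunc_div(most_negative, -1)       # s1 / s2 with s2 == -1
q_us = trunc_div(largest_unsigned, -1)    # u1 / s2 with s2 == -1
print(q_ss, fits(q_ss, True, w1), fits(q_ss, True, w1 + 1))   # 8 False True
print(q_us, fits(q_us, True, w1), fits(q_us, True, w1 + 1))   # -15 False True
```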
Expression | Result type |
---|---|
`u1 % u2` | `unsigned<wr>`, wr = min(w1, w2) |
`s1 % s2` | `signed<wr>`, wr = min(w1, w2) |
`s1 % u2` | `signed<wr>`, wr = min(w1, w2 + 1) |
`u1 % s2` | `unsigned<wr>`, wr = min(w1, max(1, w2 - 1)) |
The `%` operator is defined (as in C) to satisfy the following formula:
a = ⌊a/b⌋ * b + a%b
<=> a%b = a - ⌊a/b⌋ * b
As a `signed<1>` divisor can only be -1 or 0, the only non-error outcome in the last case (`u1 % s2` with w2 = 1) is 0. To retain consistency with the type rules for the literal 0, the result type is special-cased to `unsigned<1>`.
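The remainder rules can be checked the same way. The Python sketch below assumes that the quotient in the formula truncates toward zero, as integer division does in C99 and later, so the remainder takes the sign of the dividend. Note that Python's own `%` rounds differently, which is why `trunc_rem` is spelled out. All names are hypothetical.

```python
from itertools import product


def trunc_rem(a: int, b: int) -> int:
    """a % b with a quotient that truncates toward zero (assumed C semantics)."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return a - quotient * b


def value_range(signed: bool, width: int) -> range:
    if signed:
        return range(-(1 << (width - 1)), 1 << (width - 1))
    return range(0, 1 << width)


def rem_type(s1: bool, w1: int, s2: bool, w2: int):
    """(is_signed, width) of `e1 % e2`, following the table above."""
    if s1 and s2:
        return True, min(w1, w2)
    if s1:
        return True, min(w1, w2 + 1)
    if s2:
        return False, min(w1, max(1, w2 - 1))
    return False, min(w1, w2)


def rule_holds(max_width: int = 4) -> bool:
    """Check every operand pair (divisor != 0) for widths up to max_width."""
    for s1, s2 in product([False, True], repeat=2):
        for w1, w2 in product(range(1, max_width + 1), repeat=2):
            result_values = value_range(*rem_type(s1, w1, s2, w2))
            for a, b in product(value_range(s1, w1), value_range(s2, w2)):
                if b != 0 and trunc_rem(a, b) not in result_values:
                    return False
    return True


print(trunc_rem(-7, 2), -7 % 2)   # -1 1  (truncating vs. Python's flooring)
print(rule_holds())               # True
```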
Let X be either `&`, `|` or `^`.
Expression | Result type |
---|---|
`u1 X u2` | `unsigned<wr>`, wr = max(w1, w2) |
`s1 X s2` | `signed<wr>`, wr = max(w1, w2) |
`s1 X u2` | `signed<wr>`, wr = max(w1, w2) |
`u1 X s2` | `signed<wr>`, wr = max(w1, w2) |
Note that differently-sized operands are allowed and subject to the usual casting rules, mostly for consistency with the other arithmetic operations.
Let X be either `<<` or `>>`.
Expression | Result type |
---|---|
`u1 X u2` | `unsigned<wr>`, wr = w1 |
`s1 X s2` | `signed<wr>`, wr = w1 |
`s1 X u2` | `signed<wr>`, wr = w1 |
`u1 X s2` | `unsigned<wr>`, wr = w1 |
Right shifts (`>>`) are arithmetic right shifts if the left operand is signed, and logical right shifts otherwise.
The shift amount is not wrapped/truncated — if it exceeds the bit-size of the left operand, all of the left operand's bits are shifted "out of" the result value.
If the shift amount is negative, the shift direction is flipped, e.g. `x << (-k)` ⇔ `x >> k` and vice versa.
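A minimal Python model of these shift rules might look as follows. It assumes that the result is truncated to the width of the left operand (as the table gives wr = w1) and uses two's complement for signed values; `wrap`, `shift_left` and `shift_right` are hypothetical names.

```python
def wrap(value: int, signed: bool, width: int) -> int:
    """Keep the low `width` bits and reinterpret them as signed or unsigned."""
    value &= (1 << width) - 1
    if signed and value >> (width - 1):
        value -= 1 << width
    return value


def shift_left(value: int, amount: int, signed: bool, width: int) -> int:
    if amount < 0:                        # negative amount flips the direction
        return shift_right(value, -amount, signed, width)
    return wrap(value << amount, signed, width)


def shift_right(value: int, amount: int, signed: bool, width: int) -> int:
    if amount < 0:
        return shift_left(value, -amount, signed, width)
    # Python's >> is arithmetic for negative ints and logical for non-negative
    # ones, which matches the signed/unsigned distinction modelled here.
    return wrap(value >> amount, signed, width)


print(format(shift_left(0b1011, 2, False, 4), "04b"))    # 1100
print(shift_right(-8, 1, True, 4))                       # -4
print(format(shift_left(0b1011, -1, False, 4), "04b"))   # 0101
print(shift_right(0b1011, 7, False, 4))                  # 0
```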
The logical operators produce values of type `bool`, i.e. `unsigned<1>`.
The comparison operators `==`, `!=`, `<`, `>`, `<=`, `>=` are defined for all integer types and always produce a `bool` (`unsigned<1>`) value.
They perform a comparison based on the value represented by the operands, and do not take the operand types into account. For example, `3'sd1 == 17'b1` shall return `true`.
The conditional operator, `cond ? trueExpr : falseExpr`, expects `cond` to be an `unsigned<1>` value. Its result type is the smallest type to which `trueExpr` and `falseExpr` can be implicitly cast.
For simple assignments `x = expr`, the casting rules hold.
Combined assignments `x op= expr` for variable `x` of type `T` are valid iff `expr` is implicitly convertible to `T`. They are then evaluated as `x = (T) (x op (expr))`.
Note the explicit cast, which means that the result of `x op (expr)` may be silently truncated.
unsigned<10> x = ...;
x += 10'd1; // OK (unsigned<11> intermediate value is truncated)
x += 11'd1; // Error! Cannot assign an 11-bit value to `x` even without considering the operation
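The two cases above can be mimicked in Python. The sketch covers unsigned operands only and assumes that an `unsigned<n>` value is implicitly convertible to `unsigned<m>` exactly when n <= m, which is consistent with the error in the example; `compound_add` is a hypothetical helper.

```python
def compound_add(x: int, x_width: int, expr: int, expr_width: int) -> int:
    """Model of `x += expr` for an unsigned<x_width> variable x."""
    if expr_width > x_width:
        # Assumed conversion rule: a wider operand is not implicitly convertible.
        raise TypeError(
            f"cannot implicitly convert unsigned<{expr_width}> to unsigned<{x_width}>"
        )
    full = x + expr                       # wider intermediate result
    return full & ((1 << x_width) - 1)    # explicit cast back to T truncates


x = 1023                                  # largest unsigned<10> value
x = compound_add(x, 10, 1, 10)            # x += 10'd1;  wraps around
print(x)                                  # 0

try:
    compound_add(x, 10, 1, 11)            # x += 11'd1;
    rejected = False
except TypeError:
    rejected = True
print(rejected)                           # True
```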
The increment and decrement operators (`++`, `--`) yield the same type as the variable they are applied to. Implicit truncation may occur.
The following built-in declarations provide auxiliary functions that do not warrant a dedicated syntax.
`__static_assert(expr)` throws an error if `expr` cannot be evaluated to a non-zero integer value at compile time.
This intrinsic can only be used inside an `architectural_state` section.
The canonical use case is to allow ISA extensions to impose constraints on the elaborated values of parameters declared further up in the hierarchy, e.g.:
InstructionSet BASE {
  architectural_state {
    unsigned int XLEN;
    ...
  }
  ...
}
InstructionSet MY_EXT extends BASE {
  architectural_state {
    // this extension can only be added to 32-bit cores
    __static_assert(XLEN == 32);
    ...
  }
  ...
}
Core A provides BASE, MY_EXT {
  architectural_state { XLEN = 32; } // OK!
}
Core B provides BASE, MY_EXT {
  architectural_state { XLEN = 64; } // Can't use MY_EXT here.
}
`bitsizeof(T)` returns the minimum number of bits to store a value of type `T`, i.e. without considering padding or alignment.
For an expression `E` that evaluates to a value of type `T'`, `bitsizeof(E)` is defined as `bitsizeof(T')`.
For a type or expression `X`, `sizeof(X)` is defined as `(bitsizeof(X) + 7) / 8`.
Both intrinsics return compile-time constants, using the unsigned type with the minimal width required to represent the value.
bitsizeof(signed<17>) // = 17
bitsizeof(struct {unsigned<1> b; signed<2> c; }) // = 3
bitsizeof(3 + 4 + 5) // = 5
sizeof(unsigned<42>) // = 6
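The value 5 for `bitsizeof(3 + 4 + 5)` follows from applying the unsigned addition rule twice. The Python sketch below retraces that arithmetic, assuming that an unsuffixed literal has the narrowest unsigned type that can hold it; the helper names are hypothetical.

```python
def literal_width(value: int) -> int:
    """Assumed type of a non-negative literal: the narrowest unsigned<w>."""
    return max(1, value.bit_length())


def add_width(w1: int, w2: int) -> int:
    """Result width of `u1 + u2` for unsigned operands."""
    return max(w1, w2) + 1


def sizeof_from_bits(bits: int) -> int:
    """sizeof(X) = (bitsizeof(X) + 7) / 8, rounded down."""
    return (bits + 7) // 8


# (3 + 4) + 5: unsigned<2> + unsigned<3> -> unsigned<4>, then + unsigned<3>
width = add_width(add_width(literal_width(3), literal_width(4)), literal_width(5))
print(width)                    # 5
print(sizeof_from_bits(42))     # 6
```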
`offsetof` and `bitoffsetof` are reserved for future use.
In an instruction's behavior, `__encoding_size` is a compile-time constant denoting the number of bits in the instruction's `encoding:` specification, using the unsigned type with the minimal width required to represent the value.
In an always-block, `__encoding_size` represents the width of the last instruction word that was fetched prior to the execution of the always-block.
The intrinsic's return type is `unsigned<16>`.
The canonical example for using `__encoding_size` is modelling the implicit PC increment for a RISC-V core with support for compressed instructions:
always {
  implicit_pc {
    // to be overridden by branch instructions and other always-blocks
    PC += __encoding_size >> 3;
  }
}
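The effect of this always-block can be illustrated with a few lines of Python. The sketch assumes a 32-bit `PC` and treats the compound assignment as truncating to that width; `next_pc` is a hypothetical name.

```python
def next_pc(pc: int, encoding_size: int, pc_width: int = 32) -> int:
    """Model of `PC += __encoding_size >> 3` for a PC of pc_width bits."""
    increment = encoding_size >> 3                    # bits to bytes
    return (pc + increment) & ((1 << pc_width) - 1)   # += truncates to PC's type


print(hex(next_pc(0x1000, 16)))       # 0x1002, after a compressed instruction
print(hex(next_pc(0x1000, 32)))       # 0x1004, after a regular instruction
print(hex(next_pc(0xFFFFFFFE, 32)))   # 0x2, wraps around at 32 bits
```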