This repository has been archived by the owner on May 7, 2024. It is now read-only.

[RVP] add p extension spec 0.94 support #258

Open
wants to merge 44 commits into
base: riscv-gcc-experiment
Changes from 40 commits
40b707b
Initial support for P extension with spec 0.93
linsinan1995 May 31, 2021
a5a1cde
[MD] add inst: add* kadd* ukadd* sub* ksub* uksub* kabs*
linsinan1995 Apr 30, 2021
93a9b01
[MD] add inst: ave,clorv*,clrs,clz,bitrev*,cmpeq*,cmix
linsinan1995 Apr 30, 2021
0e57031
[MD] add inst: *cras* and *crsa* instructions
linsinan1995 Apr 30, 2021
a9ef042
[MD] add inst: insb, KDMBB, KDMBT, KDMTT, KDMABB, KDMABT, KDMATT, KHM…
linsinan1995 May 1, 2021
635cb16
[MD] add inst: KMADA, KMAXDA, KMADS, KMADRS, KMAXDS, KMAR64, KMMAC, K…
linsinan1995 May 1, 2021
57ef6f0
[MD] add inst: KSLL[i]W, KSLL[i]8, KSLL[i]16, KSLL[i]32, KSLRA8[.u], …
linsinan1995 May 1, 2021
42de5c6
[MD] add inst: kstas16, kstas32, kstsa16, kstsa32
linsinan1995 May 1, 2021
5cde2ca
[MD] add inst: KWMMUL[.u], MADDR32, MSUBR32, MULR64,
linsinan1995 May 1, 2021
b9b38e9
[MD] add inst: PBSAD, PBSADA, PKBB[16|32], PKBT[16|32], PKTT[16|32], …
linsinan1995 May 1, 2021
07a1cf9
[MD] add inst: [U]RADD[8|16|32|64|W], [U]RSUB[8|16|32|64|W]
linsinan1995 May 1, 2021
eb2deaa
[MD] add inst: RDOV, RSTAS[16|32], RSTSA[16|32]
linsinan1995 May 1, 2021
e417827
[MD] add inst: SCLIP8, SCLIP16, SCLIP32, SCMPLE8, SCMPLE16, SCMPLT8, …
linsinan1995 May 1, 2021
2fa7f89
[MD] add inst: SMALBB, SMALBT, SMALTT, SMAL
linsinan1995 May 1, 2021
1e47eff
[MD] add inst: SMALDA, SMALXDA, SMALDS, SMALDRS, SMALXDS
linsinan1995 May 2, 2021
ba8adde
[MD] add inst: SMAR64, UMAR64, [U]SMAX 8|16 and [U]SMIN 8|16
linsinan1995 May 2, 2021
3888d34
[MD] add inst: SMAQA, SMAQA.SU, UMAQA, UMAQA.SU
linsinan1995 May 2, 2021
7cba08f
[MD] add inst: SMBB 16|32, SMBT 16|32, SMTT 16|32
linsinan1995 May 2, 2021
b48226a
[MD] add inst: SMDS, SMDRS, SMXDS, SMDS32, SMDRS32, SMXDS32
linsinan1995 May 2, 2021
42a62ad
[MD] add inst: SMMUL[.u], SMMWB[.u], SMMWT[.u]
linsinan1995 May 2, 2021
9c682c9
[MD] add inst: SMSLDA, SMSLXDA, SMSR64, UMSR64
linsinan1995 May 2, 2021
25af87e
[MD] add inst: SMUL 8|16, SMULX 8|16, UMUL 8|16, UMULX 8|16
linsinan1995 May 2, 2021
b73c9e6
[MD] add inst: SRA[I] 8|16|32, SRL[I] 8|16, SRA[I] 8|16|32 .u, SRL[I]…
linsinan1995 May 2, 2021
7033bce
[MD] add inst: STAS 16|32, STSA 16|32
linsinan1995 May 3, 2021
a9d6ff8
[MD] add inst: SUNPKD810, SUNPKD820, SUNPKD830, SUNPKD831, SUNPKD832,…
linsinan1995 May 3, 2021
9985a3e
[MD] add inst: swap8, rev8.h
linsinan1995 May 3, 2021
f456d05
[MD] add inst: UCLIP8|16|32, UCMPLE8|16, UCMPLT8|16
linsinan1995 May 3, 2021
645fa05
[MD] add inst: ukmar64, ukmsr64, ukstas16|32, UKSTSA16|32
linsinan1995 May 3, 2021
3f95aa5
[MD] add inst: WEXTI, WEXT, URSTSA 16|32, URSTAS 16|32
linsinan1995 May 3, 2021
e83976e
[MD] add inst: KDMBB16, KDMBT16, KDMTT16, KDMABB16, KDMABT16, KDMATT16
linsinan1995 May 3, 2021
4fb52d3
[MD] add inst: KHMBB16, KHMBT16, KHMTT16, KMABB32, KMABT32, KMATT32, …
linsinan1995 May 3, 2021
c3279f8
[MD] add inst: KMADS32, KMADRS32, KMAXDS32, KMSDA32, KMSXDA32
linsinan1995 Oct 31, 2021
0fd4800
[MD] add inst: fsr, fsri, fsrw, rev
linsinan1995 Nov 9, 2021
068483d
[MD] add move pattern for vector mode
linsinan1995 Nov 9, 2021
c9b28ca
[Builtin] add intrinsic builtins
linsinan1995 Oct 31, 2021
877993c
[intrinsic header] Add interface for intrinsics
linsinan1995 Oct 31, 2021
c56f60c
[builtin] Add type conversion while expanding builtins
linsinan1995 May 10, 2021
6e85e99
[Hook] Add TARGET_VECTOR_MODE_SUPPORTED_P implementation
linsinan1995 May 10, 2021
3d67b3f
[Hook] Add even-odd pair of register support in TARGET_HARD_REGNO_MOD…
linsinan1995 May 10, 2021
da54bac
[testcases] add testcases
linsinan1995 Oct 31, 2021
f4b6264
rvp: fix wrong format in maddr32 and msubr32
linsinan1995 Apr 22, 2022
cd35a41
rvp: use gcc standard name for rv32p mac operation
linsinan1995 Apr 27, 2022
418ddbd
update testcases
linsinan1995 Apr 27, 2022
7eeadf6
rvp: update to spec 0.9.11
linsinan1995 Apr 27, 2022
6 changes: 6 additions & 0 deletions gcc/common/config/riscv/riscv-common.c
@@ -68,6 +68,9 @@ riscv_implied_info_t riscv_implied_info[] =
{"zks", "zksh"},
{"zks", "zkg"},
{"zks", "zkb"},
{"p", "zbpbo"},
{"p", "zpn"},
{"p", "zpsf"},

zprv?

Contributor Author

The mapping here means the p flag in --with-arch=rv{xlen}gcp or -march=rv{xlen}gcp can be expanded to rv{xlen}gc_zpn_zpsf.

I agree. Still, I think that zprv should appear when xlen=64.
When I run riscv64-unknown-elf-gcc -march=rv64imcbp -mabi=lp64 ..... on a compiler built from this PR, I get

rv64up.c: warning: implicit declaration of function '__rv__kadd32' [-Wimplicit-function-declaration]
    |   if (0x0000000000000000 != __rv__kadd32(0x0000000000000000, 0x0000000000000000)) return 536;
    |                             ^~~~~~~~~~~~

This happens because rv64imcbp expands to rv64imcb_zpn_zpsf, whereas it should expand to rv64imcb_zpn_zpsf_zprv, because the spec requires zprv on xlen=64 (I simplify a bit: I didn't expand b into its Z sub-extensions here because that is not the issue).
If I do not treat warnings as errors, it then fails at link time with:

objs/rv64imcb@pcsh-bm1_rv64up/rv64up.o: in function `.L0 ': undefined reference to `__rv__kadd32'

I faced a more general problem with this PR: when I compile my code with riscv64-unknown-elf-gcc -march=rv64imcb -mabi=lp64 ..... (i.e. without p-ext), it fails at link time with this error:

objs/rv64imcb@pcsh-bm1_rv64up/rv64up.o: in function `.L0 ': undefined reference to `__builtin_riscv_kadd16'

I have my own p-ext intrinsic file, which generally looks like this

#ifdef __riscv_zpn
static inline uintXLEN_t __rv__insn(uintXLEN_t rs1, uintXLEN_t rs2) { uintXLEN_t rd; __asm__("insn %0, %1, %2" : "=r"(rd) : "r"(rs1), "r"(rs2)); return rd; }
#else
static inline uintXLEN_t __rv__insn(uintXLEN_t rs1, uintXLEN_t rs2) { uintXLEN_t rd; SOME_GENERIC_CODE; return rd; }
#endif

(with different variations of types, number of arguments, clobbers and so on). I did so because I am not an expert in the machine description language, and it also seems to be a common way to implement intrinsics (e.g. https://github.com/rvkrypto/rvkrypto-fips/blob/main/rvkintrin.h).

So my intrinsics produce assembly with the specified insn if p-ext is enabled, and otherwise call (or inline, depending on the optimization level) some generic-code functions.
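For illustration, here is one concrete instance of that pattern, written as a minimal sketch. The name my_kadd16, the helper sat16, and the choice of XLEN=64 are assumptions made for this example and do not come from the PR or from the commenter's header. The generic path models only the lane-wise signed saturation of kadd16, not the overflow flag.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: XLEN is 64, so a register holds four lanes. */
typedef uint64_t uintXLEN_t;
static_assert(sizeof(uintXLEN_t) == 8, "sketch assumes a 64-bit XLEN");

/* Hypothetical helper: clamp a 32-bit intermediate to the int16_t range. */
static inline int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

#ifdef __riscv_zpn
/* P-extension path: emit the instruction directly. */
static inline uintXLEN_t my_kadd16(uintXLEN_t rs1, uintXLEN_t rs2)
{
    uintXLEN_t rd;
    __asm__("kadd16 %0, %1, %2" : "=r"(rd) : "r"(rs1), "r"(rs2));
    return rd;
}
#else
/* Generic path: lane-wise signed saturating add in plain C.
   The overflow flag that the real instruction sets is not modelled. */
static inline uintXLEN_t my_kadd16(uintXLEN_t rs1, uintXLEN_t rs2)
{
    uintXLEN_t rd = 0;
    for (unsigned lane = 0; lane < sizeof(uintXLEN_t) / 2; lane++) {
        int16_t a = (int16_t)(uint16_t)(rs1 >> (16 * lane));
        int16_t b = (int16_t)(uint16_t)(rs2 >> (16 * lane));
        uint16_t r = (uint16_t)sat16((int32_t)a + (int32_t)b);
        rd |= (uintXLEN_t)r << (16 * lane);
    }
    return rd;
}
#endif
```

With a header shaped like this, the same source compiles with or without p-ext in -march, which is the behaviour asked about here.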

When I found this PR, I expected it to work the same way and, in addition, to allow some generic code to be optimized by emitting p-ext insns.
As you mentioned in another topic, "generic code" should use the very specific types defined in the same file as the intrinsics, so we can hardly expect such insns to appear in code that uses common scalar types, the way b-ext insns do. That is fine with me: I understand that the b-ext machine descriptions implement simple operations (e.g. bit set), whereas this PR operates on packed types, so mapping a set of scalar-type insns to a p-ext insn is either very hard or impossible in most cases. Still, I don't understand why the intrinsics do not work (that is, do not provide generic code) when p-ext is not enabled, or what I should do if I want to compile a file that includes this intrinsic header, using this compiler, without p-ext insns.

Contributor Author

@linsinan1995 linsinan1995 Aug 11, 2021

Hi @marcfedorow,

This happens because rv64imcbp expands to rv64imcb_zpn_zpsf, whereas it should expand to rv64imcb_zpn_zpsf_zprv, because the spec requires zprv on xlen=64 (I simplify a bit: I didn't expand b into its Z sub-extensions here because that is not the issue).

Thank you for pointing it out. I added a fix so that the p flag expands differently depending on XLEN (rv32p expands to rv32_zpn_zpsf, and rv64p expands to rv64_zpn_zprv_zpsf).
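To make the XLEN-dependent expansion concrete, here is a minimal stand-alone sketch. It is not the code from this PR: the struct implied_ext, its xlen field, and expand_implied are hypothetical names invented for illustration, and the table lists only the three sub-extensions named in this comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: an xlen of 0 means "implied for any XLEN". */
struct implied_ext {
    const char *ext;
    const char *implied;
    int xlen;
};

static const struct implied_ext implied_table[] = {
    {"p", "zpn",  0},
    {"p", "zprv", 64},   /* implied only on RV64 */
    {"p", "zpsf", 0},
    {NULL, NULL, 0}
};

/* Write "_<implied>" into `out` for every table entry that matches both
   `ext` and `xlen`. Returns the number of sub-extensions written. */
static int expand_implied(const char *ext, int xlen, char *out, size_t out_size)
{
    int count = 0;
    assert(out_size > 0);
    out[0] = '\0';
    for (const struct implied_ext *e = implied_table; e->ext != NULL; e++) {
        if (strcmp(e->ext, ext) != 0)
            continue;
        if (e->xlen != 0 && e->xlen != xlen)
            continue;
        /* Stop rather than overflow the caller's buffer
           (+2 covers the underscore and the terminating NUL). */
        if (strlen(out) + strlen(e->implied) + 2 > out_size)
            break;
        strcat(out, "_");
        strcat(out, e->implied);
        count++;
    }
    return count;
}
```

Calling expand_implied("p", 32, ...) yields the suffix for rv32_zpn_zpsf, and passing 64 yields the suffix for rv64_zpn_zprv_zpsf, matching the two expansions described above.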

with different variations of types, number of arguments, clobbers and so on). I did so because I am not an expert in the machine description language, and it also seems to be a common way to implement intrinsics (e.g. https://github.com/rvkrypto/rvkrypto-fips/blob/main/rvkintrin.h).
So my intrinsics produce assembly with the specified insn if p-ext is enabled, and otherwise call (or inline, depending on the optimization level) some generic-code functions.

We have no such plan so far, since it would require a great deal of work. I think it is a good idea, though, so I will forward your message to other PLCT members at a future meetup.

As you mentioned in another topic, "generic code" should use the very specific types defined in the same file as the intrinsics, so we can hardly expect such insns to appear in code that uses common scalar types, the way b-ext insns do. That is fine with me: I understand that the b-ext machine descriptions implement simple operations (e.g. bit set), whereas this PR operates on packed types, so mapping a set of scalar-type insns to a p-ext insn is either very hard or impossible in most cases. Still, I don't understand why the intrinsics do not work (that is, do not provide generic code) when p-ext is not enabled, or what I should do if I want to compile a file that includes this intrinsic header, using this compiler, without p-ext insns.

The vector type int16x4_t does not need to come from the p-ext header, since GCC supports vector types natively (https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html). Intrinsics might be the only option if you use a scalar type in place of a vector type (e.g. storing int16x4_t data in an int64_t field), but p-ext insns can also be generated from scalar code through auto-vectorization, e.g.

#include <rvp_intrinsic.h>
#include <stdint.h>

typedef short v4hi __attribute__((vector_size (8)));

v4hi v_sadd16_spn (v4hi ra, v4hi rb)
{
  return ra + rb;
}

int16_t *v_sadd16_arr (int16_t *ra, int16_t *rb, int len)
{
  for (int i = 0; i < len; i++)
    ra[i] += rb[i];
  return ra;
}

riscv64-unknown-elf-gcc -S -O3 add16.c

	.file	"add16.c"
	.option nopic
	.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c0p0_p2p0_zpn2p0_zprv2p0_zpsf2p0"
	.attribute unaligned_access, 0
	.attribute stack_align, 16
	.text
	.align	1
	.globl	v_sadd16_spn
	.type	v_sadd16_spn, @function
v_sadd16_spn:
	add16	a0, a0, a1
	ret
	.size	v_sadd16_spn, .-v_sadd16_spn
	.align	1
	.globl	v_sadd16_arr
	.type	v_sadd16_arr, @function
v_sadd16_arr:
	ble	a2,zero,.L4
	addi	a5,a1,2
	addiw	a3,a2,-1
	sub	a5,a0,a5
	sext.w	a4,a3
	sltiu	a5,a5,5
	li	a6,5
	xori	a5,a5,1
	sgtu	a4,a4,a6
	and	a5,a4,a5
	sext.w	t1,a2
	beq	a5,zero,.L5
	or	a5,a1,a0
	andi	a5,a5,7
	bne	a5,zero,.L5
	srliw	a7,t1,2
	slli	a7,a7,3
	mv	a5,a0
	mv	a3,a1
	add	a7,a7,a0
.L6:
	ld	a4,0(a5)
	ld	a6,0(a3)
	addi	a5,a5,8
	addi	a3,a3,8
	add16	a4, a4, a6
	sd	a4,-8(a5)
	bne	a5,a7,.L6
	andi	a5,t1,-4
	mv	a4,a5
	beq	t1,a5,.L4
	slli	a5,a5,32
	srli	a5,a5,31
	add	a7,a0,a5
	add	a3,a1,a5
	lhu	a6,0(a3)
	lhu	t1,0(a7)
	addiw	a3,a4,1
	addw	a6,a6,t1
	sh	a6,0(a7)
	bge	a3,a2,.L4
	addi	a3,a5,2
	add	a6,a0,a3
	add	a3,a1,a3
	lhu	a3,0(a3)
	lhu	a7,0(a6)
	addiw	a4,a4,2
	addw	a3,a3,a7
	sh	a3,0(a6)
	bge	a4,a2,.L4
	addi	a5,a5,4
	add	a4,a0,a5
	add	a1,a1,a5
	lhu	a5,0(a1)
	lhu	a3,0(a4)
	addw	a5,a5,a3
	sh	a5,0(a4)
	ret
.L5:
	slli	a5,a3,32
	srli	a3,a5,31
	addi	a2,a0,2
	mv	a5,a0
	add	a2,a2,a3
.L8:
	lhu	a4,0(a5)
	lhu	a3,0(a1)
	addi	a5,a5,2
	addi	a1,a1,2
	addw	a4,a4,a3
	sh	a4,-2(a5)
	bne	a5,a2,.L8
.L4:
	ret
	.size	v_sadd16_arr, .-v_sadd16_arr
	.ident	"GCC: (GNU) 10.2.0"
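The scalar-versus-vector point can also be sketched for callers that keep packed data in a 64-bit integer. This is an illustrative assumption about how such code might be written, not part of the PR: add16_vec and add16_scalar are hypothetical names, and only GCC's generic vector extension is used, so the example compiles without the p-ext header. Whether the compiler then selects add16 depends on the target and options.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four 16-bit lanes in 8 bytes, using GCC's generic vector extension. */
typedef int16_t v4hi __attribute__((vector_size(8)));

/* Lane-wise add expressed on the vector type. */
static inline v4hi add16_vec(v4hi a, v4hi b)
{
    return a + b;
}

/* The same operation for data held in a 64-bit scalar. memcpy
   reinterprets the bits without breaking the aliasing rules. */
static inline uint64_t add16_scalar(uint64_t a, uint64_t b)
{
    static_assert(sizeof(v4hi) == sizeof(uint64_t), "v4hi must be 64 bits");
    v4hi va, vb, vr;
    uint64_t r;
    memcpy(&va, &a, sizeof va);
    memcpy(&vb, &b, sizeof vb);
    vr = add16_vec(va, vb);
    memcpy(&r, &vr, sizeof r);
    return r;
}
```

Because the bits are copied in and out the same way, the result does not depend on byte order as long as no lane overflows.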

I hope I have answered all your questions. Thanks again.

{NULL, NULL}
linsinan1995 marked this conversation as resolved.
};

@@ -833,6 +836,9 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
{"zksed", &gcc_options::x_riscv_crypto_subext, MASK_ZKSED},
{"zksh", &gcc_options::x_riscv_crypto_subext, MASK_ZKSH},

{"zbpbo", &gcc_options::x_riscv_rvp_subext, MASK_ZBPBO},
{"zpn", &gcc_options::x_riscv_rvp_subext, MASK_ZPN},
{"zpsf", &gcc_options::x_riscv_rvp_subext, MASK_ZPSF},
{NULL, NULL, 0}
};

1 change: 1 addition & 0 deletions gcc/config.gcc
@@ -525,6 +525,7 @@ pru-*-*)
;;
riscv*)
cpu_type=riscv
extra_headers="rvp_intrinsic.h"

I suggest rvpintrin.h, similar to rvkintrin.h.

extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o"
d_target_objs="riscv-d.o"
;;
70 changes: 70 additions & 0 deletions gcc/config/riscv/constraints.md
@@ -81,3 +81,73 @@
A constant @code{move_operand}."
(and (match_operand 0 "move_operand")
(match_test "CONSTANT_P (op)")))

(define_constraint "u02"
"Unsigned immediate 2-bit value"
(and (match_code "const_int")
(match_test "ival < (1 << 2) && ival >= 0")))

(define_constraint "u03"
"Unsigned immediate 3-bit value"
(and (match_code "const_int")
(match_test "ival < (1 << 3) && ival >= 0")))

(define_constraint "u04"
"Unsigned immediate 4-bit value"
(and (match_code "const_int")
(match_test "ival < (1 << 4) && ival >= 0")))

(define_constraint "u05"
"Unsigned immediate 5-bit value"
(and (match_code "const_int")
(match_test "ival < (1 << 5) && ival >= 0")))

(define_constraint "u06"
"Unsigned immediate 6-bit value"
(and (match_code "const_int")
(match_test "ival < (1 << 6) && ival >= 0")))

(define_constraint "C00"
"Constant value 0"
(and (match_code "const_int")
(match_test "ival == 0")))

(define_constraint "C01"
"Constant value 1"
(and (match_code "const_int")
(match_test "ival == 1")))

(define_constraint "C02"
"Constant value 2"
(and (match_code "const_int")
(match_test "ival == 2")))

(define_constraint "C03"
"Constant value 3"
(and (match_code "const_int")
(match_test "ival == 3")))

(define_constraint "C04"
"Constant value 4"
(and (match_code "const_int")
(match_test "ival == 4")))

(define_constraint "C08"
"Constant value 8"
(and (match_code "const_int")
(match_test "ival == 8")))

(define_constraint "D07"
"A constraint that matches the integers 2^(0...7)."
(and (match_code "const_int")
(match_test "(unsigned) exact_log2 (ival) <= 7")))

(define_constraint "C15"
"Constant value 15"
(and (match_code "const_int")
(match_test "ival == 15")))

(define_constraint "C16"
"Constant value 16"
(and (match_code "const_int")
(match_test "ival == 16")))
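As a reading aid for the match_test expressions above, the following sketch restates two of them in plain C. It is an illustration only: fits_unsigned_imm, exact_log2_sketch and is_pow2_up_to_128 are hypothetical names, and exact_log2_sketch is a stand-in that assumes GCC's exact_log2 returns -1 for any input that is not an exact power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors "ival < (1 << N) && ival >= 0" from the u02..u06 constraints.
   Taking N as a parameter is a convenience of this sketch. */
static bool fits_unsigned_imm(int64_t ival, unsigned bits)
{
    assert(bits > 0 && bits < 63);
    return ival >= 0 && ival < ((int64_t)1 << bits);
}

/* Stand-in for exact_log2 (assumed behaviour: log2 of an exact power
   of two, otherwise -1). */
static int exact_log2_sketch(uint64_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Mirrors "(unsigned) exact_log2 (ival) <= 7" from the D07 constraint.
   The cast turns -1 into UINT_MAX, so a single comparison rejects both
   non-powers of two and powers of two above 2^7. */
static bool is_pow2_up_to_128(int64_t ival)
{
    return (unsigned)exact_log2_sketch((uint64_t)ival) <= 7;
}
```

Under that assumption about exact_log2, zero is already rejected by the cast-and-compare form, since it maps to -1 like any other non-power of two.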
81 changes: 81 additions & 0 deletions gcc/config/riscv/predicates.md
@@ -212,3 +212,84 @@
{
return riscv_gpr_save_operation_p (op);
})

(define_predicate "imm2u_operand"
(and (match_operand 0 "const_int_operand")
(match_test "satisfies_constraint_u02 (op)")))

(define_predicate "imm3u_operand"
(and (match_operand 0 "const_int_operand")
(match_test "satisfies_constraint_u03 (op)")))

(define_predicate "imm4u_operand"
(and (match_operand 0 "const_int_operand")
(match_test "satisfies_constraint_u04 (op)")))

(define_predicate "imm5u_operand"
(and (match_operand 0 "const_int_operand")
(match_test "satisfies_constraint_u05 (op)")))

(define_predicate "imm6u_operand"
(and (match_operand 0 "const_int_operand")
(match_test "satisfies_constraint_u06 (op)")))

(define_predicate "rimm3u_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "imm3u_operand")))

(define_predicate "rimm4u_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "imm4u_operand")))

(define_predicate "rimm5u_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "imm5u_operand")))

(define_predicate "rimm6u_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "imm6u_operand")))

(define_predicate "const_insb64_operand"
(and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 7)")))

(define_predicate "imm_1_2_4_8_operand"
(and (match_operand 0 "const_int_operand")
(ior (ior (match_test "satisfies_constraint_C01 (op)")
(match_test "satisfies_constraint_C02 (op)"))
(ior (match_test "satisfies_constraint_C04 (op)")
(match_test "satisfies_constraint_C08 (op)")))))

(define_predicate "pwr_7_operand"
(and (match_code "const_int")
(match_test "INTVAL (op) != 0
&& (unsigned) exact_log2 (INTVAL (op)) <= 7")))

(define_predicate "imm_0_1_operand"
(and (match_operand 0 "const_int_operand")
(ior (match_test "satisfies_constraint_C00 (op)")
(match_test "satisfies_constraint_C01 (op)"))))

(define_predicate "imm_1_2_operand"
(and (match_operand 0 "const_int_operand")
(ior (match_test "satisfies_constraint_C01 (op)")
(match_test "satisfies_constraint_C02 (op)"))))

(define_predicate "imm_2_3_operand"
(and (match_operand 0 "const_int_operand")
(ior (match_test "satisfies_constraint_C02 (op)")
(match_test "satisfies_constraint_C03 (op)"))))

(define_predicate "imm_15_16_operand"
(and (match_operand 0 "const_int_operand")
(ior (match_test "satisfies_constraint_C15 (op)")
(match_test "satisfies_constraint_C16 (op)"))))

(define_predicate "rev_rimm_operand"
(ior (match_operand 0 "const_arith_operand")
(match_test "INTVAL (op) == (BITS_PER_WORD - 1)")))

(define_predicate "fsr_shamt_imm"
(ior (match_operand 0 "register_operand")
(and (match_operand 0 "const_arith_operand")
(match_test "IN_RANGE (INTVAL (op), 1, 31)"))))
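The range tests in these predicates can be read as a single unsigned comparison. The sketch below is an assumption-laden illustration rather than GCC's code: in_range is a hypothetical function written in the style that GCC's IN_RANGE macro is believed to use, and is_fsr_shamt_imm models only the immediate half of fsr_shamt_imm (the register alternative is not represented).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range check done with one unsigned comparison: values
   below `lower` wrap around to very large numbers and are rejected. */
static bool in_range(int64_t value, int64_t lower, int64_t upper)
{
    assert(lower <= upper);
    return (uint64_t)value - (uint64_t)lower
           <= (uint64_t)upper - (uint64_t)lower;
}

/* Immediate half of fsr_shamt_imm: shift amounts from 1 to 31. */
static bool is_fsr_shamt_imm(int64_t ival)
{
    return in_range(ival, 1, 31);
}
```

The same helper with bounds 0 and 7 corresponds to the test in const_insb64_operand.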