Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

riscv_vector.h intrinsics should be target-gated, not preprocessor-gated #56592

Open
malaterre opened this issue Jul 18, 2022 · 10 comments
Open
Labels
backend:RISC-V clang:headers Headers provided by Clang, e.g. for intrinsics

Comments

@malaterre
Copy link

malaterre commented Jul 18, 2022

This is a complete duplicate of issue #56480 but for riscv-64.

Clang's intrinsics headers on Riscv-64 contain code like:

% head -20 /usr/lib/llvm-14/lib/clang/14.0.6/include/riscv_vector.h | tail -5

#ifndef __riscv_vector
#error "Vector intrinsics require the vector extension."
#endif

This means that one can only use Riscv-64 intrinsics in TUs that mark the feature as available for the entire intrinsic, e.g. via -march flags. In contrast, the x86 intrinsics are consistently defined, but tagged with __attribute__((__target__("whatever"))):

See also:

@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2022

@llvm/issue-subscribers-backend-risc-v

@kito-cheng
Copy link
Member

RISC-V didn't have define formal spec for __attribute__((__target__("whatever"))) yet, but isn't highly dependent on this I think, current I am trying to improve the intrinsic header by pragma, and that could be resolve easier once this landing.

@jan-wassenberg
Copy link

I'll be happy to update Highway for this once it lands. (This will enable runtime dispatch, i.e. only using V when it is supported, which allows deployment to any CPU without having to assume that it support V.)

@jan-wassenberg
Copy link

Any updates on this? We are seeing build errors. It seems our internal riscv_vector.h lacks the

#ifndef __riscv_vector
#error "Vector intrinsics require the vector extension."
#endif

, which is good, whereas clang 16-18 headers still appear to have it. Would be great if the #error could be removed.

The previously mentioned "Lazily add RVV C intrinsics." has indeed landed :)

@dzaima
Copy link

dzaima commented Aug 6, 2024

13b653a#diff-cad18d641a356be425d3530df6de7522de416ec8c08aed9ed751aa3e182e5e0a (clang ≥19) removed the #error, which should mean this can be closed?

@jan-wassenberg
Copy link

Thanks for the heads-up. We're adding the required target attributes in the above change, would be good to verify it works in Godbolt before we consider this closed.
Godbolt usually updates overnight, so I'll check back tomorrow.

copybara-service bot pushed a commit to google/highway that referenced this issue Aug 7, 2024
copybara-service bot pushed a commit to google/highway that referenced this issue Aug 7, 2024
copybara-service bot pushed a commit to google/highway that referenced this issue Aug 7, 2024
copybara-service bot pushed a commit to google/highway that referenced this issue Aug 7, 2024
@dzaima
Copy link

dzaima commented Aug 7, 2024

Oh, looks like that the e64 & e8mf8/e16mf4/e32mf2 setvl/setvlmax intrinsics are still preprocessor-gated: https://godbolt.org/z/a6o8onbzs

Works with just manually copy-pasting them from the header, with slightly weird but understandable error messages where applicable: https://godbolt.org/z/rrvx6rof8

#if __riscv_v_elen >= 64
#define __riscv_vsetvl_e8mf8(avl) __builtin_rvv_vsetvli((size_t)(avl), 0, 5)
#define __riscv_vsetvl_e16mf4(avl) __builtin_rvv_vsetvli((size_t)(avl), 1, 6)
#define __riscv_vsetvl_e32mf2(avl) __builtin_rvv_vsetvli((size_t)(avl), 2, 7)
#define __riscv_vsetvl_e64m1(avl) __builtin_rvv_vsetvli((size_t)(avl), 3, 0)
#define __riscv_vsetvl_e64m2(avl) __builtin_rvv_vsetvli((size_t)(avl), 3, 1)
#define __riscv_vsetvl_e64m4(avl) __builtin_rvv_vsetvli((size_t)(avl), 3, 2)
#define __riscv_vsetvl_e64m8(avl) __builtin_rvv_vsetvli((size_t)(avl), 3, 3)
#endif
#define __riscv_vsetvlmax_e8mf4() __builtin_rvv_vsetvlimax(0, 6)
#define __riscv_vsetvlmax_e8mf2() __builtin_rvv_vsetvlimax(0, 7)
#define __riscv_vsetvlmax_e8m1() __builtin_rvv_vsetvlimax(0, 0)
#define __riscv_vsetvlmax_e8m2() __builtin_rvv_vsetvlimax(0, 1)
#define __riscv_vsetvlmax_e8m4() __builtin_rvv_vsetvlimax(0, 2)
#define __riscv_vsetvlmax_e8m8() __builtin_rvv_vsetvlimax(0, 3)
#define __riscv_vsetvlmax_e16mf2() __builtin_rvv_vsetvlimax(1, 7)
#define __riscv_vsetvlmax_e16m1() __builtin_rvv_vsetvlimax(1, 0)
#define __riscv_vsetvlmax_e16m2() __builtin_rvv_vsetvlimax(1, 1)
#define __riscv_vsetvlmax_e16m4() __builtin_rvv_vsetvlimax(1, 2)
#define __riscv_vsetvlmax_e16m8() __builtin_rvv_vsetvlimax(1, 3)
#define __riscv_vsetvlmax_e32m1() __builtin_rvv_vsetvlimax(2, 0)
#define __riscv_vsetvlmax_e32m2() __builtin_rvv_vsetvlimax(2, 1)
#define __riscv_vsetvlmax_e32m4() __builtin_rvv_vsetvlimax(2, 2)
#define __riscv_vsetvlmax_e32m8() __builtin_rvv_vsetvlimax(2, 3)
#if __riscv_v_elen >= 64
#define __riscv_vsetvlmax_e8mf8() __builtin_rvv_vsetvlimax(0, 5)
#define __riscv_vsetvlmax_e16mf4() __builtin_rvv_vsetvlimax(1, 6)
#define __riscv_vsetvlmax_e32mf2() __builtin_rvv_vsetvlimax(2, 7)
#define __riscv_vsetvlmax_e64m1() __builtin_rvv_vsetvlimax(3, 0)
#define __riscv_vsetvlmax_e64m2() __builtin_rvv_vsetvlimax(3, 1)
#define __riscv_vsetvlmax_e64m4() __builtin_rvv_vsetvlimax(3, 2)
#define __riscv_vsetvlmax_e64m8() __builtin_rvv_vsetvlimax(3, 3)
#endif

@dzaima
Copy link

dzaima commented Aug 7, 2024

Additionally, __riscv_v_intrinsic is also only enabled if vector is globally enabled: https://godbolt.org/z/ndTf1YM6M; gcc falls into that trap too. (unlike __riscv_v_elen & co, __riscv_v_intrinsic should always have the same value so should be safe globally)

@jan-wassenberg
Copy link

Bummer, thanks for checking :/ Do you know of anyone working on/towards this?

@dzaima
Copy link

dzaima commented Aug 20, 2024

Haven't noticed anything (but I'm just browsing random issues every now and then); @topperc / @4vtomat?

aarongable pushed a commit to chromium/chromium that referenced this issue Oct 16, 2024
https://chromium.googlesource.com/external/github.com/google/highway.git/+log/8295336dd70f..a97b5d371d69

$ git log 8295336dd..a97b5d371 --date=short --no-merges --format='%ad %ae %s'
2024-10-15 janwas Wrap tests in anon namespace to enable integration in Chromium
2024-10-14 janwas fix topology detection on RVV, where LPs can be under-reported
2024-10-10 janwas fix missing cast
2024-10-10 janwas update MODULE.bazel versions, add missing platforms
2024-10-10 janwas Minor formatting + DASSERT fixes
2024-10-08 janwas add NUMA node to topology
2024-10-07 janwas fix IsNaN codegen for clang < 18.1. Refs numpy/numpy/issues/27313
2024-10-06 siruilu use 1/8 LMUL to process Eq128Upper and Ne128Upper mask
2024-10-06 siruilu use 1/8 LMUL to process Ne128 mask
2024-10-06 siruilu use 1/8 LMUL to process Eq128 mask
2024-10-06 siruilu use 1/8 LMUL to process Lt128Upper mask
2024-09-17 raj.khem Add cmake check for deducing 32bit or 64bit RISCV
2024-10-04 janwas Fix profiler ubsan crash
2024-10-03 no-reply Fix HWY_RCAST_ALIGNED
2024-10-01 john_platts Fixed compilation error in sort_test
2024-09-30 janwas fix Github Action failure: incorrect dependency and cast
2024-09-30 siruilu use HWY_MAX instead of std::max
2024-09-30 janwas fix printf specifier warnings when profiler enabled
2024-09-27 siruilu reduce LMUL for Dup128MaskFromMaskBits
2024-09-19 janwas update Highway intro slides
2024-09-18 janwas disable RVV for older GCC/Clang. Fixes #2328, thanks @johnplatts
2024-09-10 janwas add test for already sorted inputs
2024-09-09 janwas fix NumWorkers: main thread also participates.
2024-09-09 janwas log transform for less overflow in GeometricMean
2024-09-06 john_platts Made unit testing enhancements if HWY_TEST_STANDALONE is 1
2024-09-03 john_platts Fixes for Clang 19 bugs on the Z14/Z15 targets
2024-09-03 janwas consistent stats.cc %e formatting for Mean/SD
2024-08-27 janwas disable runtime dispatch experiment on RVV, clang 19 is not yet ready. Fixes #2311
2024-08-27 janwas add vectorlite user
2024-08-26 janwas remove finished TODO
2024-08-20 paulchang Temporarily disable tests for Load/StoreInterleaved of special floats on arm7
2024-08-20 k0tran Add RVV linker flag in case of building with RVV
2024-08-19 janwas fix abort_test for limited Windows regex. Refs mesonbuild/wrapdb#1611
2024-08-16 paulchang Provide generic emulated Load/StoreInterleaved for special floats
2024-08-15 john_platts Added arithmetic assignment operators from F16/BF16
2024-08-13 john_platts Fixes to hwy/base.h and vqsort-inl.h
2024-08-13 doak Minor QoL improvements.
2024-08-12 john_platts Enabled F64 on WASM_EMU256 target
2024-08-12 janwas update user list from sourcegraph
2024-08-10 john_platts Enable AVX3_SPR F16 support with Clang 19.1 or later
2024-08-08 janwas also enable CompressStore workaround for SPR
2024-08-07 john_platts Made fixes to hwy::IsInteger
2024-08-07 janwas Experimental support for RVV runtime dispatch in clang19, refs llvm/llvm-project#56592
2024-08-02 janwas document SVE operator< issue
2024-08-02 janwas split sort_test
2024-08-02 janwas fix remaining unsafe BlendedStore in vqsort
2024-07-29 janwas add gemma.cpp to Highway users
2024-07-29 janwas add 8-bit test case for vqsort to cover MaybePartitionTwoValue. Refs #2281
2024-07-26 chris Replace BlendedStore at edges with StoreN
2024-07-23 john_platts Fix compilation errors on 32-bit PowerPC
(...)
2024-06-12 janwas add FAQ on HWY_RESTRICT
2024-06-11 janwas split up logical test into if_test and sign_test for GCC. fixes #2240, thanks @malaterre
2024-06-11 john_platts Added Clang 16/17/18 and GCC 13 to GitHub workflow
2024-06-11 john_platts Added x86_32 GCC 12/Clang 15 to GitHub workflow
2024-06-10 john_platts Fixed warning in TestForeach in transform_test.cc
2024-06-10 john_platts Made enhancements to Exp2
2024-06-07 49699333+dependabot[bot] Bump step-security/harden-runner from 2.8.0 to 2.8.1
2024-06-05 paulchang Improve testing of transform edge cases
2024-06-03 janwas opt-in to reduce set of MSVC targets. Fixes #2220
2024-06-03 janwas clarify ZEN4 enabling
2024-06-03 jdorfman Internal
2024-06-03 john_platts Made fixes to generic_ops-inl.h BitShuffle impl on big-endian
2024-06-03 janwas Disable RVV runtime dispatch. Fixes #2227
2024-06-03 janwas update signing process post 1.2 - use git archive result
2024-06-03 janwas fix incorrect DLLEXPORT, fixes #2225
2024-06-03 janwas mention key server for signing, fixes #2224
2024-05-31 john_platts Added DemoteToNearestInt and F16/F64 NearestInt ops
2024-05-31 janwas remove HWY_HAS_INCLUDE - mishandled by clang
2024-05-31 janwas fix RVV TestPartition - cap vector as in the full sort. Fixes #2078
2024-05-30 janwas add HWY_DYNAMIC_POINTER_T for use with HWY_EXPORT_T
2024-05-30 janwas fix RVV missing macro, simplify D args
2024-05-30 janwas fix ppc 32-bit build, which lacks __int128
2024-05-29 mathieu.malaterre Fix compilation of shared libraries
2024-05-29 janwas split arithmetic_test into div/saturated. Fixes #2208
2024-05-29 mathieu.malaterre Allow compilation with -Werror on GNU/Hurd system
2024-05-29 janwas move bit_set to core
2024-05-28 janwas fix non-Apple bf16 detection, fixes #2199, thanks @jgouly
2024-05-28 janwas doc: define lanes, mention new targets and ConvertScalarTo
2024-05-27 janwas temporarily delete codeql - failing due to disk space.
2024-05-24 janwas Clang RVV workarounds
2024-05-23 janwas GCC cast fix
2024-05-23 john_platts Fix ICX detection
2024-05-23 janwas Fix bf16 target and ICC compiler detection
2024-05-22 john_platts Updated Undefined implementation on NEON/PPC/Z14
2024-05-22 janwas disable failing GCC 13 RVV tests
2024-05-22 janwas Re-enable float to int test; UB is fixed in #2189
2024-05-22 49699333+dependabot[bot] --- updated-dependencies: - dependency-name: step-security/harden-runner   dependency-type: direct:production   update-type: version-update:semver-minor ...
2024-05-22 padogra Add frame of reference to Pack32/Pack64
2024-05-21 49699333+dependabot[bot] --- updated-dependencies: - dependency-name: github/codeql-action   dependency-type: direct:production   update-type: version-update:semver-patch ...
2024-05-20 john_platts Workaround for UB in x86 ConvertInRangeTo with GCC
2024-05-21 janwas simplify interleaved_test, fixes GCC AVX2/3
2024-05-17 no-reply re-enable vqsort for sanitizers on arm
2024-05-17 janwas fix GCC warnings (cast after arithmetic, but not for Iota)
2024-05-17 janwas avoid GCC "UB" in truncating cases. Fixes #2183
2024-05-17 janwas internal test infra
2024-05-16 janwas Clang RVV fixes:
2024-05-16 john_platts Updated CMakeLists.txt to compile in C++17 mode by default
2024-05-15 john_platts Fixed compilation errors with HwyInterleavedTest on x86
2024-05-10 john_platts Added support for dynamic dispatch for macOS/iOS/iPadOS on AArch64
2024-04-10 siruilu Add the synthesized, efficient Lt128 implementation.

Created with:
  roll-dep src/third_party/highway/src

Bug: 372239930
Change-Id: Ifc034f7f11b444bacdb4e244e8cdf0b47a6c0440
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5935891
Reviewed-by: Nico Weber <[email protected]>
Commit-Queue: Arthur Eubanks <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1369232}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:headers Headers provided by Clang, e.g. for intrinsics
Projects
None yet
Development

No branches or pull requests

6 participants