Michael R. Crusoe edited this page May 2, 2024 · 37 revisions

Here we draft the release notes for the next release.

Note: format is [summary] [commit hash or PR#] [author(s)]

Use the release notes helper script to generate the preliminary list. Then group the changes, review the descriptions, and look out for any entries still marked ????

The first line of the commit message is usually a good summary, but please think through each entry and (re)write a summary that helps users quickly determine whether the change would be interesting or useful to them. For example, include the name of the intrinsic/function in the summary so that users don't have to click through each commit themselves.
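As an illustration of the entry format described above, here is a minimal sketch in Python. It is not the project's release notes helper script; the names `format_entry` and `needs_review` are hypothetical, and the second sample entry is made up purely to show the `????` check.

```python
# Hypothetical helpers illustrating the "[summary] [commit hash or PR#] [author(s)]"
# entry format. This is NOT the project's actual release notes helper script.

PLACEHOLDER = "????"  # marker to look out for when reviewing the generated list


def format_entry(summary: str, ref: str, authors: list[str]) -> str:
    """Render one release-note entry as 'summary ref @author [@author ...]'."""
    handles = " ".join("@" + author.lstrip("@") for author in authors)
    return f"{summary.strip()} {ref.strip()} {handles}"


def needs_review(entry: str) -> bool:
    """Return True when an entry still contains the '????' placeholder."""
    return PLACEHOLDER in entry


if __name__ == "__main__":
    entries = [
        # Real entry taken from the list below.
        format_entry("arm neon sm3: check constant range", "3d34fcd", ["mr-c"]),
        # Made-up entry (hash and author are invented) with an unfinished summary.
        format_entry("????", "0000000", ["someone"]),
    ]
    for entry in entries:
        status = "REVIEW" if needs_review(entry) else "OK"
        print(f"{status:6} {entry}")
```

Entries flagged `REVIEW` are the ones whose summary still needs to be written by hand.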

SIMDe 0.8.2

Summary

This release starts a RISCV64-optimized implementation using the RVV1.0 vector extension! 62 of the Arm NEON intrinsics added in SIMDe 0.8.0 (from the FCVTZS/FCVTMS/FCVTPS/FCVTNS families) had to be removed because they did not exactly match the specs and real hardware. This brings us down from 100% coverage of the NEON functions to 99.07%.
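The coverage figure can be sanity-checked with a little arithmetic. The sketch below is a back-of-the-envelope check, not an official count: it assumes the 62 removed intrinsics are the only NEON functions now missing, and it derives the implied total from the rounded 99.07% figure, so the result is only approximate. The function names are hypothetical.

```python
# Back-of-the-envelope check of the NEON coverage figure.
# Assumption: the 62 removed intrinsics are the only NEON functions missing.


def implied_total(removed: int, coverage_percent: float) -> float:
    """Total function count implied by `removed` missing functions at the given coverage."""
    missing_fraction = 1.0 - coverage_percent / 100.0
    return removed / missing_fraction


def coverage(total: int, removed: int) -> float:
    """Coverage in percent when `removed` out of `total` functions are missing."""
    return 100.0 * (total - removed) / total


if __name__ == "__main__":
    estimate = round(implied_total(62, 99.07))
    # The published percentage is rounded to two decimals, so a range of nearby
    # totals is equally consistent; the estimate is only a rough order of magnitude.
    print(f"implied NEON function count: about {estimate}")
    print(f"coverage at that count: {coverage(estimate, 62):.2f}%")
```

For the authoritative numbers, see the implementation-status pages linked from the template at the end of this page.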

Details

Implementation of Arm intrinsics

NEON

  • arm neon: disable some FCVTZS/FCVTMS/FCVTPS/FCVTNS family intrinsics 339ffe4 @mr-c
  • arm neon sm3: check constant range 3d34fcd @mr-c
  • arm 32 bits: native def fixes; workarounds for gcc 22900e6 @Cuda-Chen
  • x86 implementations: allow _m128 access from SSE 114c3cd @mr-c

WASM intrinsics

  • wasm x86 impl: some were incorrectly marked SSE instead of SSE2 fee149a @mr-c

x86 intrinsics

SVML

  • SSE is good enough for native m128i and m128d types & functions 9982b27 @mr-c

XOP

  • fix some native functions 608200b @mr-c

Arch support

arm / arm64

  • arm platform: cleanup feature detection. 08c21f3 @mr-c
  • arm: enable more intrinsic functions for armv7 416091e @zengdage

RISCV64

  • Initial Support for the RISC-V Vector Extension (RVV1.0) in ARM NEON (#1130) b4e805a @eric900115
  • arm: fix errors in some neon2rvv intrinsic functions 2a548e5 @zengdage
  • arm: Add neon2rvv support in vand series intrinsics dac67f3 @howjmay
  • arm: improve performance in vabd_xxx for risc-v b63ba04 @zengdage
  • arm: improve performance in vhadd_xxx for risc-v a68fa90 @zengdage

Compiler Specific

Clang

  • detect clang versions 18 & 19 ed4a5cd @mr-c
  • arm neon clang: skip vrnd native before clang v18 e647f10 @mr-c
  • apple clang arm64: ignore SHA2 be48ef8 @mr-c

Emscripten

  • use __builtin_roundeven{f,} from version 3.1.43 onwards 4379740 @mr-c

MSVC

  • x86 test msvc: really disable warning 4799,4730 487507d @mr-c
  • sse2 MSVC _mm_pause implementation for x86 8d95f83 @mr-c
  • SSE is good enough for native m128i and m128d types & functions 9982b27 @mr-c

Testing with Docker/Podman & CI

  • CI: don't run twice on dependabot branches 70748cd @mr-c
  • upgrade to clang-17 7ab3240 @mr-c
  • test Mac arm64 0080b28 @mr-c
  • macos: report log if there is a configuration failure. df3e930 @mr-c
  • build(deps): bump actions/checkout from 3 to 4 (#1149) 9605608 @dependabot[bot]
  • build(deps): bump codecov/codecov-action from 3 to 4 25382c1 @dependabot[bot]
  • codecov: use token 2c45dd4 @mr-c
  • Add gcc arm 32bit armv8-a test in CI 72bde75 @Cuda-Chen
  • build for AMD Bulldozer version 2 9746537 @mr-c
  • Drop i386 (i686) support. (#1155) cf68aaf @junaruga
  • stop testing on GCC 5 & 6, clang 3.9 & 4 due to forced upgrade to Ubuntu 20.04 9982f10 @mr-c

Misc

  • update list of fully implemented instruction sets (#1152) b568fcd @mr-c
  • typo fixes from codespell 8639fef @mr-c
  • README.md - move CLMUL to partial, list more of the CI.yml architectures 285b50d @Torinde
  • Update README.md - link to VPCLMULQDQ; mention MSA (#1157) 517da84 @Torinde
  • Update README.md (#1156) b88a66d @mr-c
  • README: two more related projects 7429dff @mr-c

Template for next time

# Summary
## [X86](https://github.com/simd-everywhere/implementation-status/blob/main/x86.md)
### Newly added function families
### Additions to existing families
## [Neon](https://github.com/simd-everywhere/implementation-status/blob/main/neon.md)
## [MSA](https://github.com/simd-everywhere/implementation-status/blob/main/msa.md)
# Details
## Implementation of Arm intrinsics
### NEON
### SVE Intrinsics
## WASM intrinsics
## x86 intrinsics
### SSE*
### AVX
### AVX2
### AVX512
### GFNI 
### XOP
### F16C
### FMA
### SVML
## MIPS MSA intrinsics
## Arch support
### arm64
### z/Arch
### Altivec
### e2k (Elbrus)
### Power
## Testing with Docker/Podman & CI
### [Appveyor](https://ci.appveyor.com/project/nemequ/simde/history)
### [Azure](https://dev.azure.com/simd-everywhere/SIMDe/_build?definitionId=3)
### [Circle CI](https://app.circleci.com/pipelines/github/simd-everywhere/simde)
### [Cirrus CI](https://cirrus-ci.com/github/simd-everywhere/simde)
### [Local testing with Docker/Podman](https://github.com/simd-everywhere/simde/tree/master/docker#readme)
### [Drone.io](https://cloud.drone.io/simd-everywhere/simde)
### [GitHub Actions](https://github.com/simd-everywhere/simde/actions)
### [Netlify](https://app.netlify.com/sites/simde/)
### [Packit CI](https://dashboard.packit.dev/projects/github.com/simd-everywhere/simde)
### [Semaphore CI](https://nemequ.semaphoreci.com/projects/simde)
### [Travis](https://app.travis-ci.com/github/simd-everywhere/simde)
## Misc