simd: add new SIMD support for JSON escaping #9500

Merged 8 commits on Oct 25, 2024

edsiper (Member) commented Oct 18, 2024

For workloads that need JSON encoding, this brings a 30%-50% performance improvement when SIMD is available.

PostgreSQL SIMD

The header file with SIMD functionality has been taken from the PostgreSQL project, stripped down, and adapted for our specific needs.

Notes on other improvements

  • The routine that escapes characters now uses a lookup table, which greatly improves performance when Fluent Bit is built in release mode (optimizations on). This benefits all systems and architectures, with or without SIMD.

  • SIMD operations are available for architectures that implement SSE2 (Intel/AMD) or Neon (Arm) instructions. Note that AVX2 is not implemented, so there is still room for improvement.
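
As an illustration of the lookup-table idea mentioned above, here is a minimal sketch. It is not the code from this PR: the names `needs_escape`, `init_escape_table` and `json_escape` are hypothetical, and the escaping rules are reduced to control characters, the double quote and the backslash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table: nonzero means the byte cannot be copied verbatim. */
static unsigned char needs_escape[256];

static void init_escape_table(void)
{
    int c;

    memset(needs_escape, 0, sizeof(needs_escape));
    for (c = 0; c < 0x20; c++) {
        needs_escape[c] = 1;              /* control characters */
    }
    needs_escape['"'] = 1;
    needs_escape['\\'] = 1;
}

/*
 * Escape `len` bytes of `in` into `out` (capacity `cap`, NUL included).
 * Returns the number of bytes written, or -1 if `out` is too small.
 */
static int json_escape(const char *in, size_t len, char *out, size_t cap)
{
    size_t i;
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (cap == 0) {
        return -1;
    }

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) in[i];

        /* Fast path: a single table load decides whether to copy as-is. */
        if (!needs_escape[c]) {
            if (o + 1 >= cap) {
                return -1;
            }
            out[o++] = (char) c;
            continue;
        }

        /* Slow path: the longest escape (\u00XX) takes 6 bytes. */
        if (o + 6 >= cap) {
            return -1;
        }
        switch (c) {
        case '"':  out[o++] = '\\'; out[o++] = '"';  break;
        case '\\': out[o++] = '\\'; out[o++] = '\\'; break;
        case '\n': out[o++] = '\\'; out[o++] = 'n';  break;
        case '\r': out[o++] = '\\'; out[o++] = 'r';  break;
        case '\t': out[o++] = '\\'; out[o++] = 't';  break;
        default:
            o += (size_t) snprintf(out + o, 7, "\\u%04x", (unsigned int) c);
            break;
        }
    }
    out[o] = '\0';
    return (int) o;
}
```

Call `init_escape_table()` once at startup, then `json_escape()` per string. The point of the table is that the common case (a byte that needs no escaping) costs one indexed load instead of a chain of comparisons.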


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

This change to the utils write utility improves the handling
of characters that need escaping by optimizing the character check
with a lookup table.

Signed-off-by: Eduardo Silva <[email protected]>
edsiper (Member, Author) commented Oct 20, 2024

@pwhelan @leonardo-albertovich @cosmo0920 @patrick-stephens I need your help on this for workflows and overall testing:

  • I have introduced a new CMake option called FLB_SIMD (default: off)
  • SIMD operations are supported for x86_64, amd64 (SSE2) and aarch64 (Neon)
  • SIMD is enabled at build time and the backend is selected based on compiler definitions
  • A fallback mechanism exists if SIMD is not available or if it is disabled (-DFLB_SIMD=Off).
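
The build-time selection and fallback described in the list above could look roughly like the sketch below. It is an assumption-laden illustration rather than the header from this PR: `SKETCH_NO_SIMD`, `SKETCH_SIMD_SSE2`, `SKETCH_SIMD_NEON` and the function names are hypothetical, and how `FLB_SIMD` maps to compiler definitions in the real build is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Pick a backend from compiler definitions. SKETCH_NO_SIMD is a
 * hypothetical switch that forces the scalar fallback. */
#if defined(__SSE2__) && !defined(SKETCH_NO_SIMD)
#include <emmintrin.h>
#define SKETCH_SIMD_SSE2 1
#elif defined(__aarch64__) && defined(__ARM_NEON) && !defined(SKETCH_NO_SIMD)
#include <arm_neon.h>
#define SKETCH_SIMD_NEON 1
#endif

/* Scalar check, used for tails and as the fallback. */
static int byte_needs_escape(unsigned char c)
{
    return c < 0x20 || c == '"' || c == '\\';
}

/* Returns nonzero if any of the 16 bytes at `p` needs escaping. */
static int chunk_needs_escape(const unsigned char *p)
{
#if defined(SKETCH_SIMD_SSE2)
    __m128i v = _mm_loadu_si128((const __m128i *) p);
    /* SSE2 has no unsigned byte compare: min(v, 0x1f) == v means v <= 0x1f */
    __m128i ctrl = _mm_cmpeq_epi8(_mm_min_epu8(v, _mm_set1_epi8(0x1f)), v);
    __m128i quote = _mm_cmpeq_epi8(v, _mm_set1_epi8('"'));
    __m128i bslash = _mm_cmpeq_epi8(v, _mm_set1_epi8('\\'));
    __m128i any = _mm_or_si128(ctrl, _mm_or_si128(quote, bslash));

    return _mm_movemask_epi8(any) != 0;
#elif defined(SKETCH_SIMD_NEON)
    uint8x16_t v = vld1q_u8(p);
    uint8x16_t ctrl = vcltq_u8(v, vdupq_n_u8(0x20));
    uint8x16_t quote = vceqq_u8(v, vdupq_n_u8('"'));
    uint8x16_t bslash = vceqq_u8(v, vdupq_n_u8('\\'));
    uint8x16_t any = vorrq_u8(ctrl, vorrq_u8(quote, bslash));

    return vmaxvq_u8(any) != 0;
#else
    int i;

    for (i = 0; i < 16; i++) {
        if (byte_needs_escape(p[i])) {
            return 1;
        }
    }
    return 0;
#endif
}

/* Index of the first byte that needs escaping, or `len` if there is none. */
static size_t first_escape_pos(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t i = 0;

    assert(s != NULL);

    /* Skip clean 16-byte chunks quickly. */
    while (len - i >= 16 && !chunk_needs_escape(p + i)) {
        i += 16;
    }
    /* Scalar scan of the chunk that contains a hit, or of the tail. */
    while (i < len && !byte_needs_escape(p[i])) {
        i++;
    }
    return i;
}
```

A caller can copy everything before `first_escape_pos()` verbatim and only drop to per-byte escaping from that index on. Because all three branches return the same answer, the choice of backend changes speed only, not output.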

so:

  • Is there a chance that a particular architecture doesn't allow SIMD operations even though they are supported?
  • Would it be possible to ship this enabled by default on certain builds, such as containers, without introducing any potential breaking change?

comments are welcome

cosmo0920 (Contributor) commented

> Is there a chance that a particular architecture doesn't allow SIMD operations even though they are supported?

One candidate is the RISC-V vector extension ("RVV"):
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc

cosmo0920 (Contributor) commented Oct 21, 2024

> Would it be possible to ship this enabled by default on certain builds, such as containers, without introducing any potential breaking change?

To be safe, we might need two images for the PC architecture:

  • x86_64-generic: without SIMD support
  • x86_64-simd: with SIMD support

PC architectures are fragmented.
If we force SIMD support in our container images, illegal instruction errors might occur on very old instances/boxes that lack the required SIMD instructions.
That said, SSE2 has been supported since the Intel Pentium 4 and the AMD Athlon 64, so it has a history of roughly 20 years.
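
One way to reason about the illegal-instruction concern is to ask the CPU at runtime. The sketch below is not part of this PR, which selects the backend at build time only. It assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available; the function name `sketch_cpu_has_sse2` is hypothetical.

```c
/*
 * Returns 1 if the running CPU reports SSE2, 0 if it does not,
 * and -1 if this sketch has no way to check on the current target.
 */
static int sketch_cpu_has_sse2(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    /* Populate the CPU feature data before querying it. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    /* No portable query in this sketch for other targets or compilers. */
    return -1;
#endif
}
```

On x86_64 this check always succeeds, because SSE2 is part of the baseline x86_64 instruction set. A missing-SSE2 failure is therefore only possible on 32-bit x86 builds.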

edsiper added this to the Fluent Bit v3.2.0 milestone Oct 24, 2024
edsiper (Member, Author) commented Oct 25, 2024

Thanks for the feedback. Merging it for now, since the feature needs to be enabled at build time.

edsiper merged commit 12cb22e into master Oct 25, 2024
49 checks passed
edsiper deleted the utils-pack-simd branch October 25, 2024 20:35