Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: enable SIMD support for buffer swap #44793

Closed
wants to merge 1 commit into from
Closed

Conversation

lucshi
Copy link

@lucshi lucshi commented Sep 26, 2022

This patch is modified from #44578 after review comments accepted.

Comments -> solution
Patch is too big that add a deps folder -> I made all the code changes into the util-inl.h file, and did not touch any other folders and files.
Patch only adopted AVX512 Simd instruction which is not widespread -> I enabled ssse version and AVX512 version at the same time. So that all popular platforms can benefit from this patch. Most of known platforms support ssse
Do not modify the gyp file -> I removed all code changes in gyp file, and uses attribute in the source code file like deps/zlib does.

For all platforms, the most performance gains is >5X.

For the platforms supporting AVX512, it achieved best performance gain, the charts for comparison as below:
image
For the platforms supporting SSE only, it also achieved good performance, charts as below:
image

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Sep 26, 2022
@lucshi lucshi marked this pull request as ready for review September 26, 2022 08:24
@lucshi
Copy link
Author

lucshi commented Sep 26, 2022

Submit for CI first.

@tniessen tniessen added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2022
@nodejs-github-bot
Copy link
Collaborator

Copy link
Contributor

@mscdex mscdex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still -1 on this because I'm not convinced that swapping bytes is a common enough operation to warrant the level of optimization being made here

Additional issues:

  • Byte swapping functions called by all platforms are targeting avx512vbmi

@lucshi
Copy link
Author

lucshi commented Sep 26, 2022

I'm still -1 on this because I'm not convinced that swapping bytes is a common enough operation to warrant the level of optimization being made here

Additional issues:

  • Byte swapping functions called by all platforms are targeting avx512vbmi

I'm not sure the concern of code change here. Byte swap is a official Node.js API for all programmers. As long as the opt patch does not hurt current performance but benefit some platforms, I think it is good to have. Is there concerns for quality?

@lucshi
Copy link
Author

lucshi commented Sep 28, 2022

Found the review guideline, could you please check if this opt patch meets this?

Do not overwhelm new contributors.

It is tempting to micro-optimize and make everything about relative performance, perfect grammar, or exact style matches. Do not succumb to that temptation.

Focus first on the most significant aspects of the change:

Does this change make sense for Node.js?
Does this change make Node.js better, even if only incrementally?
Are there clear bugs or larger scale issues that need attending to?
Is the commit message readable and correct? If it contains a breaking change is it clear enough?

@bnoordhuis
Copy link
Member

Accepting changes is a tradeoff, that's mentioned elsewhere in doc/contributing/pull-requests.md.

I agree with @mscdex the benefits (performance) do not outweigh the downsides (complexity) for an uncommon operation. AVX-512 itself is still so uncommon it's probably not worth optimizing for yet.

Something I also touched upon in your other pull request is that I'm skeptical real world programs are going to see much improvement.

Most programs don't swap bytes 100% of the time and lightly sprinkling AVX-512 instructions throughout your instruction stream often reduces overall performance. It definitely increases overall power consumption.

@jasnell
Copy link
Member

jasnell commented Oct 2, 2022

I'm also skeptical here. While I appreciate the code change and the perf improvement for some cases, this definitely increases the complexity quite a bit for an operation that is often not in the hot path. I'm going to say -0 on this (slightly against but not going to block, but I'm not at all convinced there's enough value)

@lucshi
Copy link
Author

lucshi commented Oct 14, 2022

Accepting changes is a tradeoff, that's mentioned elsewhere in doc/contributing/pull-requests.md.

I agree with @mscdex the benefits (performance) do not outweigh the downsides (complexity) for an uncommon operation. AVX-512 itself is still so uncommon it's probably not worth optimizing for yet.

Something I also touched upon in your other pull request is that I'm skeptical real world programs are going to see much improvement.

Most programs don't swap bytes 100% of the time and lightly sprinkling AVX-512 instructions throughout your instruction stream often reduces overall performance. It definitely increases overall power consumption.

Could you please provide a real test case? I will try to reproduce your skepital.

@bnoordhuis
Copy link
Member

Some good candidates: tsc, webpack, ghost, express. You should check if they or their dependencies actually do any byte swapping.

(If none of them do, that's a good indicator it's not a hot path.)

@lucshi
Copy link
Author

lucshi commented Oct 17, 2022

Some good candidates: tsc, webpack, ghost, express. You should check if they or their dependencies actually do any byte swapping.

(If none of them do, that's a good indicator it's not a hot path.)

some of them like typescript and express are frameworks, and programmers can write any JS code based on them. When I search for typescript code snippets in Github that invoke buffer.swap32(), I could see many occurances. Is that a sign of the popularity of buffer wapping operations?

Search link: https://github.com/search?l=TypeScript&q=buffer.swap32&type=Code

@mscdex
Copy link
Contributor

mscdex commented Oct 17, 2022

I could see many occurances. Is that a sign of the popularity of buffer wapping operations?

Search link: https://github.com/search?l=TypeScript&q=buffer.swap32&type=Code

That link seems to show that the majority of results are just typescript definitions and not so much widespread code usage.

@lucshi
Copy link
Author

lucshi commented Oct 17, 2022

It's a bit nosense that so many projects define functions and not use them. I will look into the project repo code listed in the first result page and trace the usage of those definitions.

@jasnell
Copy link
Member

jasnell commented Oct 17, 2022

Closing as there's likely no action to take here.

@jasnell jasnell closed this Oct 17, 2022
@lucshi lucshi deleted the simd-swap branch December 4, 2023 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants