src: enable SIMD support for buffer swap #44793

lucshi · 2022-09-26T08:05:52Z

This patch is a rework of #44578 that incorporates the accepted review comments.

Comments -> solution
The patch is too big because it adds a deps folder -> I moved all the code changes into util-inl.h and did not touch any other files or folders.
The patch only uses AVX512 SIMD instructions, which are not widespread -> I enabled an SSSE3 version alongside the AVX512 version, so that all popular platforms can benefit from this patch. Most known platforms support SSSE3.
Do not modify the gyp file -> I removed all changes to the gyp file and use attributes in the source file instead, as deps/zlib does.
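To illustrate the last two points, here is a minimal sketch of a 32-bit byte swap that enables SSSE3 for a single function through a target attribute and selects it at runtime. It is not the code from this patch: the names `Swap32Scalar`, `Swap32Ssse3`, `Swap32`, and the macro `SKETCH_HAVE_SSSE3` are hypothetical, and it assumes GCC or Clang on x86-64 (other targets fall back to the scalar loop).

```cpp
#include <cassert>
#include <cstddef>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAVE_SSSE3 1  // hypothetical macro, used only in this sketch
#endif

// Portable fallback: reverse the bytes of each 4-byte group in place.
static void Swap32Scalar(char* data, size_t nbytes) {
  for (size_t i = 0; i + 4 <= nbytes; i += 4) {
    char a = data[i];
    char b = data[i + 1];
    data[i] = data[i + 3];
    data[i + 1] = data[i + 2];
    data[i + 2] = b;
    data[i + 3] = a;
  }
}

#ifdef SKETCH_HAVE_SSSE3
// Only this function is compiled with SSSE3 enabled, so no build-file
// flags are needed. It must be called only after a runtime CPU check.
static __attribute__((target("ssse3"))) void Swap32Ssse3(char* data,
                                                         size_t nbytes) {
  // Shuffle control: output byte j takes input byte mask[j], which
  // reverses the bytes inside each of the four 32-bit lanes.
  const __m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
                                    4, 5, 6, 7, 0, 1, 2, 3);
  size_t i = 0;
  for (; i + 16 <= nbytes; i += 16) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
    v = _mm_shuffle_epi8(v, mask);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(data + i), v);
  }
  // Handle the remaining bytes (fewer than 16) with the scalar loop.
  Swap32Scalar(data + i, nbytes - i);
}
#endif

// Entry point: picks the SSSE3 path only when the running CPU reports it.
void Swap32(char* data, size_t nbytes) {
  assert(nbytes % 4 == 0);  // assumption: length is a multiple of 4
#ifdef SKETCH_HAVE_SSSE3
  if (__builtin_cpu_supports("ssse3")) {
    Swap32Ssse3(data, nbytes);
    return;
  }
#endif
  Swap32Scalar(data, nbytes);
}
```

The runtime check is what keeps the binary safe on CPUs that lack the extension; calling a function built for a given instruction set without such a check can fault with an illegal instruction. An AVX512 variant would presumably follow the same pattern with a different target string and wider registers.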

Across all platforms, the largest performance gain is more than 5x.

Platforms that support AVX512 see the best performance gain; the comparison charts are below:

Platforms that support only SSE also see a good performance gain; the charts are below:

lucshi · 2022-09-26T08:25:21Z

Submit for CI first.

nodejs-github-bot · 2022-09-26T10:29:08Z

CI: https://ci.nodejs.org/job/node-test-pull-request/46812/

mscdex

I'm still -1 on this because I'm not convinced that swapping bytes is a common enough operation to warrant the level of optimization being made here.

Additional issues:

Byte swapping functions called by all platforms are targeting avx512vbmi.

lucshi · 2022-09-26T11:24:25Z

> I'm still -1 on this because I'm not convinced that swapping bytes is a common enough operation to warrant the level of optimization being made here.
>
> Additional issues:
>
> Byte swapping functions called by all platforms are targeting avx512vbmi.

I'm not sure what the concern about the code change is. Byte swapping is an official Node.js API available to all programmers. As long as the optimization does not hurt current performance and benefits some platforms, I think it is good to have. Are there concerns about quality?

lucshi · 2022-09-28T01:33:04Z

I found the review guidelines. Could you please check whether this optimization patch meets them?

> Do not overwhelm new contributors.
>
> It is tempting to micro-optimize and make everything about relative performance, perfect grammar, or exact style matches. Do not succumb to that temptation.
>
> Focus first on the most significant aspects of the change:
>
> - Does this change make sense for Node.js?
> - Does this change make Node.js better, even if only incrementally?
> - Are there clear bugs or larger scale issues that need attending to?
> - Is the commit message readable and correct? If it contains a breaking change is it clear enough?

bnoordhuis · 2022-09-28T07:12:51Z

Accepting changes is a tradeoff, that's mentioned elsewhere in doc/contributing/pull-requests.md.

I agree with @mscdex: the benefits (performance) do not outweigh the downsides (complexity) for an uncommon operation. AVX-512 itself is still so uncommon it's probably not worth optimizing for yet.

Something I also touched upon in your other pull request is that I'm skeptical real world programs are going to see much improvement.

Most programs don't swap bytes 100% of the time and lightly sprinkling AVX-512 instructions throughout your instruction stream often reduces overall performance. It definitely increases overall power consumption.

jasnell · 2022-10-02T10:22:42Z

I'm also skeptical here. While I appreciate the code change and the perf improvement for some cases, this definitely increases the complexity quite a bit for an operation that is often not in the hot path. I'm going to say -0 on this (slightly against but not going to block, but I'm not at all convinced there's enough value)

lucshi · 2022-10-14T08:16:56Z

> Accepting changes is a tradeoff, that's mentioned elsewhere in doc/contributing/pull-requests.md.
>
> I agree with @mscdex: the benefits (performance) do not outweigh the downsides (complexity) for an uncommon operation. AVX-512 itself is still so uncommon it's probably not worth optimizing for yet.
>
> Something I also touched upon in your other pull request is that I'm skeptical real world programs are going to see much improvement.
>
> Most programs don't swap bytes 100% of the time and lightly sprinkling AVX-512 instructions throughout your instruction stream often reduces overall performance. It definitely increases overall power consumption.

Could you please provide a real-world test case? I will try to reproduce the behavior you are skeptical about.

bnoordhuis · 2022-10-14T10:02:02Z

Some good candidates: tsc, webpack, ghost, express. You should check if they or their dependencies actually do any byte swapping.

(If none of them do, that's a good indicator it's not a hot path.)

lucshi · 2022-10-17T01:36:26Z

> Some good candidates: tsc, webpack, ghost, express. You should check if they or their dependencies actually do any byte swapping.
>
> (If none of them do, that's a good indicator it's not a hot path.)

Some of them, like TypeScript and Express, are frameworks, and programmers can write any JS code on top of them. When I search GitHub for TypeScript code that invokes buffer.swap32(), I see many occurrences. Is that a sign of the popularity of buffer swapping operations?

Search link: https://github.com/search?l=TypeScript&q=buffer.swap32&type=Code

mscdex · 2022-10-17T05:35:13Z

> I see many occurrences. Is that a sign of the popularity of buffer swapping operations?
>
> Search link: https://github.com/search?l=TypeScript&q=buffer.swap32&type=Code

That link seems to show that the majority of results are just typescript definitions and not so much widespread code usage.

lucshi · 2022-10-17T05:55:27Z

It makes little sense that so many projects would define functions and not use them. I will look into the repositories listed on the first results page and trace the usage of those definitions.

jasnell · 2022-10-17T14:29:33Z

Closing as there's likely no action to take here.

src: enable SIMD support for buffer swap

224390a

nodejs-github-bot added the c++ and needs-ci labels Sep 26, 2022

lucshi marked this pull request as ready for review September 26, 2022 08:24

tniessen added the request-ci label Sep 26, 2022

github-actions bot removed the request-ci label Sep 26, 2022

mscdex suggested changes Sep 26, 2022

This was referenced Sep 27, 2022

CI Reliability 2022-09-27 nodejs/reliability#386

Open

CI Reliability 2022-09-28 nodejs/reliability#387

Open

lucshi mentioned this pull request Oct 17, 2022

Enable SIMD for Buffer hex encoding #44999

Closed

jasnell closed this Oct 17, 2022

lucshi deleted the simd-swap branch December 4, 2023 00:50
