buffer: optimize decoding wrapped base64 data #12146

aqrln · 2017-03-31T14:17:54Z

The fast base64 decoder used to switch to the slow one permanently when
it saw a whitespace or other garbage character. Since the most common
situation such characters may be encountered in is line-wrapped base64
data, a more profitable strategy is to decode a single 24-bit group with
the slow decoder and then continue running the fast algorithm.

Refs: #12114
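
To make the strategy concrete, here is a minimal JavaScript sketch of the control flow described above. It is not the C++ code from this PR: the names `UNBASE64`, `value`, `decodeGroupSlow` and `decodeWrappedBase64` are made up for illustration, and the sketch assumes Node.js for the global `Buffer`.

```javascript
// Illustrative sketch only; the real decoder lives in C++.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Lookup table: character code -> 6-bit value, or -1 for anything else.
const UNBASE64 = new Int8Array(256).fill(-1);
for (let i = 0; i < ALPHABET.length; i++) {
  UNBASE64[ALPHABET.charCodeAt(i)] = i;
}

function value(src, i) {
  const code = src.charCodeAt(i);
  return code < 256 ? UNBASE64[code] : -1;
}

// Slow path: gather up to four valid characters starting at `pos`, skipping
// whitespace and other garbage. Padding ('=') ends the data.
// Returns the position at which decoding should resume.
function decodeGroupSlow(src, pos, out) {
  let bits = 0;
  let count = 0;
  while (pos < src.length && count < 4) {
    if (src[pos] === '=') {
      pos = src.length;
      break;
    }
    const v = value(src, pos++);
    if (v < 0) continue;  // skip characters outside the alphabet
    bits = (bits << 6) | v;
    count++;
  }
  if (count === 4) {
    out.push((bits >> 16) & 0xff, (bits >> 8) & 0xff, bits & 0xff);
  } else if (count === 3) {
    out.push((bits >> 10) & 0xff, (bits >> 2) & 0xff);
  } else if (count === 2) {
    out.push((bits >> 4) & 0xff);
  }
  return pos;
}

function decodeWrappedBase64(src) {
  const out = [];
  let pos = 0;
  while (pos < src.length) {
    if (pos + 4 <= src.length) {
      const a = value(src, pos);
      const b = value(src, pos + 1);
      const c = value(src, pos + 2);
      const d = value(src, pos + 3);
      // Fast path: the OR is negative if any of the four lookups failed.
      if ((a | b | c | d) >= 0) {
        const bits = (a << 18) | (b << 12) | (c << 6) | d;
        out.push(bits >> 16, (bits >> 8) & 0xff, bits & 0xff);
        pos += 4;
        continue;
      }
    }
    // Decode a single 24-bit group slowly, then return to the fast path
    // instead of staying in the slow decoder for the rest of the input.
    pos = decodeGroupSlow(src, pos, out);
  }
  return Buffer.from(out);
}

console.log(decodeWrappedBase64('aGVsbG8g\nd29ybGQ=').toString());
// prints "hello world"
```

With line-wrapped input, only the group that straddles each line break goes through `decodeGroupSlow`; every other group stays on the fast path.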

Checklist

`make -j4 test` (UNIX), or `vcbuild test` (Windows) passes
tests and/or benchmarks are included
commit message follows commit guidelines

Affected core subsystem(s)

buffer

aqrln · 2017-03-31T14:21:25Z

I haven't run the whole benchmark suite yet; I only symlinked the existing and the new benchmarks into a new directory locally:

                                             improvement confidence      p.value
 base64/buffer-base64-decode-wrapped.js n=32     11.04 %        *** 4.413651e-27
 base64/buffer-base64-decode.js n=32             -1.49 %            5.613341e-02

Fishrock123 · 2017-03-31T14:55:07Z

CI: https://ci.nodejs.org/job/node-test-pull-request/7140/

jasnell · 2017-03-31T15:45:19Z

benchmark/buffers/buffer-base64-decode-wrapped.js

+  // eslint-disable-next-line no-unescaped-regexp-dot
+  data.match(/./);  // Flatten the string
+  const buffer = Buffer.allocUnsafe(bytesCount);
+  buffer.write(data, 0, bytesCount, 'base64');


minor nit but this can be simplified just a bit by doing...

const line = 'abcd'.repeat(charsPerLine / 4) + '\n';
const buffer = Buffer.alloc(bytesCount, line);

Done. Thanks for the suggestion!
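
For readers unfamiliar with the fill form of `Buffer.alloc()`, the following standalone sketch shows what the suggested construction produces. The values of `charsPerLine` and `linesCount` are hypothetical, and the buffer size is chosen here as a whole number of lines, which may differ from how the benchmark computes `bytesCount`.

```javascript
// Hypothetical parameters, chosen only for this example.
const charsPerLine = 76;
const linesCount = 4;

// One line of base64 text followed by a line break, as in the suggestion.
const line = 'abcd'.repeat(charsPerLine / 4) + '\n';

// Buffer.alloc() repeats a string fill until the buffer is full, so a size
// that is a multiple of line.length yields whole lines only.
const wrappedBytes = Buffer.alloc(line.length * linesCount, line);
const wrapped = wrappedBytes.toString();

// Node's base64 decoder skips the line breaks.
const decoded = Buffer.from(wrapped, 'base64');

console.log(wrapped === line.repeat(linesCount));  // prints true
console.log(decoded.length);  // prints 228 (57 bytes per line, 4 lines)
```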

aqrln · 2017-03-31T20:29:34Z

Rebased to incorporate changes from #11995.

addaleax

Thank you!

aqrln · 2017-04-04T12:44:37Z

Can I get a fresh CI run?

trevnorris · 2017-04-04T13:30:49Z

CI: https://ci.nodejs.org/job/node-test-pull-request/7192/

trevnorris

not sure I'm qualified to sign off on this but LGTM

aqrln · 2017-04-04T14:04:48Z

/cc @bnoordhuis (based on Git history)

jasnell

This LGTM but it would be really nice to have some cctests for this header file

jasnell · 2017-04-04T16:42:15Z

(to be clear, the cctest can be added separately :-) ...)

The fast base64 decoder used to switch to the slow one permanently when it saw a whitespace or other garbage character. Since the most common situation such characters may be encountered in is line-wrapped base64 data, a more profitable strategy is to decode a single 24-bit group with the slow decoder and then continue running the fast algorithm.

PR-URL: #12146
Ref: #12114
Reviewed-By: Anna Henningsen
Reviewed-By: Trevor Norris
Reviewed-By: James M Snell

jasnell · 2017-04-04T16:45:48Z

Landed in e77a83f

This commit adds C++ tests for `base64_encode()` and `base64_decode()` functions defined in `base64.h`. The functionality is already being tested indirectly in JavaScript tests for Buffer, but it won't hurt to test the low-level functions too, especially given that they aren't only used in the internal Buffer implementation: Chrome inspector protocol support relies upon them too.

PR-URL: #12238
Refs: #12146 (comment)
Reviewed-By: James M Snell
Reviewed-By: Richard Lau
Reviewed-By: Daniel Bevenius

MylesBorins · 2017-04-18T23:34:44Z

Should we backport?

If so, I assume we should wait a bit.

addaleax · 2017-06-13T13:22:02Z

> should we backport?

It seems to be the root cause of #13657, so no.

nodejs-github-bot added the buffer and c++ labels Mar 31, 2017

This was referenced Mar 31, 2017

src: use const parameters in base64_decode_slow() #12144

Closed

base64 decoder could be 2x faster when decoding wrapped base64 #12114

Closed

jasnell requested review from trevnorris and addaleax March 31, 2017 15:39

jasnell reviewed Mar 31, 2017

aqrln force-pushed the base64-optimization branch from 1fb578c to ff77c72 Compare March 31, 2017 20:27

addaleax approved these changes Mar 31, 2017

trevnorris approved these changes Apr 4, 2017

jasnell approved these changes Apr 4, 2017

jasnell closed this Apr 4, 2017

aqrln deleted the base64-optimization branch April 4, 2017 16:46

jasnell mentioned this pull request Apr 4, 2017

8.0.0 Release Proposal #12220

Closed

aqrln mentioned this pull request Apr 5, 2017

test: add basic cctest for base64.h #12238

Closed

italoacasas mentioned this pull request Apr 10, 2017

v7.9.0 Release Proposal #12319

Merged

MylesBorins added the lts-watch-v6.x label Apr 18, 2017

MylesBorins added the baking-for-lts label May 15, 2017

addaleax added the dont-land-on-v4.x label and removed the baking-for-lts and lts-watch-v6.x labels Jun 13, 2017

aqrln mentioned this pull request Jul 21, 2017

src: fix decoding base64 with whitespace #13660

Merged
