Remove bstr and reimplement bits that are used in decode #138

Merged
merged 1 commit into from
Mar 28, 2022

lopopolo
Member

The only remaining use of `bstr` in `boba` was a single call to
`ByteSlice::find_not_byteset` in `boba::decode::inner`. Looking at the
`bstr` source, for an alphabet the size of bubblebabble's encoding, it
computes a `[u8; 256]` lookup table on each call and checks for membership
with `O(1)` array access.

Because we know the alphabet ahead of time, we can pre-compute the table
and implement the same search with `Iterator::find`.

This results in a modest speedup of roughly 3% in decode performance on
non-empty inputs, mostly because the alphabet lookup table is a
pre-computed const.
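
As a rough illustration of the approach described above (not the code merged in this PR), a const lookup table combined with `Iterator::find` might look like the sketch below. The names `ALPHABET`, `build_table`, `IN_ALPHABET`, and `find_non_alphabet_byte` are hypothetical, the table uses `bool` entries rather than `u8` for readability, and the alphabet bytes are an assumption based on the Bubble Babble vowels, consonants, and `-` separator.

```rust
/// Assumed Bubble Babble alphabet: vowels, consonants, and the `-` separator.
/// (Hypothetical constant; the real crate may spell this differently.)
const ALPHABET: &[u8] = b"aeiouybcdfghklmnprstvzx-";

/// Build a 256-entry membership table in a const context.
/// `while` is used because `for` loops are not allowed in `const fn`.
const fn build_table(alphabet: &[u8]) -> [bool; 256] {
    let mut table = [false; 256];
    let mut i = 0;
    while i < alphabet.len() {
        table[alphabet[i] as usize] = true;
        i += 1;
    }
    table
}

/// Evaluated at compile time, so no table is built while decoding.
const IN_ALPHABET: [bool; 256] = build_table(ALPHABET);

/// Return the index of the first byte that is not in the alphabet, if any.
/// Each membership check is a single array access.
fn find_non_alphabet_byte(input: &[u8]) -> Option<usize> {
    input
        .iter()
        .enumerate()
        .find(|&(_, &byte)| !IN_ALPHABET[usize::from(byte)])
        .map(|(idx, _)| idx)
}
```

With this sketch, calling `find_non_alphabet_byte` on an input made up only of alphabet bytes returns `None`, and any other input returns the offset of the first offending byte, which a decoder could report in its error.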

boba now has no dependencies outside of alloc and std.

Benches

boba::encode/empty      time:   [46.845 ns 47.052 ns 47.277 ns]
                        change: [-0.0306% +1.5149% +3.1697%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
boba::encode/1234567890 time:   [90.846 ns 91.399 ns 91.988 ns]
                        change: [+0.2467% +2.0851% +4.0831%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
boba::encode/Pineapple  time:   [87.152 ns 87.562 ns 87.976 ns]
                        change: [+0.2536% +2.1208% +4.0694%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe
boba::encode/emoji      time:   [141.63 ns 142.15 ns 142.74 ns]
                        change: [-1.3811% +0.4342% +2.2508%] (p = 0.65 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe

boba::decode/empty      time:   [2.6051 ns 2.6243 ns 2.6540 ns]
                        change: [-2.0287% -0.3042% +1.3373%] (p = 0.75 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe
boba::decode/1234567890 time:   [104.86 ns 105.34 ns 105.86 ns]
                        change: [-3.7604% -2.5821% -1.2118%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) high mild
  9 (9.00%) high severe
boba::decode/Pineapple  time:   [100.09 ns 100.78 ns 101.54 ns]
                        change: [-4.5410% -2.9169% -1.1301%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe
boba::decode/emoji      time:   [161.38 ns 161.98 ns 162.68 ns]
                        change: [-4.5053% -3.2074% -1.6301%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

@lopopolo lopopolo added A-deps Area: Source and library dependencies. A-performance Area: Performance improvements and optimizations. A-decode Area: Core decoder implementation. labels Mar 28, 2022
@lopopolo lopopolo merged commit 75b0b6b into trunk Mar 28, 2022
@lopopolo lopopolo deleted the lopopolo/remove-bstr branch March 28, 2022 05:32
Labels
A-decode Area: Core decoder implementation. A-deps Area: Source and library dependencies. A-performance Area: Performance improvements and optimizations.