feat: make Compressor::train 2x faster with bitmap index #16

a10y · 2024-08-20T20:49:30Z

The slowest part of Compressor::train is the double-nested loops over codes.

Now, when `compress_count` records code pairs, it also populates a bitmap index, where `pairs_index[code1].set(code2)` indicates that code2 followed code1 in the compressed output.

In the `optimize` loop, we can eliminate tight loop iterations by accessing `pairs_index[code1].second_codes()`, which yields only the code2 values that were recorded after code1.
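To illustrate the idea, here is a minimal sketch of such a bitmap index. Only the names `set` and `second_codes` come from the description above; `NUM_CODES`, `CodesBitmap`, `SecondCodes`, and `observed_pairs` are hypothetical names chosen for this example, and the layout is an assumption rather than the code in this PR.

```rust
/// Assumed number of distinct codes (hypothetical; chosen as a multiple of 64).
const NUM_CODES: usize = 512;
const WORDS: usize = NUM_CODES / 64;

/// One bitmap per first code: bit `code2` is set if `code2` followed it.
#[derive(Clone, Copy)]
struct CodesBitmap {
    words: [u64; WORDS],
}

impl CodesBitmap {
    fn new() -> Self {
        Self { words: [0; WORDS] }
    }

    /// Record that `code2` was seen after this bitmap's first code.
    fn set(&mut self, code2: usize) {
        self.words[code2 / 64] |= 1u64 << (code2 % 64);
    }

    /// Iterate over the recorded second codes in ascending order.
    fn second_codes(&self) -> SecondCodes<'_> {
        SecondCodes { bitmap: self, word_idx: 0, block: self.words[0] }
    }
}

/// Iterator that visits set bits only, one 64-bit block at a time.
struct SecondCodes<'a> {
    bitmap: &'a CodesBitmap,
    word_idx: usize,
    block: u64,
}

impl<'a> Iterator for SecondCodes<'a> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        // Skip exhausted blocks; stop once the last block is drained.
        while self.block == 0 {
            if self.word_idx + 1 >= WORDS {
                return None;
            }
            self.word_idx += 1;
            self.block = self.bitmap.words[self.word_idx];
        }
        let bit = self.block.trailing_zeros() as usize;
        self.block &= self.block - 1; // clear the lowest set bit
        Some(self.word_idx * 64 + bit)
    }
}

/// Visit only the (code1, code2) pairs that were actually recorded,
/// instead of looping over every possible pair.
fn observed_pairs(index: &[CodesBitmap]) -> Vec<(usize, usize)> {
    index
        .iter()
        .enumerate()
        .flat_map(|(code1, bitmap)| bitmap.second_codes().map(move |code2| (code1, code2)))
        .collect()
}
```

With a sparse set of recorded pairs, the inner loop does work proportional to the number of set bits plus the number of 64-bit blocks, rather than to the full range of code2 values.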

This results in a speedup from ~1 ms to ~500 µs for the training benchmark. We're sub-millisecond!

This also makes Miri somewhat palatable to run for everything but test_large, so I've re-enabled it for CI. It currently runs in 2.5 minutes, a far cry from the < 30s build+test step, but I guess it's for a good cause.


a10y · 2024-08-20T20:50:56Z

src/builder.rs

-    pub fn reset(&mut self) {
-        for idx in 0..COUNTS1_SIZE {
-            self.counts1[idx] = 0;
-        }
-        for idx in 0..COUNTS2_SIZE {
-            self.counts2[idx] = 0;


this was slower than just building a new Counter b/c of the vec![0] change made in the previous PR
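A rough sketch of the alternative described in this comment: allocate a fresh counter each round instead of zeroing the old one in place. The field types, the table sizes, and the `fresh_counter_per_round` helper are assumptions made for illustration; only `Counter`, `counts1`, `counts2`, and the two constant names appear in the diff.

```rust
/// Hypothetical table sizes; the real constants are defined in src/builder.rs.
const COUNTS1_SIZE: usize = 512;
const COUNTS2_SIZE: usize = 512 * 512;

/// Assumed shape of the counter: two heap-allocated count tables.
struct Counter {
    counts1: Vec<usize>,
    counts2: Vec<usize>,
}

impl Counter {
    /// Build a zeroed counter. `vec![0; n]` can usually obtain
    /// already-zeroed memory from the allocator, which is the likely
    /// reason this beats an element-by-element reset loop.
    fn new() -> Self {
        Self {
            counts1: vec![0; COUNTS1_SIZE],
            counts2: vec![0; COUNTS2_SIZE],
        }
    }
}

/// Hypothetical training loop: each round starts from a fresh counter
/// rather than calling a `reset` method on a reused one.
fn fresh_counter_per_round(rounds: usize) -> usize {
    let mut total = 0;
    for round in 0..rounds {
        let mut counter = Counter::new();
        counter.counts1[round % COUNTS1_SIZE] += 1;
        counter.counts2[round % COUNTS2_SIZE] += 1;
        total += counter.counts1.iter().sum::<usize>();
    }
    total
}
```

Whether this wins in practice depends on the allocator and the table sizes, so it is worth confirming with the training benchmark.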


## 🤖 New release

* `fsst-rs`: 0.2.0 -> 0.2.1

<details><summary>Changelog</summary>
<blockquote>

## [0.2.1](v0.2.0...v0.2.1) - 2024-08-20

### Added
- make Compressor::train 2x faster with bitmap index ([#16](#16))

</blockquote>
</details>

---
This PR was generated with [release-plz](https://github.com/MarcoIeni/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

AdamGS · 2024-08-21T09:12:31Z

src/builder.rs

+        if self.block == 0 {
+            return None;
+        }


Shouldn't it be possible to skip this check?

Good catch! #18

a10y added 2 commits August 20, 2024 16:44

add miri action

720d506

a10y commented Aug 20, 2024


a10y added 4 commits August 20, 2024 16:54

final cleanups

fecde16

only run miri on develop

17ac1be

i don't want to lose my 30s CI checks

turn miri back on for CI

aea4ae3

fix small bug in iterator, more tests

f74d185

a10y force-pushed the aduffy/train-speedup branch from 64beee7 to f74d185 Compare August 20, 2024 21:58

a10y merged commit d7e836c into develop Aug 20, 2024
3 checks passed

a10y deleted the aduffy/train-speedup branch August 20, 2024 22:04

github-actions bot mentioned this pull request Aug 20, 2024

chore: release v0.2.1 #17

Merged

a10y mentioned this pull request Aug 20, 2024

Miri is incredibly slow #14

Closed

AdamGS reviewed Aug 21, 2024
