
Node from stream backrefs optimisation #532

Merged
110 commits merged into main from node_from_stream_backrefs_optimisation
Feb 11, 2025

Conversation

matt-o-how
Contributor

Use a `Vec<NodePtr>` stack instead of `NodePtr` / `SExp`s in `node_from_stream_backrefs`, and add a new `traverse_path_with_vec()` function to handle backrefs.

@matt-o-how matt-o-how requested a review from arvidn January 13, 2025 16:18

coveralls-official bot commented Jan 13, 2025

Pull Request Test Coverage Report for Build 13266025427

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 238 of 238 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 94.141%

Totals Coverage Status
Change from base Build 12933937517: 0.3%
Covered Lines: 6298
Relevant Lines: 6690

💛 - Coveralls

Contributor

@arvidn arvidn left a comment


It looks correct, as far as I can tell. I think we need tests for all interesting cases to make sure it works. I'm also interested in seeing a benchmark: does this make a difference? I would expect it to at least use less memory, which typically means faster on small machines (like a Raspberry Pi).

Contributor

@arvidn arvidn left a comment


I think these things are still needed:

  • preserve the existing function, partly to control when we switch over to the new one, and also to be able to test that both behave the same
  • ensure the new function produces the same result as the old one, e.g. with a fuzzer
  • ensure the new function behaves the same with regard to the limits on the number of pairs created by Allocator. This can be tested in a fuzzer by building with the counters build feature
  • benchmark to demonstrate that this is an improvement (this should probably be done early, as we might want to scrap this idea if it doesn't carry its weight)
  • survey the mainnet and testnet blockchains to see if back references into the parse stack ever exist in the wild
  • unit tests for all edge cases

@matt-o-how matt-o-how force-pushed the node_from_stream_backrefs_optimisation branch from 166b35f to cb47c16 Compare January 17, 2025 10:09
@matt-o-how matt-o-how force-pushed the node_from_stream_backrefs_optimisation branch from cb47c16 to 17f7c09 Compare January 27, 2025 16:53
Contributor

arvidn commented Feb 11, 2025

These are the benchmark results I get on raspberry Pi 5:

deserialize/node_from_bytes_backrefs-compressed
                        time:   [352.36 µs 358.58 µs 367.28 µs]
deserialize/node_from_bytes_backrefs_old-compressed
                        time:   [344.03 µs 350.53 µs 358.28 µs]

So, a slight performance degradation. I would have expected it to be a slight improvement on a system with small caches, but presumably the memory savings are worth it. Do we have a measurement on that?

@arvidn arvidn merged commit ee71e95 into main Feb 11, 2025
28 checks passed
@arvidn arvidn deleted the node_from_stream_backrefs_optimisation branch February 11, 2025 15:53