Added utility to dump ledger state from buckets to JSON #3447

dmkozh · 2022-06-07T18:54:15Z

Description

Added utility to dump ledger state from buckets to JSON for debugging (optionally using `jq` for post-processing).

The full ledger in this format currently takes ~19GB, which is too large for meaningful processing. Hence I'm initially introducing the following ways to reduce the output (all of which can be combined):

- Filter expressions for conditionally selecting entries
- Limit to only recently updated entries
- Limit the output entry count

For the sake of filtering I introduced an XDR field extraction utility that can also be used for other dynamic operations on XDR, e.g. group-by and/or reduction. However, for now I'm keeping this minimalistic until concrete use-cases appear that aren't feasible with the current approach.
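To illustrate the idea of field extraction driving a filter, here is a minimal Python sketch. It is not the utility from this PR (which works on XDR inside stellar-core); it assumes entries have already been converted to nested dictionaries. The function names, the dotted-path syntax, and the entry shape are all hypothetical and chosen only for illustration.

```python
from typing import Any, Optional


def extract_field(entry: dict, path: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts.

    Returns None as soon as any step of the path is missing, so a filter
    can be applied uniformly to entries of different types.
    """
    current: Any = entry
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def is_large_account(entry: dict, min_balance: int = 1000) -> bool:
    """Example filter: keep account entries above a balance threshold."""
    if extract_field(entry, "data.type") != "ACCOUNT":
        return False
    balance = extract_field(entry, "data.account.balance")
    return balance is not None and balance > min_balance


# Hypothetical entry shape; the real JSON layout may differ.
sample_entry = {
    "lastModifiedLedgerSeq": 100,
    "data": {"type": "ACCOUNT", "account": {"balance": 5000}},
}
```

The same extraction step could feed a group-by or a reduction instead of a boolean filter, which is the kind of extension mentioned above.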

The current approach is a simple iteration over buckets from newer to older. This probably could be optimized (e.g. with parallel execution or XDR of intermediate results), but for now the performance seems acceptable: ~80s on my laptop for filtering the full ledger with a reasonably small output set.
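The newer-to-older iteration can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the code in this PR: each bucket is modeled as a list of records with hypothetical `key`, `dead`, and `entry` fields, and the function name and parameters are made up. The point is that the first version of a key seen while walking from newest to oldest is the current one, so older versions (and anything shadowed by a deletion) are skipped.

```python
from typing import Callable, Iterable, Iterator, Optional


def dump_live_entries(
    buckets_newest_first: Iterable[Iterable[dict]],
    matches: Callable[[dict], bool] = lambda entry: True,
    min_ledger_seq: Optional[int] = None,
    limit: Optional[int] = None,
) -> Iterator[dict]:
    """Yield the newest live version of each entry, filtered and limited."""
    seen_keys = set()
    emitted = 0
    for bucket in buckets_newest_first:
        for record in bucket:
            key = record["key"]
            if key in seen_keys:
                continue  # a newer bucket already decided this key
            seen_keys.add(key)
            if record.get("dead"):
                continue  # deleted: older versions must stay hidden
            entry = record["entry"]
            # "Recently updated" is modeled as a minimum ledger sequence.
            if min_ledger_seq is not None and entry["lastModifiedLedgerSeq"] < min_ledger_seq:
                continue
            if not matches(entry):
                continue
            if limit is not None and emitted >= limit:
                return  # output entry count reached
            yield entry
            emitted += 1
```

Note that the set of seen keys grows with the number of distinct entries, which is one reason a single pass like this may be memory-hungry on a full ledger.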

Checklist

- Reviewed the contributing document
- Rebased on top of master (no merge commits)
- Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
- Compiles
- Ran all tests
- If change impacts performance, include supporting evidence per the performance document


MonsieurNicolas

this is great! Thank you

MonsieurNicolas · 2022-06-20T00:59:39Z

r+ b7b8638

MonsieurNicolas approved these changes Jun 20, 2022


latobarita merged commit ed1e6a3 into stellar:master Jun 20, 2022

graydon mentioned this pull request Jun 21, 2022

Fix failing non-test build due to test in tests/ subdir #3454

Merged

MonsieurNicolas mentioned this pull request Jun 24, 2022

Proposal: add basic subcommands to help query ledger data #3419

Closed
