-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle merge of already serialized value-keys #95
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #95 +/- ##
==========================================
- Coverage 63.57% 63.47% -0.11%
==========================================
Files 15 15
Lines 2740 2779 +39
==========================================
+ Hits 1742 1764 +22
- Misses 742 753 +11
- Partials 256 262 +6
|
Pebble documentation suggests that `Merger` is used by `.Merge` calls on a given key value. However, Pebble uses merger on iteration over values too. Internally, `Get` operations use iterators for example. This means values that are already serialized may be passed to the merger implementation during in-memory traversal. The previous implementation of value-keys merger did not take this into account and as a result it meant that `Get` failed to find value-keys associated to a multihash since they ended up being double wrapped by the merger value-keys marshaller. The changes here fixes this issue by checking explicitly that a give value-key has the expected key prefix before proceeding with the remaining merge logic. As a precaution, the merger is also modified to fall back on the default Pebble merger when the key is not of kind multihash-key. Further, the benchmarks are modified to explicitly assert that the inserted values are indeed found for each entry. The `Get` implementation in Pebble store is also modified to copy the content returned by the backing store to avoid slice invalidation and runtime invalid reference errors captured in #80. Unrelated to the lookup issue, the changes here also fix typos in benchmarks where workload does not match the benchmark name. Further, the common tests are updated to do a flush as suggested in #93. Fixes #94, #80.
2083f1b
to
d5b60db
Compare
For reference here is a local run of $ go test ./bench -bench='(Pebble|Storethehash)(Put|Get)_W(0|1)' -count=3 -timeout=300m
cpu: Intel(R) Core(TM) i7 CPU @ 2.30GHz
PASS
ok github.com/filecoin-project/go-indexer-core/bench 1416.652s
name time/op
Store_PebblePut_W0-8 1.78s ±12%
Store_PebbleGet_W0-8 9.03s ± 1%
Store_PebblePut_W1-8 21.4s ± 6%
Store_PebbleGet_W1-8 135s ± 3%
Store_StorethehashPut_W0-8 3.38s ± 4%
Store_StorethehashGet_W0-8 9.82s ± 4%
Store_StorethehashPut_W1-8 55.9s ± 1%
Store_StorethehashGet_W1-8 141s ± 9%
name speed
Store_PebblePut_W0-8 19.3MB/s ±11%
Store_PebbleGet_W0-8 3.77MB/s ± 1%
Store_PebblePut_W1-8 15.9MB/s ± 6%
Store_PebbleGet_W1-8 2.53MB/s ± 3%
Store_StorethehashPut_W0-8 10.1MB/s ± 4%
Store_StorethehashGet_W0-8 3.47MB/s ± 4%
Store_StorethehashPut_W1-8 6.10MB/s ± 1%
Store_StorethehashGet_W1-8 2.43MB/s ± 9%
name size
Store_PebblePut_W0-8 48.4M ± 0%
Store_PebbleGet_W0-8 45.5M ± 0%
Store_PebblePut_W1-8 501M ± 0%
Store_PebbleGet_W1-8 487M ± 0%
Store_StorethehashPut_W0-8 87.4M ± 0%
Store_StorethehashGet_W0-8 87.4M ± 0%
Store_StorethehashPut_W1-8 912M ± 0%
Store_StorethehashGet_W1-8 912M ± 0%
name alloc/op
Store_PebblePut_W0-8 155MB ± 1%
Store_PebbleGet_W0-8 19.6GB ± 0%
Store_PebblePut_W1-8 4.51GB ± 2%
Store_PebbleGet_W1-8 198GB ± 0%
Store_StorethehashPut_W0-8 967MB ± 0%
Store_StorethehashGet_W0-8 1.61GB ± 5%
Store_StorethehashPut_W1-8 11.1GB ± 1%
Store_StorethehashGet_W1-8 17.8GB ± 0%
name allocs/op
Store_PebblePut_W0-8 2.55M ± 2%
Store_PebbleGet_W0-8 29.7M ± 0%
Store_PebblePut_W1-8 97.9M ± 2%
Store_PebbleGet_W1-8 351M ± 1%
Store_StorethehashPut_W0-8 11.9M ± 1%
Store_StorethehashGet_W0-8 28.1M ± 5%
Store_StorethehashPut_W1-8 142M ± 0%
Store_StorethehashGet_W1-8 309M ± 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment, otherwise LGTM.
I have made the changes suggested in the comment here for your review:
521d8bc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Make a copy of given byte slices to assure they remain valid and are unaffected by slice invalidation.
521d8bc
to
2e1c674
Compare
The previous behaviour of value-key merger in certain scenarios may have been resulted in value-keys being marshalled repeatedly. This was the underlying cause of the issue captured in #94 and corrected in #95. The work here adds to the corrections by gracefully fixing such records by unmarshalling them recursively until we reach values with the expected key prefix, and de-duplicating them as necessary. The fix is done opportunistically, in that whenever such values are detected during reads and writes they are fixed. Additional tests are added to simulate nested marshalling of value-keys and assert that the resulting merge is as expected for both merge-older and merge-newer cases.
The previous behaviour of value-key merger in certain scenarios may have been resulted in value-keys being marshalled repeatedly. This was the underlying cause of the issue captured in #94 and corrected in #95. The work here adds to the corrections by gracefully fixing such records by unmarshalling them recursively until we reach values with the expected key prefix, and de-duplicating them as necessary. The fix is done opportunistically, in that whenever such values are detected during reads and writes they are fixed. Additional tests are added to simulate nested marshalling of value-keys and assert that the resulting merge is as expected for both merge-older and merge-newer cases.
Pebble documentation suggests that
Merger
is used by.Merge
calls on a given key value. However, Pebble uses merger on iteration over values too. Internally,Get
operations use iterators for example. This means values that are already serialized may be passed to the merger implementation during in-memory traversal.The previous implementation of value-keys merger did not take this into account and as a result it meant that
Get
failed to find value-keys associated to a multihash since they ended up being double wrapped by the merger value-keys marshaller.The changes here fixes this issue by checking explicitly that a give value-key has the expected key prefix before proceeding with the remaining merge logic. As a precaution, the merger is also modified to fall back on the default Pebble merger when the key is not of kind multihash-key. Further, the benchmarks are modified to explicitly assert that the inserted values are indeed found for each entry.
The
Get
implementation in Pebble store is also modified to copy the content returned by the backing store to avoid slice invalidation and runtime invalid reference errors captured in #80.Unrelated to the lookup issue, the changes here also fix typos in benchmarks where workload does not match the benchmark name. Further, the common tests are updated to do a flush as suggested in #93.
Fixes #94, #80.