Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Signature duplicate check with checkpointing is slow #2497

Closed
aeyakovenko opened this issue Jan 20, 2019 · 7 comments
Closed

Signature duplicate check with checkpointing is slow #2497

aeyakovenko opened this issue Jan 20, 2019 · 7 comments

Comments

@aeyakovenko
Copy link
Member

aeyakovenko commented Jan 20, 2019

Problem

Even with bloom filters, #2289, the signature filtering needs to traverse a bunch of checkpoints per signature to check if its present or not. It looks like a single bloom filter check is going to be on the order of 1us, so 32 checks is going to be to slow for our target tps (although they can be run in parallel).

Proposed Solution

type ForkId = u64;
//probably a small amount of entries in this map, so a vec might be better
type ForkStatusMap = HashMap<ForkId, Result<()>>;
type SignatureMap = HashMap<Signature, ForkStatusMap>;
//the LastId is also from a specific fork.  But any descendants could have signatures for it.
type StatusCache = HashMap<LastId, (ForkId, SignatureMap)>;
  • Use a single LastId delimited map that keeps track of which fork the signature had been executed in. This data structure doesn't checkpoint, or merge, and is a single data structure for all the forks in the system.

  • When checking for dup signatures in the current fork, find the results in the above data structure for all the forks a signature has been processed. Check if the forks are an ancestor of the current fork.

  • Garbage collect as LastIds expire, since they expire globally for all the forks.

  • At a specific tick count, multiple LastIds would expire, because the values are different in each fork.

  • Some interesting optimizations,

//bitfield if the signature is present
//offset in the field is number of forks above the LastIds ForkId
type ForkStatusMap = u32;

and using 80 bits of the Signature instead of the full 256.
tag: @garious @rob-solana @sakridge

@aeyakovenko aeyakovenko changed the title Signature filtering with checkpointing is likely to slow Signature filtering with checkpointing is slow Jan 20, 2019
@aeyakovenko aeyakovenko changed the title Signature filtering with checkpointing is slow Signature duplicate check with checkpointing is slow Jan 20, 2019
@rob-solana
Copy link
Contributor

so lastids will expire with forks instead of with ticks?

@aeyakovenko
Copy link
Member Author

aeyakovenko commented Jan 21, 2019

They still expire with ticks. Ticks overlap fork lifetimes because the counter moves forward in every fork.

So at a specific tick count, multiple LastIds would expire, because the values are different in each fork.

@aeyakovenko
Copy link
Member Author

@rob-solana
Copy link
Contributor

rob-solana commented Mar 27, 2019

I'm having trouble with how keeping track of which fork a sig was observed in helps. Is that a memory optimization?

Once a status_cache has been parented, it becomes read-only and can be shared across all forks that include the parent bank.

Garbage collection is part of squash() currently.

@aeyakovenko
Copy link
Member Author

@rob-solana, it helps with lookups. Since there is just hashmap lookup, vs a bunch

@rob-solana
Copy link
Contributor

rob-solana commented Mar 27, 2019

are SignatureMaps shared across Forks? Arc<RwLock> ?

if so, how do they come to be shared?

if not, how can a ForkStatusMap have more than one entry?

@aeyakovenko
Copy link
Member Author

@rob-solana it is just a global data structure for all the banks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants