Goal
We proposed a solution to avoid the frequent "out of disk space" errors in #4753. This issue specifies it further.
The goal is to enable users to specify chain and state retention policies in their $LOTUS_HOME/config.toml file.
It could look like this:
```toml
[Blockstore]
RetainChain = 10000  # prune chain objects that went out of scope 10000 tipsets ago
RetainState = 100    # prune state objects that went out of scope 100 tipsets ago
```
Design
Expunge process and QoS
We introduce an async, background expunge process that operates entirely on the chain and state cold stores.
It takes care of applying these policies in a best-effort fashion.
It is best-effort because it must be throttled: expunging is generally not critical, so it should not steal CPU or IO bandwidth from critical processes.
The throttling rate could vary depending on disk usage (the less disk space we have available, the more compute / IO we allocate to this process, to avoid collapse).
Requirements
To make this process efficient, we need to record metadata in the cold store, alongside the block. This can be done by native means (e.g. Badger has a Metadata field which is stored next to the key in the indices), or by wrapping the block in a metadata container.
The expunge process would then iterate over the cold store and apply the retention policy by deleting objects that exceed the threshold.
The first run would be a special one.
If this is released alongside the splitstore, none of the objects would carry metadata (since the splitstore turns the current store into the cold store).
But as objects archived by the splitstore start making it into the cold store, we would start populating that metadata.
If the splitstore "pulls" objects into the hot store, we can assume that unmarked objects went out of scope before the lowest epoch recorded in metadata; there is no risk of unmarked active objects.
We can achieve this by making the splitstore stop the world on first initialisation and run a single state-tree walk to copy the objects that are actually hot into the hot store.
If this process is automatic, it would be incredible not just for space considerations, but also memory usage. I do fear the additional metadata storage might create more I/O and/or memory usage, which would make the memory issue worse.
I also wonder about the QoS notion -- I could see a situation where the daemon is overwhelmed by a large blockchain, and because it's trying to handle the stress, it can't run the pruning process that would alleviate it.
For expunging the coldstore, I’m planning to use index metadata to record the “last reachable” epoch for each block in the state tree.
This is available on badger (WithMeta()) and I think we can make this feature available on gonudb (cc @iand).
This would allow the expunge process to iterate over all keys in the cold store (rate limited, so as not to affect performance) and delete the keys that have fallen outside the retention policy.
Of course, a challenge is performing the actual physical deletion:
- In Badger we need to call Flatten(), so that tombstoned keys are actually removed.
- nudb doesn't support deletions AFAIK, and I'm not sure how we could implement them in gonudb. Alternatively, we could rotate the coldstore every time, but that would increase the disk space requirements.