v7.1.0
Description
This release focuses on stability improvements. It also changes the epoch event. A summary follows:
Block.Header.AppHash Crash Mitigation
Several data races have been fixed to mitigate the impact of the app hash bug. Some of them were related to concurrently committing to and querying a node, which explains why relayers were affected the most. Due to the non-deterministic nature of this bug, there is no guarantee that the issue has been resolved completely. However, no app hash bug has been observed with the new changes so far. We would appreciate feedback on this from anyone running nodes in production. Additionally, more logs were added to help with debugging if the app hash bug does happen again.
Cache Reconfiguration to Reduce RAM Usage
An additional fast storage cache was introduced in v6.3.0. Its size was determined by iavl-cache-size in app.toml. However, the default value was too large for the new cache. As a result, the cache kept growing as nodes stayed online, eventually exhausting memory. The fast node cache size is now separated from the iavl-cache-size config and lowered in the v7.1.0 release. Currently, it is not configurable, but that should become possible in a subsequent release.
Pruning and Snapshot Settings
The fast storage change introduced overhead related to taking snapshots with rigorous pruning settings. It had become possible to attempt to prune a height that was being snapshotted. As a result, a node would fail with an error. As a temporary mitigation, we required node operators to set an unnecessarily large pruning-keep-recent. With this change, that requirement no longer applies. Snapshots may work with pruning-everything or rigorous custom pruning.
Changes to pruning config in app.toml:
- pruning-keep-every was removed and no longer exists (this change is made in a future SDK release as well).
- The most rigorous pruning setting is pruning = "everything". It is equivalent to pruning = "custom" with pruning-keep-recent = "10" and pruning-interval = "10".
- If settings more rigorous than those above are selected, the node will fail to start. In that case, an error is printed that guides the operator to an appropriate configuration.
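
The pruning limits described above can be sketched as a small startup check. This is an illustration only, not the node's actual validation code: the PruningConfig type, the validatePruning function, and the error wording are hypothetical. It also assumes that "more rigorous" means a pruning-keep-recent or pruning-interval value below 10.

```go
package main

import "fmt"

// Assumed floors, taken from the statement that pruning = "everything"
// is equivalent to keep-recent 10 and interval 10.
const (
	minKeepRecent = 10
	minInterval   = 10
)

// PruningConfig is a hypothetical mirror of the pruning keys in app.toml.
type PruningConfig struct {
	Strategy   string // value of "pruning", e.g. "everything" or "custom"
	KeepRecent uint64 // value of "pruning-keep-recent"
	Interval   uint64 // value of "pruning-interval"
}

// validatePruning returns an error when custom settings are more rigorous
// than the "everything" strategy. Non-custom strategies are accepted as-is.
func validatePruning(cfg PruningConfig) error {
	if cfg.Strategy != "custom" {
		return nil
	}
	if cfg.KeepRecent < minKeepRecent {
		return fmt.Errorf("pruning-keep-recent must be at least %d, got %d", minKeepRecent, cfg.KeepRecent)
	}
	if cfg.Interval < minInterval {
		return fmt.Errorf("pruning-interval must be at least %d, got %d", minInterval, cfg.Interval)
	}
	return nil
}
```

Under these assumptions, a config of pruning = "custom" with pruning-keep-recent = "10" and pruning-interval = "10" passes the check, while lowering either value to 5 returns an error.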
Epoch Event Change
The AfterEpochEnd event was being called with an epoch number that was off by one, so anyone parsing epoch data from that event will need to correct for the off-by-one error. See #830 for more details.