When an indexer is created, two processes are triggered:
real-time - starts from the block at which the indexer was registered (call it block X) and executes the indexer function on every matching block from there on
historical - starts from the configured start_from_block height and executes the indexer function for every matching block up until X (see the sketch below)
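For illustration, here is a minimal sketch of how the two processes might be kicked off concurrently when an indexer is created. The type and function names are assumptions for the sketch, not the actual QueryAPI code:

```typescript
// Illustrative sketch only -- names here are assumptions, not the real QueryAPI code.
type IndexerConfig = {
  accountId: string;      // contract ID the indexer filters on
  startFromBlock: number; // configured start_from_block height
};

// Hypothetical entry points for the two processes.
declare function runRealTime(config: IndexerConfig, fromBlock: number): Promise<void>;
declare function runHistorical(config: IndexerConfig, fromBlock: number, toBlock: number): Promise<void>;

async function onIndexerRegistered(config: IndexerConfig, registeredAtBlock: number): Promise<void> {
  await Promise.all([
    // Real-time: every matching block from the registration height (X) onwards.
    runRealTime(config, registeredAtBlock),
    // Historical: every matching block from start_from_block up to X.
    runHistorical(config, config.startFromBlock, registeredAtBlock),
  ]);
}
```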
For historical, this can be broken down into two parts: indexed and unindexed blocks. These may not be the most suitable names, but they are what the code calls them.
Indexed blocks come from the near-delta-lake bucket in S3. This bucket is populated by a Databricks job which streams blocks from NEAR Lake and, for every account, stores the block heights containing transactions made against it. This data allows us to quickly fetch the list of block heights matching the contract ID defined on the indexer, rather than filtering through every block ourselves.
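As a rough illustration of that lookup, the sketch below reads a per-account list of block heights from the bucket. The object key layout, region, and JSON shape are assumptions, not the real Databricks output format:

```typescript
// Sketch of reading pre-indexed block heights from the near-delta-lake bucket.
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "eu-central-1" }); // region is an assumption

async function fetchIndexedBlockHeights(contractId: string): Promise<number[]> {
  const response = await s3.send(
    new GetObjectCommand({
      Bucket: "near-delta-lake",
      Key: `index/${contractId}.json`, // hypothetical key format
    }),
  );
  const body = await response.Body!.transformToString();
  // One height per block containing a transaction against the account.
  return JSON.parse(body) as number[];
}
```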
NEAR Delta Lake is not updated in real time, so to close the gap between the latest indexed height and the starting point of the real-time process, the historical process must also manually process the remaining blocks. This is the 'unindexed' portion of the backfill.
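A hedged sketch of what that manual scan could look like, with fetchBlock, blockMatches, and executeIndexer as hypothetical helpers:

```typescript
// Sketch of the 'unindexed' portion: manually scan the gap between the last
// height covered by near-delta-lake and the real-time process's start block.
// All three helpers below are hypothetical.
declare function fetchBlock(height: number): Promise<object | null>; // null if the height was skipped
declare function blockMatches(block: object, contractId: string): boolean;
declare function executeIndexer(blockHeight: number): Promise<void>;

async function processUnindexedGap(
  contractId: string,
  lastIndexedHeight: number,
  realTimeStartHeight: number,
): Promise<void> {
  for (let height = lastIndexedHeight + 1; height < realTimeStartHeight; height++) {
    const block = await fetchBlock(height); // e.g. read the block from NEAR Lake S3
    if (block !== null && blockMatches(block, contractId)) {
      await executeIndexer(height); // run the indexer function on the matching block
    }
  }
}
```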
This will change slightly with the introduction of the control plane work. Rather than having two separate real-time and historical processes running concurrently, we will have a single sequential process. It is essentially the 'historical' process, except that the 'unindexed' portion does not stop and instead continues indefinitely.
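A rough sketch of what that single sequential process might look like, reusing the same hypothetical helpers; this is not the actual control plane implementation:

```typescript
// Sketch: run the 'indexed' portion, then let the 'unindexed' scan continue
// indefinitely instead of stopping at X. All helper names are hypothetical.
declare function fetchIndexedBlockHeights(contractId: string): Promise<number[]>;
declare function fetchBlock(height: number): Promise<object | null>; // waits for the height; null if skipped
declare function blockMatches(block: object, contractId: string): boolean;
declare function executeIndexer(blockHeight: number): Promise<void>;

async function runSequentialBackfill(contractId: string, startFromBlock: number): Promise<void> {
  // 1. Indexed portion: heights already known from near-delta-lake.
  const heights = (await fetchIndexedBlockHeights(contractId)).filter((h) => h >= startFromBlock);
  for (const height of heights) {
    await executeIndexer(height);
  }

  // 2. Unindexed portion: continue block by block from the last known height
  //    and never stop -- this replaces the separate real-time process.
  let height = heights.length > 0 ? Math.max(...heights) + 1 : startFromBlock;
  while (true) {
    const block = await fetchBlock(height);
    if (block !== null && blockMatches(block, contractId)) {
      await executeIndexer(height);
    }
    height++;
  }
}
```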
related to #395