
Fix(store-sync): Handle unsynced RPC nodes #2901

Open · wants to merge 1 commit into base: main
Conversation

dhvanipa
Contributor

@dhvanipa dhvanipa commented Jun 13, 2024

Context: https://indexsupply.com/shovel/docs/#unsynchronized-ethereum-nodes

  • Third-party Ethereum API providers may load-balance RPC requests across a set of unsynchronized nodes. In rare cases, an eth_getLogs request is routed to a node that doesn't yet have the latest block.
  • This issue has been observed multiple times on Redstone.

Solution: Batch-fetch the latest block number together with the logs, and ensure that the latest block number is at or past the toBlock. Otherwise, wait for the RPC node to catch up to the latest block.

  • This fix relies on the client making a batch call to the RPC, which clients need to enable in their viem HTTP transport.
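For reference, batching is an option on viem's HTTP transport. A minimal sketch of a client with batching enabled (the RPC URL is a placeholder, and this is client configuration rather than part of the PR itself):

```typescript
import { createPublicClient, http } from "viem";

// Batching must be enabled so that eth_getLogs and the latest-block-number
// request land in the same JSON-RPC batch, and therefore hit the same node.
const publicClient = createPublicClient({
  transport: http("https://rpc.example.com", { batch: true }),
});
```

Without `batch: true`, the two requests go out as separate HTTP calls and a load balancer may route them to different nodes, defeating the check.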

For the sync stack, the fromBlock comes from a local variable and the toBlock comes from a subscription to the latest block number. So we need an additional filter on the latest-block-number subscription so it does not emit when the toBlock would fall behind the fromBlock, which happens with unsynchronized RPCs.

  • There is currently a silent failure when this happens: when toBlock is less than fromBlock, no error is thrown and no logs are fetched, because this while loop resolves to false. So in this PR, we add a filter to not emit incorrect toBlocks, plus a safety check that throws if this case occurs.
    • Because the batch call only seems to fix the HTTP transport, I think this is the real issue causing incorrect logs with the sync stack (since the default is to use WS). This is in line with what we're seeing in the logs of the Biomes indexer pod, where the logs don't report 0 logs fetched, but the storageAdapter is never called.
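The two guards described above can be sketched as plain functions. The names `shouldEmitBlockNumber` and `assertValidRange` are hypothetical; the PR implements this logic inline in the RxJS pipeline and in fetchLogs:

```typescript
// Filter guard: don't emit a latest block number that would produce a
// toBlock behind what we've already processed (an unsynchronized RPC
// node can appear to go "backwards").
function shouldEmitBlockNumber(
  blockNumber: bigint,
  lastBlockNumberProcessed: bigint | null,
): boolean {
  return lastBlockNumberProcessed == null || blockNumber > lastBlockNumberProcessed;
}

// Safety check: a `while (fromBlock <= toBlock)` loop silently does
// nothing when toBlock < fromBlock, so throw instead of failing silently.
function assertValidRange(fromBlock: bigint, toBlock: bigint): void {
  if (toBlock < fromBlock) {
    throw new Error(`toBlock (${toBlock}) is behind fromBlock (${fromBlock})`);
  }
}
```

The filter prevents the bad range from being constructed at all; the assertion turns any remaining occurrence into a loud error rather than a silent no-op.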


changeset-bot bot commented Jun 13, 2024

⚠️ No Changeset found

Latest commit: 0aa7cc6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.


const latestBlock$ = createBlockStream({ publicClient, blockTag: followBlockTag }).pipe(shareReplay(1));
const latestBlockNumber$ = latestBlock$.pipe(
  map((block) => block.number),
  tap((blockNumber) => {
    debug("on block number", blockNumber, "for", followBlockTag, "block tag");
  }),
  filter((blockNumber) => {
    return lastBlockNumberProcessed == null || blockNumber > lastBlockNumberProcessed;
Contributor Author

unsure if we want to add a debug log here

@@ -90,7 +90,18 @@ export async function* fetchLogs<TAbiEvents extends readonly AbiEvent[]>({
     try {
       const toBlock = fromBlock + blockRange;
       debug("getting logs", { fromBlock, toBlock });
-      const logs = await publicClient.getLogs({ ...getLogsOpts, fromBlock, toBlock, strict: true });
+      const [latestBlockNumber, logs] = await Promise.all([
Contributor Author


This only works if batch is enabled in the HTTP transport. That's decided by clients though, so maybe this should go into the docs?

Member


I wonder if we should have an option to enable the "validate unsynchronized nodes" behavior, where we require batch to be enabled or, ideally, do the batch call here directly rather than relying on some client configuration.

Since this approach isn't useful unless the call is atomic (going to the same RPC node), and I suspect not all RPC providers support batching, it seems like we should put this behind a flag to avoid doubling RPC calls when syncing data from RPC. Could make this the default approach with a way to opt out of the new behavior.

+        publicClient.getLogs({ ...getLogsOpts, fromBlock, toBlock, strict: true }),
+      ]);
+      if (latestBlockNumber < toBlock) {
+        const blockTimeInSeconds = 2;
Contributor Author


Not sure what the best way to handle this is, i.e. should it be a deploy config? I don't think there's a way to detect the block time from the chosen chain?

Member

@holic holic Jun 20, 2024


we could use the client's pollingInterval here?

either way, we should probably align this with the other retry logic below
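A sketch of what waiting at the client's pollingInterval might look like, with the block-number fetcher injected so the wait logic stands alone. The names here are hypothetical illustrations, not the PR's code:

```typescript
// Poll until the RPC node reports a block number at or past toBlock,
// sleeping `pollingInterval` milliseconds between attempts. In practice
// `getBlockNumber` would be publicClient.getBlockNumber and
// `pollingInterval` would come from the viem client.
async function waitForBlock(
  getBlockNumber: () => Promise<bigint>,
  toBlock: bigint,
  pollingInterval: number,
): Promise<void> {
  while ((await getBlockNumber()) < toBlock) {
    await new Promise((resolve) => setTimeout(resolve, pollingInterval));
  }
}
```

This replaces the hardcoded `blockTimeInSeconds = 2` with a value the client already configures, and keeps the retry cadence consistent with the client's own polling.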
