Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add StateSync component #650

Open
wants to merge 30 commits into
base: new-state-sync
Choose a base branch
from

Conversation

tomyrd
Copy link
Collaborator

@tomyrd tomyrd commented Jan 2, 2025

This PR adds a new StateSync component that encompasses all the logic for the client's synchronization with the node. The idea is that this component can be instantiated separate from the client and can be ran without modifying the state of the client. The component exposes a sync_state method that will do the necessary rpc calls and return a StateSyncUpdate with all the changes that should be applied to the store.

state_sync method

The new state_sync method no longer exposes the "steps" of the rpc response. It returns all the necessary updates to be up to date with the node in a single call.

Additionally, the component doesn't have direct access to the store. This is because the sync process now is indepentent on the state of the store and users can pass the information they want the component tu use and simulate syncs without the store changing. For this reason, the client will gather all the needed information up front and give it to the component, including the headers of every tracked account, all the unspent notes and the current partial MMR. The only way the store can be accessed during the sync process is using the callbacks explained in the next section.

Callbacks

The component receives two callbacks that will be called in specific events during the sync:

  • OnNoteReceived: This callback will be called on each new committed note that is received in the sync state endpoint response. It receives a CommittedNote that contains the Id of the note being committed and an optional InputNoteRecord that corresponds to the state of the note in the node (only if the note is public), this allows the callback to check whether the new public note is relevant to the client or not. The callback should return a boolean indicating if this new CommittedNote event should be applied to the store or not. The default implementation is on_note_received.
  • OnNullifierReceived: This callback will be called on each new nullifier that is received in the sync state endpoint response. It receives a NullifierUpdate that contains the new nullifier along with its block number. The callback should return a boolean indicating if this new NullifierUpdate event should be applied to the store or not. There's no default implementation, the client will simply return true for every nullifier update.

Other changes

  • The rpc_api in the client was changed from Box to Arc so it can be shared with the component. This also means that I had to change the base struct to give it interior mutability.
  • I moved most of the code from crates/rust-client/src/sync/mod.rs to the new crates/rust-client/src/sync/state_sync.rs.

TODOs

  • Rebase the branch to bring the changes from the RPC functions (some conflicts will probably emerge but they shouldn't be too hard to solve)
  • Since I'm already refactoring the sync process, I want to move the apply_mmr_changes part inside the component. I feel like it could be better documented and it should fit nicely inside the note_state_sync logic (it's just a different note state change).
  • Remove all the temporary TODOs in the code.
  • Once I'm finished with the changes above, I need to modify the web store to be compatible with the new functionality.
  • Fix onchain tag integration test.

@tomyrd tomyrd force-pushed the tomyrd-sync-component-alt branch from 8731b2b to d54e186 Compare January 2, 2025 23:05
@tomyrd
Copy link
Collaborator Author

tomyrd commented Jan 8, 2025

I need to tidy up the code and update the documentation, but the basic idea with this updated version is that we have callback functions (defined in state_sync.rs) that receive the unprocessed sync response and return the updates needed to be made in the store. These callbacks can be defined as closures that have access to the rpc_api and store and could be redefined by the user to affect how the changes are applied to the store at the end.

These callbacks could also work as a sort of event reaction, as the user could redefine them and they would be called on each sync with all the new and updated node information.

@tomyrd tomyrd force-pushed the tomyrd-sync-component-alt branch from 6baeafa to 50938fb Compare January 9, 2025 02:42
@tomyrd tomyrd force-pushed the tomyrd-sync-component-alt branch from 50938fb to 2f26007 Compare January 9, 2025 14:50
Comment on lines +165 to +173
for (account_id, digest) in account_updates.mismatched_private_accounts() {
// Mismatched digests may be due to stale network data. If the mismatched digest is
// tracked in the db and corresponds to the mismatched account, it means we
// got a past update and shouldn't lock the account.
if let Some(account) = self.get_account_header_by_hash(*digest).await? {
if account.id() == *account_id {
continue;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this not be done before calling the store? Then it would be deduplicated from the codebase, or am I missing something?

Copy link
Collaborator Author

@tomyrd tomyrd Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this not be done before calling the store?

Not with the current structure, because the SyncStateUpdate is built inside the StateSync component which doesn't have access to the store. But the code duplication isn't ideal, I'll add it to the follow ups. If we want to access the store inside the process, we should add a callback (maybe we can re-add this one).

.updated_input_notes()
.iter()
.chain(committed_notes.new_input_notes().iter())
.chain(note_updates.new_input_notes().iter())
.filter(|note| note.is_committed())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Collaborator Author

@tomyrd tomyrd Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this PR, the check_block_relevance only received committed note updates, now it receives every note update, and some of those input note records can't be converted to Note (if the note was consumed externally it doesn't store its metadata).

@tomyrd tomyrd marked this pull request as ready for review January 10, 2025 21:12
@igamigo igamigo linked an issue Jan 10, 2025 that may be closed by this pull request
@igamigo igamigo changed the title feat: add StateSync component (alternative) feat: add StateSync component Jan 13, 2025
@igamigo
Copy link
Collaborator

igamigo commented Jan 14, 2025

I've fixed some of the comments that I had on my pending review and refactored the code a little bit. I've also added a comment on the follow-up issue (#663). One thing to note is that the StateSync component is not currently a part of Client, but rather it a default implemnetation is instantiated whenever we want to do a sync. It can also be used separately of course, but there is no way to provide a custom StateSync to a Client to compose it with the other client features. Not sure if this was the intention of this first iteration or not, but mentioning it just in case. cc @bobbinth

Comment on lines 197 to 205
pub async fn sync_state_step(
&self,
current_block: BlockHeader,
current_block_has_relevant_notes: bool,
current_partial_mmr: PartialMmr,
accounts: Vec<AccountHeader>,
note_tags: Vec<NoteTag>,
unspent_nullifiers: Vec<Nullifier>,
) -> Result<Option<SyncStatus>, ClientError> {
Copy link
Contributor

@bobbinth bobbinth Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: maybe I missed this, but I thought we were going to do a full sync (not just one step) and then return the result of the entire sync from this function. Then, the result would be inserted into the store.

What is the motivation for exposing step-wise syncing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not entirely sure what the original motivation here was. One thing that comes to mind is that we get the new state from the store each time we do a new step. This involves the block number, account headers and note tags. I think it shouldn't be too difficult to make it so that we can expose the full sync instead though there is probably some logic that we need to add in order to make it keep a consistent state within the component. If everything else looks OK I could make another PR with these changes (unless you prefer to have them in this PR instead)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the end goal is to just do the entire process in one go so that the user is not exposed to the step logic (which could potentially change in the future). We could even make this a bit more customizable by allowing the user to sync to a specific block (rather than the end of the chain) but that would require changes on the node side as well.

In terms of this PR vs. another PR - we could probably create a new branch - e.g., new_state_sync, merge this PR into that branch, and then make another PR against that branch to abstracts away the step-wise updates).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also think this should be added as a separate PR, as the component will need to change to be able to track the intermediate changes between steps. Will add this to the follow up issue and will create the new_state_sync branch that will envelop all changes (this and the follow ups).

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Not a review yet, but I left a couple of comments inline. We should also rebase this from the latest next.

In terms of merging, this will go into the v0.8.0 release - so, no need to merge before we release v0.7.0.

Comment on lines 148 to 152
on_note_received: OnNoteReceived,
/// Callback to be executed when a transaction is committed.
on_committed_transaction: OnTransactionCommitted,
/// Callback to be executed when a nullifier is received.
on_nullifier_received: OnNullifierReceived,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: naming here is a bit inconsistent. To make it consistent we could change on_committed_transaction to on_transaction_committed.

Comment on lines 483 to 485
/// Applies changes to the current MMR structure, returns the updated [MmrPeaks] and the
/// authentication nodes for leaves we track.
pub(crate) async fn apply_mmr_changes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this is a helper function (rather than a callback handler) - so, we should probably move it to a different section.

Comment on lines 197 to 205
pub async fn sync_state_step(
&self,
current_block: BlockHeader,
current_block_has_relevant_notes: bool,
current_partial_mmr: PartialMmr,
accounts: Vec<AccountHeader>,
note_tags: Vec<NoteTag>,
unspent_nullifiers: Vec<Nullifier>,
) -> Result<Option<SyncStatus>, ClientError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the end goal is to just do the entire process in one go so that the user is not exposed to the step logic (which could potentially change in the future). We could even make this a bit more customizable by allowing the user to sync to a specific block (rather than the end of the chain) but that would require changes on the node side as well.

In terms of this PR vs. another PR - we could probably create a new branch - e.g., new_state_sync, merge this PR into that branch, and then make another PR against that branch to abstracts away the step-wise updates).

@tomyrd tomyrd changed the base branch from next to new-state-sync January 27, 2025 16:18
tomyrd and others added 2 commits February 7, 2025 18:39
* feat: remove steps from state sync interface

* feat: add block relevance in `StateSyncUpdate`

* fix: web client build

* chore: update docs

* add failing test related to TODOs

* fix: add nullifiers for new untracked notes in each step

* use `Arc<RwLoc<NoteUpdates>>` in callbacks

* feat: reduce the number of callbacks

* remove `NoteUpdates` as callback parameter

* refactor callbacks

* update docs

* change callbacks and move state transitions

* fix: web store

* refator: use partial mmr with current block in sync
@tomyrd tomyrd force-pushed the tomyrd-sync-component-alt branch from 6bd86f9 to c1281f0 Compare February 10, 2025 14:14
@tomyrd
Copy link
Collaborator Author

tomyrd commented Feb 10, 2025

This PR is stable now and ready for review, the changes in tomyrd-no-step-sync were merged here so that everything can be reviewed at once and the context isn't lost. I also updated the description to better explain the new changes.

Edit: after re-reading the comments here I found this:

One thing to note is that the StateSync component is not currently a part of Client, but rather it a default implemnetation is instantiated whenever we want to do a sync. It can also be used separately of course, but there is no way to provide a custom StateSync to a Client to compose it with the other client features.

Should this be added in this PR? I think the approach here is to add the component as a parameter of the Client's constructor, so the user can instantiate one with their preferred callbacks.

@tomyrd tomyrd requested review from igamigo and bobbinth February 10, 2025 18:32
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Not a full review but I left some comments inline. Overall, I think the approach works - but we need to clean this up quite a bit and make sure the comments are accurate.

Also, I would suggest moving out some "auxiliary changes" (e.g., implementing interior mutability) into another PR so that this PR could focus on the state sync related logic.

Comment on lines 79 to 80
/// Contains stats about the sync operation.
pub struct SyncSummary {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would move this struct into its own section at the end of this file.

Comment on lines 30 to 32
#[derive(Default)]
/// Contains all information needed to apply the update in the store after syncing with the node.
pub struct StateSyncUpdate {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this struct into the parent module (into its separate section).

Also nit: attributes should go after comments.

Comment on lines 76 to 82
/// Callback to be executed when a new note inclusion is received in the sync response. It receives
/// the committed note received from the node, the block header in which the note was included and
/// the list of public notes that were included in the block.
///
/// It returns two optional notes (one input and one output) that should be updated in the store and
/// a flag indicating if the block is relevant to the client.
pub type OnNoteReceived = Box<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comments here are outdated.

Comment on lines 89 to 95
/// Callback to be executed when a nullifier is received in the sync response. It receives the
/// nullifier update received from the node and the list of transaction updates that were committed
/// in the block.
///
/// It returns two optional notes (one input and one output) that should be updated in the store and
/// an optional transaction ID if a transaction should be discarded.
pub type OnNullifierReceived =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comments are outdated here. Also, why do we need this callback?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it, the Client just assumes all nullifiers are relevant (i.e. the default implementation always returns true). The purpose of this callback is so that the sync component is more extensible and so that users can react to nullifiers arriving in the sync process.

We could remove it if we want a simplified version of the component on this first iteration.

Comment on lines 208 to 220
/// Syncs the state of the client with the chain tip of the node, returning the updates that
/// should be applied to the store.
///
/// # Arguments
/// * `current_block` - The latest tracked block header.
/// * `current_block_has_relevant_notes` - A flag indicating if the current block has notes that
/// are relevant to the client. This is used to determine whether new MMR authentication nodes
/// are stored for this block.
/// * `current_partial_mmr` - The current partial MMR.
/// * `accounts` - The headers of tracked accounts.
/// * `note_tags` - The note tags to be used in the sync state request.
/// * `unspent_nullifiers` - The nullifiers of tracked notes that haven't been consumed.
pub async fn sync_state(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments here should be updated.

Comment on lines 37 to 39
#[derive(Debug, Clone)]
/// Represents a transaction that was included in the node at a certain block.
pub struct TransactionUpdate {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: attributes should go after comments.

Comment on lines +48 to +54
pub async fn new(endpoint: Endpoint, timeout_ms: u64) -> Result<TonicRpcClient, RpcError> {
let endpoint = tonic::transport::Endpoint::try_from(endpoint.to_string())
.map_err(|err| RpcError::ConnectionError(err.to_string()))?
.timeout(Duration::from_millis(timeout_ms));
let rpc_api = ApiClient::connect(endpoint)
.await
.map_err(|err| RpcError::ConnectionError(err.to_string()))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now establishes a connection on construction, right? Is there a way to avoid it? Basically, we may want to interact with the client in ways that don't require making request to the node, and this would force us to establish the connection for such requests as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #726

note_tags: &[NoteTag],
unspent_nullifiers: &[Nullifier],
) -> Result<bool, ClientError> {
let current_block_num = (current_partial_mmr.num_leaves() as u32 - 1).into();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not for this PR, but I would consider wrapping PartialMmr into a dedicated struct to make things like this easier (i.e., current_partial_mmr.chain_tip()). We may even be able to re-use ChainMmr struct from miden-base.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add this to the followup issue

Comment on lines 158 to 163
// To receive information about added nullifiers, we reduce them to the higher 16 bits
// Note that besides filtering by nullifier prefixes, the node also filters by block number
// (it only returns nullifiers from current_block_num until
// response.block_header.block_num())
let nullifiers_tags: Vec<u16> =
unspent_nullifiers.iter().map(get_nullifier_prefix).collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this translation here? An alternative option is for the parent function to keep track of the relevant nullifier tags and to just pass the tags into this function.

Comment on lines 460 to 464
/// Default implementation of the [OnNoteReceived] callback. It queries the store for the committed
/// note and updates it accordingly. If the note wasn't being tracked but it came in the sync
/// response, it is also returned so it can be inserted in the store. The method also returns a
/// flag indicating if the block is relevant to the client.
async fn committed_state_transions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment here is for a different function.

@tomyrd
Copy link
Collaborator Author

tomyrd commented Feb 11, 2025

Just created two auxiliary PRs (#726 and #727) that will merge into the new-state-sync branch. Once this happens I'll merge those changes (along with the updates from next) into this branch. This will result in a better and smaller PR diff that will be easier to review.

You can review the new PRs first and wait to review this one until those are merged so that is easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refactor: separate client in more stand-alone components
3 participants