feat: stream SyncState response #685

Open · wants to merge 8 commits into next from tomasarrachea-stream-sync-state
Conversation

@TomasArrachea (Collaborator) commented Feb 11, 2025

Closes #174.

This PR changes the SyncState endpoint to return a stream of responses instead of a single sync update. The node pushes all the update responses into the stream; once the last update is sent, the stream is closed.

The same SyncStateResponse is used, except that the chain_tip field was removed, since the client no longer needs to check it.

In miden-client #734 the client is updated to consume the stream and test the new behaviour.
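For context, the server-streaming shape being introduced here typically looks like the sketch below in tonic. This is a minimal, hypothetical illustration rather than the PR's actual code: the response type, channel size, and handler signature are stand-ins. A background task pushes one response per update into a bounded channel, and the receiver half is returned to the client as the stream.

```rust
use std::pin::Pin;

use tokio::sync::mpsc;
use tokio_stream::{wrappers::ReceiverStream, Stream};
use tonic::{Response, Status};

// Hypothetical stand-in for the generated protobuf response type (the removed
// `chain_tip` field is simply absent here).
pub struct SyncStateResponse;

type SyncStateStream = Pin<Box<dyn Stream<Item = Result<SyncStateResponse, Status>> + Send>>;

// General shape of a tonic server-streaming handler: a background task pushes
// one response per update into a bounded channel, the receiver half is
// returned to the client as the stream, and dropping the sender after the last
// update closes the stream.
async fn sync_state() -> Result<Response<SyncStateStream>, Status> {
    let (sender, receiver) = mpsc::channel(128);

    tokio::spawn(async move {
        // A real implementation loops over the store's per-block updates until
        // it reaches the chain tip; a single send stands in for that loop.
        let _ = sender.send(Ok(SyncStateResponse)).await;
    });

    let stream: SyncStateStream = Box::pin(ReceiverStream::new(receiver));
    Ok(Response::new(stream))
}
```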

@TomasArrachea TomasArrachea changed the title WIP: stream SyncState response feat: stream SyncState response Feb 14, 2025
notes,
nullifiers,
}))
let (tx, rx) = mpsc::channel(128); // TODO: check bound of the channel
@TomasArrachea (Collaborator, Author) commented Feb 17, 2025

This is still TBD. Any ideas on what the channel size for the stream should be?

Contributor commented:

Good question. Each element is essentially at most one block's worth of deltas, IIUC.

I would make it much smaller, like 5? @bobbinth

This also interacts with the timeout on the sender from my other comment.

@TomasArrachea TomasArrachea force-pushed the tomasarrachea-stream-sync-state branch from 346c5f3 to 14a8f1a Compare February 17, 2025 14:22
@TomasArrachea TomasArrachea marked this pull request as ready for review February 17, 2025 14:26
@tomyrd (Collaborator) commented Feb 18, 2025

I have a question regarding a small edge case that I had to deal with in the sync state component refactor in the client.

How do we deal with nullifiers for notes that we don't know exist, which were created and nullified between the request block number and the chain tip?

The client can't add the nullifier to the request because it doesn't know of the note's existence, so it will miss the nullifier update. Moreover, now that the client is synced to a block after the nullifier, it won't get the update in future requests (which would be the requests containing the new nullifier).

Here's a small diagram of the case:
[diagram omitted]

Before, with separate requests, the client could update subsequent requests with the new information it got from the node.

@Mirko-von-Leipzig (Contributor) commented

> I have a question regarding a small edge case that I had to deal with in the sync state component refactor in the client.
>
> How do we deal with nullifiers for notes that we don't know exist, which were created and nullified between the request block number and the chain tip?
>
> The client can't add the nullifier to the request because it doesn't know of the note's existence, so it will miss the nullifier update. Moreover, now that the client is synced to a block after the nullifier, it won't get the update in future requests (which would be the requests containing the new nullifier).

To me this sounds like a case for more separate queries. As in, maybe SyncState shouldn't include nullifiers, but rather they should be requested separately?

@igamigo (Collaborator) commented Feb 18, 2025

> To me this sounds like a case for more separate queries. As in, maybe SyncState shouldn't include nullifiers, but rather they should be requested separately?

I think this is probably the way to go: use CheckNullifiersByPrefix after a sync. It would mean most syncs involve two separate requests, which is probably fine since a sync currently requires potentially multiple requests until you reach the chain tip anyway.

For public notes the node could also track notes and add nullifiers, but for private notes it would require a separate request anyway.

@Mirko-von-Leipzig (Contributor) commented

@igamigo @TomasArrachea can I review this PR as is? Or will there be changes towards separating nullifiers etc. out? The latter probably needs at least some broader discussion, e.g. maybe we should have more focused streams?

@bobbinth (Contributor) commented

> I think this is probably the way to go: use CheckNullifiersByPrefix after a sync. It would mean most syncs involve two separate requests, which is probably fine since a sync currently requires potentially multiple requests until you reach the chain tip anyway.
>
> For public notes the node could also track notes and add nullifiers, but for private notes it would require a separate request anyway.

For private notes, state sync will not give us any new info about nullifiers. That is, the only info we get from state sync about private notes is whether a note appeared on chain or not. So, if we didn't know about the nullifier of a private note before we started the state sync process, we won't learn about it during the sync.

For public notes, the situation is different, as we may get a public note we are interested in during the sync and so would like to figure out whether it has already been consumed or not.

So, the process could be:

  • Prepare a set of nullifier prefixes for the state sync request as we do now.
  • Run the state sync process. This should give us nullifiers for all private notes we know about, as well as for public notes we knew about at the beginning of the sync.
  • For any new public note received during the sync process, if the client is interested in that note, use the CheckNullifiersByPrefix endpoint after the sync.

An alternative, as suggested above, could be to remove nullifiers from the state sync requests entirely and just use CheckNullifiersByPrefix. The main downside is that this could return a lot of data, but given that nullifiers are pretty small (i.e., 32 bytes), getting even 10K nullifiers would be less than 0.5MB of data.
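A minimal sketch of the three-step process described above. Every type and method name here (Rpc, SyncUpdate, sync_state, check_nullifiers_by_prefix) is a hypothetical placeholder rather than the actual miden-client or node API; the point is only to show where the extra CheckNullifiersByPrefix call fits.

```rust
// Hypothetical placeholder types, not the real miden-client or node API.
struct PublicNote {
    nullifier_prefix: u16,
}

struct SyncUpdate {
    new_public_notes: Vec<PublicNote>,
}

struct Rpc;

impl Rpc {
    // Steps 1-2: run the (now streamed) state sync with the prefixes of the
    // nullifiers we already track; each update corresponds to one block.
    async fn sync_state(&self, _known_prefixes: &[u16]) -> Vec<SyncUpdate> {
        Vec::new()
    }

    // Step 3: ask for nullifiers matching the given prefixes.
    async fn check_nullifiers_by_prefix(&self, _prefixes: &[u16]) -> Vec<[u8; 32]> {
        Vec::new()
    }
}

async fn sync(rpc: &Rpc, known_prefixes: &[u16]) {
    let updates = rpc.sync_state(known_prefixes).await;

    // Public notes first seen during the sync were not in the original prefix
    // set, so their nullifiers have to be requested separately afterwards.
    let new_prefixes: Vec<u16> = updates
        .iter()
        .flat_map(|u| u.new_public_notes.iter().map(|n| n.nullifier_prefix))
        .collect();

    if !new_prefixes.is_empty() {
        let _nullifiers = rpc.check_nullifiers_by_prefix(&new_prefixes).await;
    }
}
```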

@bobbinth (Contributor) commented

> An alternative, as suggested above, could be to remove nullifiers from the state sync requests entirely and just use CheckNullifiersByPrefix. The main downside is that this could return a lot of data, but given that nullifiers are pretty small (i.e., 32 bytes), getting even 10K nullifiers would be less than 0.5MB of data.

One thing we can do to reduce the amount of data returned from the endpoint is to include a block_num parameter in the CheckNullifiersByPrefix request. This way, only the nullifiers created after the specified block number would be returned.
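A small sketch of the proposed block_num filter semantics, using a hypothetical in-memory record type (the real endpoint queries the store): nullifiers are kept when they match a requested prefix and were created at or after the given block (the inclusive boundary is settled later in this thread).

```rust
// Hypothetical in-memory representation of recorded nullifiers; the real store
// is queried by the node, so this only illustrates the proposed filter.
struct NullifierRecord {
    nullifier: [u8; 32],
    prefix: u16,
    block_num: u32,
}

// Return nullifiers matching any requested prefix and created at or after `block_num`.
fn check_nullifiers_by_prefix(
    records: &[NullifierRecord],
    prefixes: &[u16],
    block_num: u32,
) -> Vec<[u8; 32]> {
    records
        .iter()
        .filter(|r| r.block_num >= block_num && prefixes.contains(&r.prefix))
        .map(|r| r.nullifier)
        .collect()
}
```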

@bobbinth (Contributor) commented

So, I'd probably break this down as follows:

  • In a separate PR, update CheckNullifiersByPrefix as described above.
  • Then, in a separate PR, remove nullifiers from the SyncState request/response, and update the client accordingly.
  • Then, finish this PR.

* feat: add block_num parameter to check_nullifiers_by_prefix

* chore: update CHANGELOG

* review: add `block_num` parameter as required

* review: update CHANGELOG

* review: update doc comment

* chore: update rpc README
@bobbinth (Contributor) commented
Could we cherry-pick eb2de00 and changes from #708 once they are merged into next? This way, they could be merged and propagated to the client before this PR is merged (we can then rebase this PR from next).

@@ -37,7 +37,7 @@ Returns a nullifier proof for each of the requested nullifiers.

### CheckNullifiersByPrefix

- Returns a list of nullifiers that match the specified prefixes and are recorded in the node.
+ Returns a list of nullifiers recorded in the node that match the specified prefixes and block creation height.
Collaborator commented:

It doesn't need to match the block creation height, right? If so, we should change this to reflect that the response will contain nullifiers created at any block greater than or equal to the one passed.

Contributor commented:

Yes, agreed - let's make it more precise - i.e., the response will contain nullifiers created in or after the specified block.

@TomasArrachea (Collaborator, Author) commented Feb 20, 2025

> Could we cherry-pick eb2de00 and changes from #708 once they are merged into next? This way, they could be merged and propagated to the client before this PR is merged (we can then rebase this PR from next).

Actually #707 was merged into this branch, not into next. So I should create a new PR to merge the changes from #707 into next, right?

* feat: add block_num parameter to check_nullifiers_by_prefix

* chore: update CHANGELOG

* review: add `block_num` parameter as required

* review: update CHANGELOG

* review: update doc comment

* feat: remove nullifiers from `SyncState` endpoint

* chore: update doc comments

* chore: update CHANGELOG

* chore: update rpc README
@TomasArrachea TomasArrachea marked this pull request as ready for review February 24, 2025 22:23
transactions,
notes,
}))
let (tx, rx) = mpsc::channel(128); // TODO: check bound of the channel
Contributor commented:

I know these are the common names for channel halves, but consider different naming, because we have many transactions within miden.

Maybe sender, receiver or tx_stream.

}

let result = async {
let account_ids: Vec<AccountId> = read_account_ids(&request.account_ids)?;
Contributor commented:

This should move out of the loop and spawn, I think - this only has to happen once, correct? And if it errors then it's a bad sync request and we should reject it immediately.
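A compact sketch of the suggested ordering, with placeholder types and a simplified signature (not the node's actual code): the request is parsed and validated once before the task is spawned, so a malformed request is rejected immediately instead of surfacing as an error on the stream.

```rust
use tokio::sync::mpsc;

// Placeholder types; the real handler works with the generated request type
// and the node's own `AccountId`.
#[derive(Clone, Debug)]
struct AccountId(u64);

fn read_account_ids(raw: &[u64]) -> Result<Vec<AccountId>, &'static str> {
    // Stand-in for the real parsing/validation.
    Ok(raw.iter().copied().map(AccountId).collect())
}

async fn sync_state(raw_account_ids: Vec<u64>) -> Result<mpsc::Receiver<Vec<AccountId>>, &'static str> {
    // Parse and validate once, before spawning; a bad request is rejected here.
    let account_ids = read_account_ids(&raw_account_ids)?;

    let (sender, receiver) = mpsc::channel(8);
    tokio::spawn(async move {
        // The background task reuses the already-validated ids; a single send
        // stands in for the per-block update loop.
        let _ = sender.send(account_ids).await;
    });

    Ok(receiver)
}
```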

Comment on lines 171 to 177
let chain_tip = state.latest_block_num().await.as_u32();
tokio::spawn(async move {
loop {
if last_block_num == chain_tip {
// The state is up to date, no need to sync
break;
}
Contributor commented:

I'm wondering what we want this logic to look like. Right now this locks the chain_tip in place at the start of the stream.

But of course the chain will progress while we are syncing, and ideally we would sync until we hit the actual tip of the chain.

A more robust way would be to query until the store returns an empty result.
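A minimal sketch of that more robust loop, with placeholder types and the store query modeled as a closure: rather than pinning the chain tip once at the start, keep asking the store for the next delta and stop when it reports nothing newer.

```rust
use tokio::sync::mpsc::Sender;

// `next_update` is a placeholder for the store query: given the last synced
// block, it returns the next block number with updates, or None when the
// store has nothing newer (i.e. we have reached the current tip).
async fn stream_until_caught_up(
    mut last_block_num: u32,
    next_update: impl Fn(u32) -> Option<u32>,
    sender: Sender<u32>,
) {
    loop {
        // Re-query on every iteration instead of fixing the chain tip up front,
        // so blocks produced while we are streaming are still picked up.
        let Some(next_block_num) = next_update(last_block_num) else {
            break; // store returned an empty result: we are caught up
        };

        if sender.send(next_block_num).await.is_err() {
            break; // client dropped the stream
        }
        last_block_num = next_block_num;
    }
}
```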

let (state, delta) = state
.sync_state(last_block_num.into(), account_ids, request.note_tags.clone())
.await
.map_err(internal_error)?;
Contributor commented:

I would do this at the send location - that way the reader knows all errors are internal errors at the send location.

}
let is_error = result.is_err();

tx.send(result).await.expect("error sending sync state response to the client");
Contributor commented:

This can happen if the client closes its connection. In other words, this can be caused by an external force and shouldn't cause a panic for us. Rather, with if let Err(err), log the closure (probably at info) and break the loop.

Aka this isn't actually an error, but rather just indicates the connection was lost, which is entirely valid. I think.

Contributor commented:

This also blocks, correct? We should wrap it in a timeout to prevent a client from slow-rolling us.

@TomasArrachea (Collaborator, Author) commented

It only blocks when there is no more space on the channel, so it's related to the bound size discussion. Anyway, I added a 10 second timeout in c3b242b, along with some error handling.

Contributor commented:

It's also a DoS vector, because without a timeout we just block indefinitely. So a bad client could just open a million of these and do nothing, blocking us.
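A hedged sketch of the send path discussed in this thread (not the exact PR code, and assuming the usual tracing macros for logging): the bounded-channel send is wrapped in a 10-second timeout so a slow or stalled client cannot hold the task forever, and a closed channel is treated as a normal disconnect rather than a panic.

```rust
use std::time::Duration;

use tokio::sync::mpsc::Sender;
use tokio::time::timeout;

// Returns true if the update was delivered and streaming should continue.
async fn send_update<T>(sender: &Sender<T>, update: T) -> bool {
    match timeout(Duration::from_secs(10), sender.send(update)).await {
        // Sent within the deadline.
        Ok(Ok(())) => true,
        // Channel closed: the client dropped the stream; stop without panicking.
        Ok(Err(_)) => {
            tracing::info!("client closed the sync state stream");
            false
        }
        // The client is not draining the stream fast enough; give up on it.
        Err(_elapsed) => {
            tracing::info!("timed out sending sync state update");
            false
        }
    }
}
```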

