This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

75/WAKU2-SYNC #660

Closed
wants to merge 11 commits into from

Conversation

ABresting
Contributor

This is the initial version of the Sync RFC, supported by a technical PoC. The Prolly trees used are complete and flexible for future changes. This RFC was developed from the following research brainstorming:
waku-org/research#78
waku-org/research#62

Comment on lines 2 to 3
slug: "74"
title: 74/WAKU2-MESSAGE-RELIABILITY
Contributor

@ABresting This rfc should be 75.

@ABresting ABresting requested a review from jimstir January 20, 2024 05:07
@ABresting ABresting changed the title Sync RFC initial version 75/WAKU2-MESSAGE-RELIABILITY Jan 20, 2024
@@ -0,0 +1,173 @@
---
slug: "75"
title: 75/WAKU2-MESSAGE-RELIABILITY


Is this spec for the sync protocol, or is there another one?
Message reliability is a use case of the sync protocol; if there is no other spec, I think this spec should be named something like sync.

Contributor Author

It can be, but in the research direction doc it is referred to under the Message-Reliability umbrella.

Contributor

Not critical for raw spec, but although this is indeed part of the overall track towards message reliability, this spec only introduces a new sync protocol and should be titled accordingly - something like WAKU2-SYNC. A separate spec will then take concepts introduced here and in other protocols to describe WAKU2-MESSAGE-RELIABILITY in a consolidated spec.


Sync protocol enables a node to synchronize itself with messages that it may have missed due to any reason, from one of its peers. This is an efficient method for the nodes present in the network to stay synchronized while sharing minimal information with each other.

This document describes the Sync store protocol and how it is used to synchronize the nodes in the network. It also explains the various components of the protocol and how they interact with each other. The detailed explanation of the protocol is given in the [research doc](https://github.com/waku-org/research/issues/78). In a Sync request-response protocol, a client receives the messages through Sync mechanism using their peer nodes only. Store is planned to work in conjunction with Sync to provide a reliable message delivery mechanism.


  • Linking the issue seems unfriendly, as it contains long discussions that everyone who wants to understand the spec would need to read; should we make the spec self-contained for such information?
  • Sync store protocol is confusing. In my understanding the sync protocol can be implemented by any node in the Waku network, and a store node is one that implements the sync protocol by default. It's probably better to structure this spec by making this point explicit.

Contributor Author

Thanks, very useful feedback here:

  1. In the research issue mentioned above, the main content of the issue was modified based on the comments/discussions, so over time the conclusions of the discussions were integrated into the main issue content. Should we open another, sanitized issue if this one seems a bit polluted?
  2. During the discussions it was noted that, for v0 at least, the Store node should be the only one implementing it, BUT I get the point, and IMO it is convenient with the current Prolly tree design that Sync can also be part of non-Store nodes (it comes with the added work of taking messages from relay and then building a tree).

It's probably better to structure this spec by making this point explicitly.

As leadership decides, I can amend the RFC -> any node can run Sync protocol.

@kaichaosun kaichaosun Jan 23, 2024

Should we open another sanitized issue if this one seems polluted a bit?

Why not put the conclusions from the issue in this spec, and update the spec when a new feature/fix needs to be included?

As leadership decides, I can amend the RFC -> any node can run Sync protocol.

I think the community knows the answer better than the leadership :)
Definitely need more input about this from other devs/researchers.

Contributor Author

Why not put the conclusions from the issue in this spec? and update this spec when new feature/fix needs to be included.

that sounds cool, I can maybe just TL;DR the issue into this spec.

Contributor

  1. Indeed. Specs should preferably not refer to resources outside of the body of specifications that it's building on. However, since we're quite informal for the raw spec, I think it's OK to be more focused on how the protocol itself works and less on the contextualisation - in other words, I'd suggest simply removing the entire line "The detailed explanation...". The necessary context can grow in this spec in successive iterations.
  2. I would just call this something like "Waku Sync Protocol" and not add any conclusions as to what type of nodes can or cannot run the protocol - the only requirement is that the node must be keeping track of a list of message hashes in a prolly tree (which will have to be specified too, but that spec can be TBD for now).

a client receives the messages through Sync

https://rfc.vac.dev/spec/14/#deterministic-message-hashing message hashes?

Store is planned to work in conjunction with Sync to provide a reliable message delivery mechanism.

It's OK to add outside context in a raw spec. Note however that this spec should ultimately not know about "reliable message delivery" or "store" protocol, but simply be a minimalist protocol description that only "knows" about message hashes in a prolly tree. I think this would make the spec simpler to understand too, simplifying its goal. I would suggest just removing this line.

}

// Request message for GetRoot RPC. No fields needed for this example.
message GetRootRequest {


How about ExchangeRootRequest, where the requester sends its local root to the remote peer? This will give flexibility for scenarios that don't depend on any store nodes.

Contributor Author

this will give flexibility for scenarios which don't depend on any store nodes.

I am not sure I am following this part - could you please elaborate on this point?

Just to provide more context: a client sends a request to a server (here client and server are peer nodes) to get its root, so the requesting node (client) holds the responsibility of computation based on the root received. For the server it should be a passive process. With ExchangeRootRequest it looks like the client is also sending its root, which should not be the case.


I'm thinking about this scenario: in a p2p network, when peers come online and connect with each other, each peer shares the root of its Prolly tree, so that the other knows who is ahead, and they can decide which one should be the client and which one should be the server.

If p2p connections could happen at the contentTopic level (I believe currently p2p connections are made when peers share the same pubsub topic), it would make community-based sync possible and remove the dependency on store nodes.

Contributor Author

so that the other knows who is ahead, and decides which one should be client and which one should be server.

so basically, when an operator goes offline, they somehow know they may have missed messages while they were offline; in this case the operator sends a query request based on the last available timestamp and gets messages the old, traditional way. Sync is highly useful in the case where the connection is so ad hoc that an operator might miss some messages, if not all. This type of Sync request/root-check should be done cronjob-style at certain intervals.

it will make community based sync possible and remove the dependency for store node.

This seems like a multidimensional question. You are looking at it from the Status/chat perspective; how and where to create the Prolly tree should be dealt with in app-level policies in the Waku client.

Contributor

I think it's possible to save bandwidth by having the first message contain the requester's own root node. This means that initially the protocol doesn't quite behave as client-server in the first message. Each node can use this to periodically request the roots of peers to determine if it has any gaps. The responder can use this first message to also determine if it has any gaps. Both sides then have the option to initiate the sync process, but this time as client of the interaction until it has retrieved the hashes of missed messages.
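
A minimal sketch of the handshake described above (names like Root and start_sync are illustrative, not part of the spec):

from dataclasses import dataclass

@dataclass
class Root:
    level: int          # height of this root within the Prolly tree
    merkle_hash: bytes  # merkle hash covering the whole tree

def handle_root_exchange(local_root: Root, remote_root: Root, start_sync) -> Root:
    # If the roots differ, this side may have gaps too: kick off its own
    # sync interaction, acting as client until missed hashes are retrieved.
    if remote_root.merkle_hash != local_root.merkle_hash:
        start_sync(remote_root)
    # Reply with the local root so the requester can run the same check.
    return local_root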


Step 3: A client node sends a request to the server node to get the root node of the Prolly tree. The server node responds with the root node of the Prolly tree.

Step 4: If the height of the root of the Prolly tree is different then the client node sends a request to the server node to get the root node of the Prolly tree at the height of the root of the Prolly tree. The server node responds with the root node of the Prolly tree at the height of the root of the Prolly tree.


get the root node of the Prolly tree at the height of the root of the Prolly tree.

Can you elaborate a bit more to help me understand?

Contributor Author

@ABresting ABresting Jan 21, 2024

basically, in the current Prolly tree implementation, if a client has 5 elements (key-value pair nodes), then those 5 nodes must be present up to a certain height (via the process of node promotion to a height in the Prolly tree), and in any other Prolly tree in the network, with any number of key-value pair nodes, the comparison must start from the lowest common height only.
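
A tiny illustration of this lowest-common-height rule (hypothetical helper, not from the spec):

def comparison_height(client_height: int, server_height: int) -> int:
    # Diffing must start at the lowest common height: the taller tree
    # exposes its tail ("fake") node at that level instead of its true root.
    return min(client_height, server_height)

assert comparison_height(5, 7) == 5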

Contributor Author

also amended the RFC Step 4 for better clarity.


Upon receiving the missing messages that are present in the Prolly tree, the client node requests the onward messages using the existing Store [method](https://github.com/waku-org/nwaku/blob/master/waku/waku_archive/driver.nim#L36), since these messages are certainly missing from the client's Prolly tree.

# Security/Privacy Considerations


Does Waku have validation around messages? If so, it's probably worth linking it here. This will help resolve the concerns around malicious sync service peers. (I like the word service instead of server 😄)

I think the sync protocol is also related to content topics, and they can be linked too.

Contributor Author

IMO in the Waku client code there is validation of any message when it is put into the Store. The Sync protocol can inherit this advantage.

I think the sync protocol is also related to the content topics, and can be linked too.

You mean Sync based on content_topic as in the libp2p construct? If so, it can be achieved in two ways (see the sketch after this list):

  • Make a Prolly tree of messages filtered on a content topic.
  • Make another Prolly tree with content topics as keys, and then intersect the two results (the outputs of the timestamp and content_topic Prolly trees).
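
A hedged sketch of the second option, assuming hypothetical tree objects with diff and lookup helpers that this spec does not define:

def missing_hashes_for_topic(ts_tree, ct_tree, content_topic, peer):
    # Diff the timestamp-keyed tree against the peer, then keep only the
    # hashes that the content-topic-keyed tree lists under the requested
    # topic (set intersection of the two outputs).
    missing = set(ts_tree.diff(peer))              # message hashes missed
    on_topic = set(ct_tree.lookup(content_topic))  # hashes under the topic
    return missing & on_topic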


The Sync Process works in following steps:

Step 1: A key-value store is created using the Store protocol where the key is the timestamp of a message, and the value is the `message_hash` attribute which uniquely identifies a message in Waku network.
@kaichaosun kaichaosun Jan 21, 2024

I also have concerns about using the timestamp as the key. Some known issues:

  • duplicated timestamps
  • malicious peers arbitrarily setting timestamps
  • an unbalanced tree resulting from discontinuous timestamps

I haven't gone into the details of this spec's implementation; it would be helpful if you have already thought these concerns through.

Contributor Author

Good points, but like you said, they have been thought through in the design discussions. Brief comments:

  • Duplicated timestamps can be a feature: the value part of the key-value store can be a list of the message_hash values that belong to the same timestamp (see the sketch after this list). It will be a rare event considering the nanoseconds unix timestamp used in TWN. The Prolly tree can easily be extended to support this.
  • Malicious peers arbitrarily setting timestamps: this is also handled at the relay/store level, as messages outside a 20 s lag are discarded anyway from a node's perspective; RLN might also help eradicate this issue to a certain degree.
  • Unbalanced tree: the design is not based on pockets of epochs or timestamps where one pocket sees a high volume of messages and others see a low volume. The tree is balanced probabilistically based on the individual keys.
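
A minimal sketch of the first bullet, assuming a hypothetical leaf-level store (none of these names come from the spec):

from collections import defaultdict

# A timestamp key maps to a list of message hashes, so a colliding
# timestamp extends the list instead of overwriting the existing value.
leaves: defaultdict[int, list[bytes]] = defaultdict(list)

def insert(timestamp_ns: int, message_hash: bytes) -> None:
    bucket = leaves[timestamp_ns]
    if message_hash not in bucket:  # keep re-insertion idempotent
        bucket.append(message_hash)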

Contributor

It will be a rare event considering the nanoseconds unix timestamp used in TWN

Note that this should not be a rare event. Even though the timestamp field has nanosecond resolution, the populated timestamps will have coarse resolution to prevent timing attacks (e.g. 500ms boundaries with many messages having matching timestamps).

Maliciously setting timestamps will indeed be handled on the relay layer (validating messages, epoch, etc. via RLN).
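
For illustration, a sketch of such timestamp coarsening (the 500 ms boundary is the example above, not a normative value):

def coarsen(timestamp_ns: int, boundary_ms: int = 500) -> int:
    # Truncate a nanosecond timestamp to a coarse boundary, which is why
    # many messages end up sharing an identical timestamp key.
    step = boundary_ms * 1_000_000  # milliseconds -> nanoseconds
    return (timestamp_ns // step) * step

assert coarsen(1_706_000_000_123_456_789) == 1_706_000_000_000_000_000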


Duplicated timestamps can be a feature where the value part of key-value store value can be a list of message_hash that belongs to same timestamps.

Making a list of message hashes for the same timestamp adds complexity to resolving messages, and doesn't use the great value (support for frequent insertion at arbitrary places, or am I misunderstanding it?) provided by the Prolly tree.
Time-range-based queries are unstable and only convenient for a client that comes online again and wants to catch up fast. It's not the focus of the Sync protocol to help with time-range-based queries, and the tree's integrity should not depend on such queries.

Is there any concern with using the message hash directly as the key instead of the timestamp?

I'm also wondering what the insertion performance is if there are tens of thousands or even millions of messages. It could be mitigated by only constructing the tree over the last few months, though.

@ABresting ABresting requested a review from kaichaosun January 21, 2024 16:01
Contributor

@jm-clius jm-clius left a comment

Thanks for this! I have added a couple of questions and comments as a starting point. The most important is to add some (simple) description of how the tree traversal actually works, which message types are involved, etc.



# Abstract

Sync protocol enables a node to synchronize itself with messages that it may have missed due to any reason, from one of its peers. This is an efficient method for the nodes present in the network to stay synchronized while sharing minimal information with each other.
Contributor

Semantics, but wouldn't it be truer (and make the spec simpler in scope) to explicitly say something like "sync protocol enables a node to synchronize the set of Waku message hashes that it may have missed". This spec can then focus purely on the problem of message hashes. For clarity, a section on how this could be used to synchronize stored messages could be added, but the bulk of the spec could just describe "hash syncing".

Contributor

Also possible to link to https://rfc.vac.dev/spec/14/#deterministic-message-hashing to describe this newly introduced concept of "message hashes".

Contributor

Minor note: we use semantic breaks between each sentence (or full clause). These do not render in the spec, but makes it much easier to add review comments or edit specific lines without having to work with paragraphs.



The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

Consider a request-response protocol with two roles: a client and a server.
Client and Server both MUST be peers supporting the Waku Sync protocol. There is no other eligibility for the node at this point in time to be a client or a server who can synchronize with each other.
Contributor

Wouldn't another requirement be that each node MUST contain an empty or populated prolly tree of message hashes (structure specification TBD) that it wants to synchronize with the peer?

int64 timestamp = 2; // Timestamp or other unique identifier
bytes node_hash = 3; // Hash of the node
int32 level = 4; // Level of the node in the tree
bytes merkel_hash = 5; // Merkel hash of the node
Contributor

Minor note: I think the spelling is "merkle" and not "merkel", unless you're referring to the previous chancellor of Germany. :)

// Relationships (IDs or indices of related nodes)
// Note: Relationships like left, right, up, and down are typically represented using references
// (like indices or IDs) rather than directly embedding the objects to avoid deep nesting.
// In our case we can use the key of the node itself and figure out the level based on the current node level.
Contributor

Would key here be a timestamp? Wouldn't that imply that the references should be int64?




Step 3: A client node sends a request to the server node to get the root node of the Prolly tree. The server node responds with the root node of the Prolly tree.

Step 4: As the Prolly tree is designed to compare diffs at equal heights, if the height of the root of the Prolly tree differs, the client node sends a request to the server node to get the root node of the Prolly tree at the height of the root of the client's Prolly tree. The server node responds with the root node of the Prolly tree at the height of the root of the client's Prolly tree.
Contributor

A couple of questions:

  1. Why doesn't the client ask for the root at the desired height in the first request message already?
  2. Couldn't there be more than one node at the requested height? Which one does the server respond with or am I missing something?

Contributor Author

client ask for the root at the desired height in the first request message already?

that's an excellent suggestion, I see no reason why not!

Couldn't there be more than one node at the requested height? Which one does the server respond with or am I missing something?

The root of a Prolly tree is a tail/fake node. Imagine Tree X contains only 5 levels; then the 5th level has a single node, and below it (level 4) there are intermediate nodes along with a right-most tail node, of course. Now, Tree Y contains 7 levels, so to find the diff, Y's level-5 tail/fake node will be sent as the root, not its intermediate nodes at level 5. With this property, the diff is found.
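
A small sketch of this root-selection rule, assuming hypothetical tail_node and height accessors on the tree:

def root_at_height(tree, height: int):
    # Every level of a Prolly tree ends in a tail ("fake") node, and the
    # true root is just the tail node of the top level. A 7-level tree
    # diffing against a 5-level one therefore serves the tail node of its
    # level 5, never one of its intermediate level-5 nodes.
    return tree.tail_node(min(height, tree.height))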

Contributor

Right. Thanks, I think this makes sense. In other words, this only works if one tree really compares a true root to the "fake" root of another tree? If both, for some reason, compare a fake root (which is in fact a tail at a certain level), they may see a false match.

Perhaps worth explicitly stating in the spec what the definition of "root at desired height" means.

Contributor Author

@ABresting ABresting Jan 26, 2024

I think the fake-root concept explanation should go into the Prolly Tree spec and not in this spec. I will briefly mention something like this:

The root of the Prolly tree is used in this Sync mechanism; the Prolly tree is a Merkle tree with a probabilistic data structure.

As per the Prolly tree design, there is a fake root present at every level/height of the Prolly tree; to Sync properly, both roots must be from the same height/level inside the Prolly tree.


Step 4: As the Prolly tree is designed to compare diffs at equal heights, if the height of the root of the Prolly tree differs, the client node sends a request to the server node to get the root node of the Prolly tree at the height of the root of the client's Prolly tree. The server node responds with the root node of the Prolly tree at the height of the root of the client's Prolly tree.

Step 5: The client node then traverses the Prolly tree from the root node to the leaf nodes and finds the missing nodes at every level of the tree, starting from the top. The client requests the missing intermediate Prolly tree nodes from the server. The server responds with the requested nodes.
Contributor

Is it possible to expand this step and explain the messages, fields, and logic involved:
e.g. if the client detects a difference in root, it requests the children of that node using message_x. For each child node that differs, it continues using message_x to retrieve child nodes until...


@ABresting ABresting changed the title 75/WAKU2-MESSAGE-RELIABILITY 75/WAKU2-SYNC Jan 23, 2024
@ABresting
Contributor Author

ABresting commented Jan 23, 2024

I think it's possible to save bandwidth by having the first message contain the requester's own root node. This means that initially the protocol doesn't quite behave as client-server in the first message. Each node can use this to periodically request the roots of peers to determine if it has any gaps. The responder can use this first message to also determine if it has any gaps. Both sides then have the option to initiate the sync process, but this time as client of the interaction until it has retrieved the hashes of missed messages.

I think it is a valid point. It can be step number 1, where two nodes exchange their roots along with the height of the root. Based on this they decide whether they want to Sync or not.

One thing to note: if two Prolly trees are different at the same height, then they must Sync. In this case the sync mechanism for each node works exclusively, and there is little scope for piggybacking at the network-packet level, so they follow separate flows anyway. Which is fine. Included this in the new PR. @jm-clius

@ABresting ABresting requested a review from jm-clius January 23, 2024 18:20
@kaichaosun kaichaosun mentioned this pull request Jan 26, 2024
3 tasks
Contributor

@jm-clius jm-clius left a comment

Thanks for the additions! I think it will be helpful to break Step 5 down into substeps, e.g.:

If the requester detects a mismatch in merkle hash, it requests the children of the mismatching node by sending … with ….
It does so iteratively for every mismatch at every level.
It continues this process until it receives … with ….

// Request message for GetRootAtHeight
message ExchangeRootAtHeightRequest {
Node root = 1; // The root node of the Prolly tree
int32 height = 2; // The desired height
Contributor

Do we see any reason for this height to be different to the level inside the Node field? If not, perhaps the protocol can stipulate that the desired height is equal to the level of the root node in the request?

Contributor Author

Good suggestion! Let me make this change.

If the height at server is less than the height requested by the client, then the server responds with the root node of the Prolly tree at the maximum height present at the server.

Step 5: The client node starts traversing the Prolly tree from the root node to the leaf nodes and finds the missing nodes at every level of the tree starting from the top.
If the client detects the difference of merkle hash at a key of a certain level, then it requests the server node to get the intermediate child node belonging to that key.
Contributor

Still not sure I'm clear on what messages are involved and how those fields are populated. What is the first message that a requesting node sends to discover the children of the mismatching node?

Contributor Author

@ABresting ABresting Jan 26, 2024

As suggested I have amended the Steps a bit:

Step 5: The client node starts traversing the Prolly tree from the root node to the leaf nodes and finds the missing nodes, or nodes with a different merkle hash, at every level of the tree, starting from the top.
5a) If the client detects a missing key or a difference of merkle hash at a key of a certain level, then it requests the intermediate child node belonging to that key from the server node.

5b) And the keys with no difference in merkle hash are skipped.

This step continues iteratively until the client reaches the leaf nodes of the tree.
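
A rough sketch of one iteration of this traversal, with hypothetical node objects and a request_children callback standing in for the actual wire messages:

def sync_level(client_nodes, server_nodes, request_children):
    # For each server node at this level, request its children when the key
    # is missing locally or the merkle hash differs (5a); keys whose merkle
    # hashes match are skipped (5b).
    local = {n.key: n for n in client_nodes}
    next_level = []
    for node in server_nodes:
        mine = local.get(node.key)
        if mine is None or mine.merkle_hash != node.merkle_hash:
            next_level.extend(request_children(node.key, node.level))
    return next_level  # repeat per level until the leaves are reached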

@ABresting ABresting requested a review from jm-clius January 26, 2024 18:50
int64 timestamp = 2; // Timestamp or other unique identifier
bytes node_hash = 3; // Hash of the node
int32 level = 4; // Level of the node in the tree
bytes merkle_hash = 5; // Merkle hash of the node


Can you explicitly define the Merkle hash in this spec? It's still unclear to me after searching a few places.

Contributor Author

added a brief explanation.


Client sends its own local root node of the Prolly tree.
With the client's root node, the server can calculate the height inside the Prolly tree from where the client wants to Sync.

Step 4: The server node responds with the root node of the Prolly tree present at the same height as client's root.


Is this called the root node in the Prolly tree, or do you mean the tail node?

Contributor Author

basically, every level of the Prolly tree contains a tail/fake node; at the top level there is only one node left, i.e. the tail/fake node, which we call the Root node. These details should go inside the Prolly Tree spec.

Contributor Author

Is there any concern with using the message hash directly as the key instead of the timestamp?

mostly has to do with searching and traversing the Prolly tree, so having a timestamp key provides more structure to the data at the Prolly tree level.

I'm also wondering what the insertion performance is if there are tens of thousands or even millions of messages. It could be mitigated by only constructing the tree over the last few months, though.

insertions, deletions, updates, etc. are log-based complexities, which is good enough considering the benefits. The width of the tree, i.e. how much data (months, days, dates) it covers, is still an open question, but afaik full nodes can choose more width, since search with lower-width trees works just fine given the Prolly tree design.

ABresting and others added 3 commits January 29, 2024 16:07
@ABresting ABresting requested a review from kaichaosun January 29, 2024 11:25
@ABresting ABresting mentioned this pull request Jan 31, 2024
@jimstir jimstir mentioned this pull request Mar 7, 2024
@jimstir
Contributor

jimstir commented Mar 7, 2024

Continue discussion at waku-org/specs#7. Some Waku RFCs are being moved to the waku/specs repo as there will be a new RFC process. The raw status and assigned slug number are removed, as new waku/specs do not need to follow 1/COSS.
