-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add rolling bloom filter, reliability utils and protobuf #4
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for it! Some comments so far :) Ping me if any comment is unclear or you need me to review again
src/protobuf.nim
Outdated
var intArray = newSeq[int](bytes.len div sizeof(int)) | ||
for i in 0 ..< intArray.len: | ||
let start = i * sizeof(int) | ||
copyMem(addr intArray[i], unsafeAddr bytes[start], sizeof(int)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks beautiful but a couple of doubts:
- Do we have a Nim module that already does that?
- Should we care about the desired endianness?
wdyt @arnetheduck ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out.
- For this use-case and from what I saw, this seems to be the efficient built-in option, but curious as well to know if there's anything else
- Updated to handle endianness. Does it look good now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good! I've requested a change wrt the rolling bloom filter design being based on duration - I think we'll get more predictable behaviour if we simply design it based on a maximum capacity with some variability.
src/rolling_bloom_filter.nim
Outdated
for msg in rbf.messages: | ||
if msg.timestamp > cutoff: | ||
newMessages.add(msg) | ||
newFilter.insert(msg.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the number of messages within the cutoff window is much higher than our capacity
? From glancing at https://github.com/waku-org/nim-sds/blob/master/src/bloom.nim I don't think there will be any errors, but our false positive rate design will be way off.
To me it seems unnecessary to bring a time window consideration into the design here at all - we want SDS to work the same for a group communication at any rate. Perhaps therefore the rolling filter can operate on a "min" and "max" number of entries around the configured capacity
? E.g. we can allow up to capacity + 20%
of messages before triggering a clean, after which we add back the last capacity - 20%
of messages? Something like that, but in any case avoiding the highly variable time element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you said. The time-based approach might introduce unnecessary variability. I've updated the implementation to use capacity-based rolling filter with configurable thresholds for min and max (currently set to 20%)
Side note: I think it's a good idea to mention the deliverable that this forms part of in the PR description. :) waku-org/pm#194 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Minor comments below, but no need to rerequest approval once addressed. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for it! Some more comments :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for it! Super insightful changes 🥳
Just adding some more nitpick comments
Something important is to start using nph . Ping me anytime and I can explain how we do in nwaku.
Besides, I think is interesting to start separating modules and avoid having too generic modules such as reliability_utils.nim
, and also I encourage to use private attributes as much as possible.
Thanks again for the outstanding and very enriching work!
src/message.nim
Outdated
UnacknowledgedMessage* = object | ||
message*: Message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other possible option
UnacknowledgedMessage* = object | |
message*: Message | |
UnacknowledgedMessage* = object of Message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the composition is suitable here as UnacknowledgedMessage is not a type of Message, but rather contains a Message plus metadata for tracking acknowledgments ?
lamportTimestamp*: int64 | ||
causalHistory*: seq[MessageID] | ||
channelId*: ChannelID | ||
content*: seq[byte] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the expected content and how are we going to link the WakuMessage
with that Message
type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The content field in Message represents the application-level payload as raw bytes. This maintains separation between the various layers. Here is a line from the API spec:
"Message Wrapping: We’ll extend our message structure to include reliability information such as Lamport timestamps and causal history. This wrapping occurs before encryption, effectively extending the application protocol."
This PR adds the rolling bloom filter to the SDS API, that is built upon bloom.nim. This PR also adds protobuf for message, bloom filter serialization and deserialization, definitions & utility functions for the ReliabilityManager. The core functions and unit tests for ReliabilityManager would be in a subsequent PR.
Would greatly appreciate nim related review and comments from the nwaku team !!
This PR is a part of the work towards nim-sds API deliverable