docs: Add test reporting doc to benchmarks dir #3238
base: master
This is a draft to get input on the formatting and content.
Thanks for this! I've added a couple of comments and suggestions on what other sections to add.
2362ed1 to e01a61e
@fryorcraken I wonder whether the TL;DR section is too wordy. Isn't the requirement for it to be something very short that can be read quickly and easily remembered, such as:
and then if the reader wants more info (such as the network size and message rate for the simulations where the above values were obtained), they can look at the
A few more comments. :)
> ## TL;DR
>
> - libp2p bandwidth usage fluctuates between 5 and 15 KB/s for topologies of up to 1000 nodes, with average bandwidth usage at **10 KB/s**.
For the bandwidth numbers to make sense, we need to add the message rate and size. Perhaps just mentioning the average and max bandwidth is enough?
This is expected for Relay networks and the slight fluctuation could be due to simulation artifacts or chance differences in routing or connectivity between test runs.
> - The average time for a message to propagate to 100% of nodes in topologies of up to 2000 Relay nodes is **0.4s**.
> - The average per-node bandwidth usage of the discv5 protocol is **8 KB/s** for incoming traffic and **7.4 KB/s** for outgoing traffic.
This is for a network with 100 continuously online nodes, sending 1KB messages at 1s intervals.
Perhaps the only relevant detail for discv5 here is the number of nodes, not the message size or rate. However, do we know whether discv5 bandwidth usage fluctuates much with the number of nodes? If it does not, we can leave out the number of nodes too.
According to the results here: https://www.notion.so/Measure-DiscV5-bandwidth-with-Waku-discovery-1698f96fb65c80659fa1fbfdac49b1ef?pvs=4#16a8f96fb65c8060ac93dd35e2b9c464
there is some fluctuation when comparing the bandwidth usage for varying total nodes in the network.
The data from the referenced test includes the start of the simulation, and we have not yet determined how much that affects the bandwidth usage results.
Because of this I think it's best to keep the total number of nodes in, until we get more information, if you agree?
Yes, that would make sense. I would still suggest leaving out the message size/rate stats as it describes a domain separate from discv5.
| [Relay](https://www.notion.so/Waku-regression-testing-v0-34-1618f96fb65c803bb7bad6ecd6bafff9) (1000 nodes) | 0.05 | 1.6 |
| [Mixed](https://www.notion.so/Mixed-environment-analysis-1688f96fb65c809eb235c59b97d6e15b) (210 nodes) | 0.0125 | 0.007 |
| [Non-persistent Relay](https://www.notion.so/High-Churn-Relay-Store-Reliability-16c8f96fb65c8008bacaf5e86881160c) (510 nodes) | 0.0125 | 0.25 |
I would just add a very brief description of what "Relay", "Mixed" and "Non-persistent Relay" mean, so that a reader doesn't have to click the links to get an intuitive understanding.
## Testing
### DST
The VAC DST team performs regression testing on all new **nwaku** releases, comparing performance with previous versions. They simulate large Waku networks with a variety of network and protocol configurations that are representative of real-world usage.
The VAC DST team performs regression testing on all new **nwaku** releases, comparing performance with previous versions.
They simulate large Waku networks with a variety of network and protocol configurations that are representative of real-world usage.
Semantic breaks, here and further down. :)
> ## TL;DR
>
> - libp2p bandwidth usage fluctuates between 5 and 15 KB/s for topologies of up to 1000 nodes, with average bandwidth usage at **10 KB/s**.
Do we know the configured degree `D` here - I think this is just the default of `6`? Perhaps not worth mentioning if this is a "well-known fact" about Waku.
So, on second thought I think we can simplify this TL;DR, focus on the critical conclusion and use fewer domain terms. For example, our first sentence suggests that we have concluded an average of 10 KB/s only up to 1000 nodes, but in the next sentence we say roughly the same but this time for up to 2000 nodes. I'd suggest something like:
Waku bandwidth (minus traffic related to discv5 Discovery) averages ~10KB/s for a message injection rate of X KB/s for any topology size* (*confirmed up to 2000 nodes).
I think X is 1KB/s (i.e. 1KB message every 1 second)?
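To make the X in the suggested sentence concrete, here is a minimal sketch of the arithmetic, assuming X simply means message size divided by send interval. The function and variable names are hypothetical and not part of nwaku or the DST tooling, and the thread does not settle whether the rate is per node or network-wide.

```python
def injection_rate_kb_per_s(message_size_kb: float, interval_s: float) -> float:
    """Payload injected per second (KB/s) when one message of
    `message_size_kb` is sent every `interval_s` seconds."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return message_size_kb / interval_s


# Values quoted in this thread: 1 KB messages at 1 s intervals.
x = injection_rate_kb_per_s(message_size_kb=1.0, interval_s=1.0)

# Average bandwidth from the draft TL;DR (10 KB/s) relative to X.
# This ratio is only an illustration of how the two numbers relate.
bandwidth_to_injection_ratio = 10.0 / x
```

Stating both numbers side by side is what lets a reader judge whether ~10 KB/s is high or low for the load applied.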
Description
This PR is a first pass at adding an nwaku test summary page. It aims to give anyone implementing the Waku protocol with nwaku a quick reference for expected performance, as well as quick access to test reports.
Changes
docs/benchmarks/test-results-summary.md
How to test