I am interested in how pipelined RDMA operations are implemented. After reading the source code, I concluded that doorbell batching is the mechanism used. I would like to confirm this, because I have been told that doorbell batching cannot reduce network roundtrips (see the doorbell batching introduction).
"Pipelining" on the network is unrelated to doorbell batching. (Doorbell batching only affects what happens on the PCIe bus. Also, note that we disabled it in our experiments, as it increased latency and did not significantly change throughput.)
When you send two RDMA requests on the same reliable connection and do not wait to poll the completion of the first before posting the second, the second request can be executed "in parallel" with the first, and the ordering guarantees depend on the type of operation. For example, a CAS operation is guaranteed to happen after any previous write sent on the same connection. This ordering is enforced by the remote RDMA NIC and thus does not incur any additional network roundtrips. The operations are thus effectively pipelined.
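To make the difference concrete, here is a toy Python model. It is only a sketch under stated assumptions: it is not the verbs API and not this project's code, and every name in it (`ToyRemote`, `ToyReliableConnection`, `post`, `poll_all`, `write_then_cas`) is hypothetical. It assumes the remote side executes requests from one connection in posting order, and it counts one round trip each time the caller waits for outstanding completions.

```python
class ToyRemote:
    """Stands in for the remote NIC and the memory it serves (toy model)."""

    def __init__(self):
        self.words = {}

    def execute(self, op):
        kind = op[0]
        if kind == "write":
            _, addr, value = op
            self.words[addr] = value
            return None
        if kind == "cas":
            _, addr, expected, new = op
            old = self.words.get(addr, 0)
            if old == expected:
                self.words[addr] = new
            return old  # a compare-and-swap returns the previous value
        raise ValueError(f"unknown operation: {kind}")


class ToyReliableConnection:
    """Toy reliable connection: requests are executed in posting order."""

    def __init__(self, remote):
        self.remote = remote
        self.in_flight = []
        self.round_trips = 0

    def post(self, op):
        # Queue a request without waiting for earlier completions.
        self.in_flight.append(op)

    def poll_all(self):
        # Wait for everything outstanding. Assumption of this model:
        # requests posted back to back overlap on the wire, so the caller
        # waits for a single round trip no matter how many are in flight.
        if not self.in_flight:
            return []
        results = [self.remote.execute(op) for op in self.in_flight]
        self.in_flight.clear()
        self.round_trips += 1
        return results


def write_then_cas(pipelined):
    """Write a payload, then publish it with a CAS on a flag word."""
    remote = ToyRemote()
    conn = ToyReliableConnection(remote)
    conn.post(("write", "payload", 42))
    if not pipelined:
        conn.poll_all()  # blocking style: wait for the write first
    conn.post(("cas", "flag", 0, 1))
    cas_old_value = conn.poll_all()[-1]
    return remote.words, cas_old_value, conn.round_trips


if __name__ == "__main__":
    print("blocking :", write_then_cas(pipelined=False))
    print("pipelined:", write_then_cas(pipelined=True))
```

In this model both variants leave the remote memory in the same state, because the CAS is applied after the write either way. The only difference is that the blocking variant waits for two round trips and the pipelined variant waits for one.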
So, TL;DR: we don't do anything fancy, we just send all the requests at once and rely on the ordering guarantees of RDMA's reliable connections.
Hope this clarifies your understanding.