Re-word introduction for non-blocking communication episode (#51)
* improve the flow and structure of intro, wrt changes in communicate-modes branch

* move naming conventions further up

* spelling and grammar
Edward-RSE authored Jul 26, 2024
1 parent 287807c commit dfab7cc
Showing 1 changed file with 45 additions and 43 deletions.
88 changes: 45 additions & 43 deletions _episodes/06-non-blocking-communication.md
@@ -18,37 +18,54 @@ keypoints:
---

In the previous episodes, we learnt how to send messages between two ranks or collectively to multiple ranks. In both
cases, we used blocking communication functions which meant our program wouldn't progress until the communication had
completed. It takes time and computing power to transfer data into buffers, to send that data around (over the
network) and to receive the data into another rank. But for the most part, the CPU isn't actually doing much at all
during communication, when it could still be number crunching.

## Why bother with non-blocking communication?

Non-blocking communication is communication which happens in the background, so we don't have to let any CPU cycles go
to waste! If MPI is dealing with the data transfer in the background, we can continue to use the CPU in the foreground
and keep working on other tasks whilst the communication completes. By *overlapping* computation with communication, we
hide the latency/overhead of communication. This is critical for many HPC applications, especially those using lots of
CPUs because, as the number of CPUs increases, so does the overhead of communicating with them all. If we use blocking
synchronous sends, the time spent communicating data may become longer than the time spent creating the data to send!
All non-blocking communication is asynchronous, even when using synchronous sends, because the communication happens in
the background, even though it cannot complete until the data has been received.

> ## So, how do I use non-blocking communication?
>
> Just as with buffered, synchronous, ready and standard sends, we have to choose explicitly whether communication is
> blocking or non-blocking. For almost every blocking function, there is a non-blocking equivalent, which has the same
> name as its blocking counterpart but prefixed with "I". The "I" stands for "immediate", indicating that the function
> returns immediately rather than blocking the program. The table below shows some examples of blocking functions and
> their non-blocking counterparts.
>
> | Blocking | Non-blocking |
> | --------------- | ---------------- |
> | `MPI_Bsend()` | `MPI_Ibsend()` |
> | `MPI_Barrier()` | `MPI_Ibarrier()` |
> | `MPI_Reduce()` | `MPI_Ireduce()` |
>
> But this isn't the complete picture. As we'll see later, we need to do some additional bookkeeping to be able to use
> non-blocking communication.
>
{: .callout}
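
To give a first taste of that bookkeeping, here is a minimal sketch of a non-blocking reduction (the variable names are
illustrative, not from the lesson): the call returns immediately and hands us back a request, which we have to wait on
before the result is safe to use.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    double local_sum = my_rank + 1.0, total_sum = 0.0;
    MPI_Request request;

    /* Returns immediately; the reduction carries on in the background */
    MPI_Ireduce(&local_sum, &total_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD, &request);

    /* ... free to do other work here, as long as it doesn't touch total_sum ... */

    /* Synchronise: total_sum is only safe to use after the wait completes */
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    if (my_rank == 0) {
        printf("Total sum = %f\n", total_sum);
    }

    MPI_Finalize();
    return 0;
}
```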

By effectively using non-blocking communication, we can develop applications which scale significantly better during
intensive communication. However, this comes at the cost of increased conceptual and code complexity. Since
non-blocking communication doesn't keep control until the communication finishes, we don't actually know when a
communication has finished unless we check; this is usually referred to as synchronisation, as we have to keep ranks in
sync to ensure they have the correct data. So whilst our program continues to do other work, it also has to
periodically check whether the communication has finished, to ensure ranks stay synchronised. If we check too often, or
don't have enough tasks to "fill in the gaps", then there is no advantage to using non-blocking communication and we
may simply replace communication overheads with time spent keeping ranks in sync! It is not always clear cut or
predictable whether non-blocking communication will improve performance. For example, if one rank depends on the data
of another, and there are no tasks for it to do whilst it waits, that rank will have to wait around until the data is
ready, as illustrated in the diagram below. This essentially turns the non-blocking communication into a blocking one.
Therefore, unless our code is structured to take advantage of being able to overlap communication with computation,
non-blocking communication adds complexity for no gain.

<img src="fig/non-blocking-wait-data.png" alt="Non-blocking communication with data dependency" height="250"/>
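
To make that periodic checking concrete, here is a hedged sketch using `MPI_Test()`, which asks MPI whether a
communication has finished without blocking; the counting loop is illustrative, standing in for useful work done
between checks. It needs to be run with at least two ranks.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank == 0) {
        int message = 0;
        MPI_Request request;
        MPI_Irecv(&message, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);

        /* Periodically check whether the message has arrived, doing
           (pretend) work in between checks */
        int finished = 0, checks = 0;
        while (!finished) {
            MPI_Test(&request, &finished, MPI_STATUS_IGNORE);
            checks++; /* stand-in for useful computation */
        }
        printf("Received %d after %d checks\n", message, checks);
    } else if (my_rank == 1) {
        int message = 42;
        MPI_Send(&message, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```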

@@ -99,21 +116,6 @@

The arguments are identical to `MPI_Send()`, other than the addition of the `*request` argument. This is known
as a *handle* (because it "handles" a communication request) which is used to track the progress of a (non-blocking)
communication.
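
For example, a minimal sketch of posting a non-blocking send and keeping hold of the request handle (the buffer
contents, destination rank and tag here are illustrative, and MPI is assumed to be already initialised with a rank 1 to
send to):

```c
int data[4] = {1, 2, 3, 4};
MPI_Request request;

/* Returns straight away; MPI sends the data in the background */
MPI_Isend(data, 4, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);

/* data must not be modified or re-used until the request has completed,
   e.g. by waiting on it with MPI_Wait(&request, MPI_STATUS_IGNORE) */
```

Holding on to the request is what lets us come back later and check, or wait, for completion.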


When we use non-blocking communication, we have to follow it up with `MPI_Wait()` to synchronise the
program and make sure `*buf` is ready to be re-used. This is incredibly important to do. Suppose we are sending an array
of integers,