Implementing SurgeProtector for Flow Reassembly #16
Summary
This changelist implements SurgeProtector in the context of Pigasus's FPGA-based flow reassembler. SurgeProtector is an adversarial scheduling framework that protects vulnerable NFs against Algorithmic Complexity Attacks (ACAs), a potent class of Denial-of-Service (DoS) attacks. A brief overview of ACAs and the framework itself is provided in the README of the SurgeProtector source directory.
The most significant changes to the Pigasus datapath (and, more generally, the repo) are as follows:
Reassembly scheduler: Previously, all out-of-order packets were handed off directly to the reassembler by the flow table in the order that they arrived (i.e., packets were served FCFS). With this change, we interpose a scheduler between the flow table and the reassembler, which decides the order in which packets will be served. The high-level operation is as follows. First, the scheduler continually receives out-of-order packets from the flow table and buffers them in per-flow queues (implemented using singly linked lists). In parallel, the scheduler chooses an active, out-of-order flow based on a user-specified scheduling policy (configured at compile time), and dequeues the packet at the head of the corresponding flow queue; this packet is then sent to flow reassembly to be served. When the reassembler signals completion, the scheduler propagates any updates regarding flow state to the flow table (more details on this below), and the process repeats.
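The enqueue/select/dequeue loop described above can be illustrated with a small behavioral model. This is a software sketch only, not the SystemVerilog in this change: the `Packet` fields, the `ReassemblyScheduler` class, the `drain` helper, and the `job_size` callback are hypothetical names, and the WSJF key used here (estimated work per payload byte, smallest first) is an assumed form of the policy. The completion signal from the reassembler is not modeled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    flow_id: int
    psn: int      # packet sequence number
    length: int   # payload size in bytes


class ReassemblyScheduler:
    """Behavioral model: per-flow FIFO queues plus a flow-selection policy."""

    def __init__(self, policy="FCFS"):
        if policy not in ("FCFS", "WSJF"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.queues = {}     # flow_id -> deque of (arrival_no, Packet)
        self.arrivals = 0    # global arrival counter, used for FCFS and tie-breaks

    def enqueue(self, pkt):
        self.queues.setdefault(pkt.flow_id, deque()).append((self.arrivals, pkt))
        self.arrivals += 1

    def dequeue(self, job_size=lambda flow_id: 1):
        """Pick an active flow per the policy and pop the packet at its queue head."""
        if not self.queues:
            return None

        def key(flow_id):
            arrival_no, head = self.queues[flow_id][0]
            if self.policy == "FCFS":
                return (arrival_no, arrival_no)
            # Assumed WSJF key: estimated work per payload byte (smaller first).
            return (job_size(flow_id) / max(head.length, 1), arrival_no)

        flow_id = min(self.queues, key=key)
        _, pkt = self.queues[flow_id].popleft()
        if not self.queues[flow_id]:
            del self.queues[flow_id]   # flow is no longer active
        return pkt


def drain(policy, packets, job_size=lambda flow_id: 1):
    """Enqueue all packets, then return the PSNs in the order they are served."""
    sched = ReassemblyScheduler(policy)
    for pkt in packets:
        sched.enqueue(pkt)
    order = []
    while (pkt := sched.dequeue(job_size)) is not None:
        order.append(pkt.psn)
    return order
```

Because each per-flow queue is FIFO, choosing the flow whose head packet arrived earliest reproduces global FCFS order. Under the assumed WSJF key, a flow whose packets are cheap to reassemble relative to their size is served ahead of an expensive one regardless of arrival order.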
The scheduler currently supports two scheduling policies: FCFS (the policy implicitly used in Pigasus today), and WSJF (the policy underlying SurgeProtector). The policy can be configured by setting the SCHEDULER_REASSEMBLY_POLICY parameter in struct_s.sv (defaults to FCFS).
Decoupled in-order and out-of-order flow state. Previously, the flow table was responsible for managing state for both in-order and out-of-order (OOO) flows. This change offloads the responsibility of managing out-of-order flow state (i.e., next expected PSN, pointer to the head of the OOO linked list, etc.) to a separate data structure called the OOO flow context table (OOO FCT), implemented in the scheduler. Every out-of-order flow is assigned an OOO flow ID (stored as part of the flow context in the primary flow table), which in turn serves as an index into the OOO flow context table. The updated architecture has two advantages: (1) We can choose to support fewer OOO flows than the total flow count (saving BRAM in the process, because not every flow needs to be associated with an OOO flow context); and, more importantly, (2) It reduces contention for the primary flow table. Since OOO flow state is decoupled from the global flow context, the slow path needs to update the primary flow table far less frequently; in particular, reassembly updates are propagated to the primary flow table only under two conditions: when the next expected PSN changes (e.g., the flow becomes in-order or some packets can be released), or when the flow is dropped (more details on this below).
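The indirection between the primary flow table and the OOO flow context table can be sketched in software as follows. This is a minimal model under stated assumptions, not the hardware implementation: the class name `OOOFlowContextTable`, its `allocate`/`release` methods, and the dictionary field names are all hypothetical. What the hardware does when no OOO context is free is not modeled; the sketch simply returns `None`.

```python
class OOOFlowContextTable:
    """Fixed-capacity table of out-of-order flow contexts, indexed by OOO flow ID."""

    def __init__(self, max_ooo_flows):
        self.entries = [None] * max_ooo_flows
        # Free IDs kept as a stack; reversed so that pop() hands out ID 0 first.
        self.free_ids = list(range(max_ooo_flows - 1, -1, -1))

    def allocate(self, next_expected_psn):
        """Reserve a context for a flow that just went out of order."""
        if not self.free_ids:
            return None   # no free OOO context (handling is outside this sketch)
        ooo_id = self.free_ids.pop()
        self.entries[ooo_id] = {"next_psn": next_expected_psn, "list_head": None}
        return ooo_id

    def release(self, ooo_id):
        """Return a context to the free pool."""
        self.entries[ooo_id] = None
        self.free_ids.append(ooo_id)


# Primary flow table: every flow has an entry, but only OOO flows carry an OOO ID.
primary = {flow: {"next_psn": 0, "ooo_id": None} for flow in ("A", "B", "C")}
fct = OOOFlowContextTable(max_ooo_flows=2)   # fewer OOO contexts than flows

primary["A"]["ooo_id"] = fct.allocate(next_expected_psn=100)
primary["B"]["ooo_id"] = fct.allocate(next_expected_psn=200)
primary["C"]["ooo_id"] = fct.allocate(next_expected_psn=300)  # table full -> None

# Flow A becomes in-order again: propagate the PSN once, then free its OOO context.
primary["A"]["next_psn"] = 150
fct.release(primary["A"]["ooo_id"])
primary["A"]["ooo_id"] = None
```

The primary table is touched only when an OOO ID is assigned, when the next expected PSN changes, or when the flow is dropped; all intermediate reassembly state lives in the smaller table.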
Flow Garbage Collection (GC). Previously, Pigasus's reassembler would permanently stall once all its linked-list entries were occupied, putting the slow path in an irrecoverable state. This change implements flow garbage collection in both the scheduler and the reassembler, allowing execution to continue despite memory pressure that would otherwise overwhelm the slow path. GC is orchestrated by the scheduler, which initiates the process when the available memory in either the scheduler or the reassembler falls below some user-specified watermark levels (set here). In order to do this, the scheduler fetches the lowest-priority element among the set of active, out-of-order flows (once again determined by the scheduling policy), then garbage-collects the corresponding flow's entries in (a) the scheduler's flow queue, and (b) the reassembler's out-of-order linked list. Finally, it deallocates the flow context in the primary flow table (i.e. drops the flow).
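The GC trigger and victim selection described above might look roughly like this in a software model. Everything here is illustrative: the function names, the dictionary-based queues and lists, and the convention that a larger policy key means lower priority are assumptions of the sketch, not details taken from the RTL.

```python
def needs_gc(free_sched, free_reasm, sched_watermark, reasm_watermark):
    """GC starts when free memory in either component falls below its watermark."""
    return free_sched < sched_watermark or free_reasm < reasm_watermark


def collect_lowest_priority(priority, sched_queues, reasm_lists, flow_table):
    """Drop the lowest-priority active OOO flow and return what was reclaimed.

    priority maps flow_id -> policy key; a larger key means lower priority
    (an assumption of this sketch).
    """
    victim = max(priority, key=priority.get)
    freed_sched = len(sched_queues.pop(victim, []))   # (a) scheduler flow queue
    freed_reasm = len(reasm_lists.pop(victim, []))    # (b) reassembler OOO list
    flow_table.pop(victim, None)                      # drop the flow context
    del priority[victim]
    return victim, freed_sched, freed_reasm
```

A caller would invoke `collect_lowest_priority` repeatedly while `needs_gc` holds, so that each pass reclaims the entries of exactly one flow.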
Test harness. This change also introduces a bash-based test harness that may be used to implement unit- or regression test suites for various Pigasus components. Sources can be found in the testbench directory, and are named as per the modules they seek to test. The testcases themselves are written in SystemVerilog (e.g., here). For instance, here is the output produced by the run_test.sh script corresponding to the SurgeProtector's pipelined_heap module:
Also fixes some corner-case bugs (e.g., fast/slow path data races) that show up when stress-testing the reassembler.
Resources and Timing
The following table depicts resource utilization before this change (Baseline Pigasus), after this change but with SurgeProtector disabled (Scheduler+FCFS), and, finally, with SurgeProtector enabled (Scheduler+SurgeProtector). DSP and eSRAM usage does not change (0 in all cases), so is not reported for the sake of brevity. Compared to the baseline, the delta in resources with the new scheduler and with SurgeProtector enabled is as follows: +2.3% ALMs, +1.3% registers, +1.9% BRAM.
Similarly, the following table depicts the timing achieved by all three designs (based on a single Quartus run). None of the designs closed timing, but the negative slack is small. Also, note that the worst-case path in the WSJF case occurs in the string-matcher instance (unrelated to this change).
Testing
Verified correctness for several configurations of packet buffer sizes (PKT_NUM), maximum number of out-of-order flows (MAX_NUM_OOO_FLOWS), and input traces in simulation. Running the existing testbench with the provided input trace produces the expected result. The newly-added tests (exercising individual SurgeProtector components) all pass, including a new regression test exercising the reassembler on a trace containing 10% adversarial traffic. The existing version of Pigasus produces the following error on this adversarial input (due to the memory exhaustion problem described in the Flow Garbage Collection section above):
After this change, the test-case completes for both FCFS and WSJF (albeit with very different throughputs). The figure below depicts the goodput achieved by the two scheduling policies for different attack bandwidths (note that SurgeProtector corresponds to WSJF).
Note: End-to-end hardware tests are still pending. PLEASE DO NOT MERGE THIS CHANGE YET.