Groupe d’études et de recherche en analyse des décisions

G-2016-72

Fast failure detection and recovery in SDN with stateful data plane


When a node or link fails in Software Defined Networking (SDN), the network's ability to establish an alternative path depends on controller reachability and on the round-trip times (RTTs) between the controller and the switches involved. Moreover, current SDN data plane abstractions for failure detection (e.g., OpenFlow "Fast-failover") do not allow programmers to tune the switches' detection mechanism, leaving SDN operators to rely on proprietary management interfaces (when available) to achieve guaranteed detection and recovery delays. We propose SPIDER, an OpenFlow-like pipeline design that provides i) a detection mechanism based on switches' periodic link probing and ii) fast reroute of traffic flows even in the case of distant failures, regardless of controller availability. SPIDER is based on stateful data plane abstractions such as OpenState or P4, and it offers guaranteed short (a few milliseconds or less) failure detection and recovery delays, with a configurable trade-off between overhead and failover responsiveness. We present here the SPIDER pipeline design, its behavioral model, and an analysis of its impact on flow table memory. We also implemented and experimentally validated SPIDER using OpenState (an OpenFlow 1.3 extension for stateful packet processing) and P4, and we report numerical results on its performance in terms of recovery latency and packet loss.
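
To make the idea of probe-based detection and local failover concrete, the following is a minimal Python sketch. It is not SPIDER's pipeline or flow-table design: the names `PortMonitor`, `select_port`, and `probe_overhead_bps`, as well as the timeout rule and the parameter values, are assumptions introduced purely for illustration. It models a port as down once nothing has been heard on it for longer than a timeout, picks a backup port in that case without consulting a controller, and shows how a shorter probe interval raises probing overhead.

```python
from dataclasses import dataclass


@dataclass
class PortMonitor:
    """Liveness state for one port (hypothetical model, not SPIDER's tables)."""
    timeout: float            # assumed: silence (seconds) after which the port is down
    last_heard: float = 0.0   # time of the last packet or probe reply seen

    def on_receive(self, now: float) -> None:
        # Any incoming packet or probe reply refreshes the liveness timestamp.
        self.last_heard = now

    def is_up(self, now: float) -> bool:
        # The port counts as up while the silence is within the timeout.
        return (now - self.last_heard) <= self.timeout


def select_port(monitor: PortMonitor, now: float,
                primary_port: int, backup_port: int) -> int:
    """Choose the output port locally, with no controller round trip."""
    return primary_port if monitor.is_up(now) else backup_port


def probe_overhead_bps(probe_bytes: int, probe_interval: float) -> float:
    """Bandwidth consumed by sending one probe every probe_interval seconds."""
    return probe_bytes * 8 / probe_interval


if __name__ == "__main__":
    mon = PortMonitor(timeout=0.003)      # assumed 3 ms detection timeout
    mon.on_receive(now=1.000)
    print(select_port(mon, 1.002, primary_port=1, backup_port=2))
    print(select_port(mon, 1.010, primary_port=1, backup_port=2))
    print(probe_overhead_bps(probe_bytes=64, probe_interval=0.001))
```

In this sketch a smaller timeout or probe interval shortens the detection delay at the cost of more probe traffic, which is the kind of overhead versus responsiveness trade-off the abstract refers to.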

21 pages