Traffic in data centers is growing exponentially, and this growth is not expected to stop any time soon, driving rapid advances in the networking field. Network interfaces today support throughputs of 40 Gbps and higher. However, high interface throughput does not by itself guarantee high packet processing speed, which remains limited by the overheads imposed by the architecture of the operating system's network stack: that stack was conceived for general-purpose communications rather than for high-speed networking applications. There is therefore a great need to speed up the forwarding engine, the most important part of a high-speed router, and many software-based and hardware-based solutions have recently emerged with the goal of increasing packet processing speeds.
In this thesis, we investigate various approaches that strive to improve packet processing performance on server-class network hosts, using software, hardware, or a combination of the two. Some of these solutions are based on the Click modular router, which offloads its functions onto different types of hardware such as GPUs, FPGAs, or cores spread across multiple servers executing in parallel. We also explore software solutions that are not based on the Click modular router. We compare software and hardware packet processing solutions according to different criteria and discuss their integration possibilities in virtualized environments, their constraints, and their requirements. As our first contribution, we propose a resilient and high-performance fabric network architecture. Our goal is to build a layer-2 mesh network that uses only directly connected hardware acceleration cards performing the packet processing, instead of routers and switches. We chose the TRILL protocol for communication between these smart NICs because it makes better use of the available network links while providing least-cost pair-wise data forwarding. Data plane packet processing is offloaded onto programmable hardware with parallel processing capability. Additionally, we propose to use the ODP API so that the packet processing application code can be reused by any other packet processing solution that supports the ODP API.
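To illustrate the portability argument, the sketch below shows the typical shape of an ODP receive loop in C. It is a minimal illustration under stated assumptions, not code from the thesis: the interface name "p0", the pool sizes, and the omitted error handling are hypothetical, and exact parameter defaults may vary across ODP releases.

```c
#include <odp_api.h>

#define BURST 32

int main(void)
{
    odp_instance_t inst;
    odp_pool_param_t pool_prm;
    odp_pktio_param_t io_prm;
    odp_pool_t pool;
    odp_pktio_t pktio;
    odp_pktin_queue_t inq;

    /* Initialize ODP globally and for this thread. */
    if (odp_init_global(&inst, NULL, NULL) ||
        odp_init_local(inst, ODP_THREAD_CONTROL))
        return 1;

    /* Packet pool backing the receive path (sizes are illustrative). */
    odp_pool_param_init(&pool_prm);
    pool_prm.type = ODP_POOL_PACKET;
    pool_prm.pkt.num = 8192;
    pool_prm.pkt.len = 1536;
    pool = odp_pool_create("rx_pool", &pool_prm);

    /* Open interface "p0" (hypothetical name) for direct-mode input. */
    odp_pktio_param_init(&io_prm);
    io_prm.in_mode = ODP_PKTIN_MODE_DIRECT;
    pktio = odp_pktio_open("p0", pool, &io_prm);
    odp_pktin_queue_config(pktio, NULL); /* default input queue setup */
    odp_pktin_queue(pktio, &inq, 1);
    odp_pktio_start(pktio);

    for (;;) {
        odp_packet_t pkts[BURST];
        int n = odp_pktin_recv(inq, pkts, BURST);

        for (int i = 0; i < n; i++) {
            /* Packet processing (e.g., TRILL forwarding) would go here. */
            odp_packet_free(pkts[i]);
        }
    }
}
```

Because the loop touches only ODP abstractions (pools, pktio, packets), the same application code can in principle run on any platform with an ODP implementation, which is precisely the reuse property argued for above.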
As our second contribution, we design the data plane of the TRILL protocol on the MPPA (Massively Parallel Processor Array) smart NIC, which supports the ODP API. Our experimental results show that we can process TRILL frames at full-duplex line rate (up to 40 Gbps) for different packet sizes while reducing latency.
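The core of that data plane is per-frame TRILL header handling. As a hedged illustration of what one forwarding step involves per RFC 6325 (independent of the MPPA-specific code), the sketch below parses the 6-byte TRILL header, drops expired frames, and decrements the hop count; trill_fwd() and the nickname lookup it alludes to are hypothetical placeholders.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* TRILL header (RFC 6325): V(2) R(2) M(1) OpLen(5) HopCnt(6) in the
 * first 16 bits, followed by egress and ingress RBridge nicknames. */
struct trill_hdr {
    uint16_t flags;            /* version/reserved/multi-dest/op-len/hop count */
    uint16_t egress_nickname;  /* destination RBridge */
    uint16_t ingress_nickname; /* source RBridge */
} __attribute__((packed));

#define TRILL_HOPCNT_MASK 0x003F

/* Returns 0 if the frame should be forwarded, -1 if it must be dropped. */
static int trill_fwd(struct trill_hdr *th)
{
    uint16_t f   = ntohs(th->flags);
    uint16_t hop = f & TRILL_HOPCNT_MASK;

    if (hop == 0)
        return -1;              /* hop count expired: drop the frame */

    /* Decrement the hop count and write the header back. */
    th->flags = htons((uint16_t)((f & ~TRILL_HOPCNT_MASK) | (hop - 1)));

    /* A real RBridge would now look up th->egress_nickname in its
     * nickname forwarding table and rewrite the outer Ethernet header. */
    return 0;
}
```

On the MPPA, many such per-frame steps run in parallel across the array's cores, which is what makes line-rate processing attainable.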
As our third contribution, we provide a mathematical analysis of the impact of different network topologies on the control plane load, with data plane packet processing again performed on the MPPA smart NICs. We considered various network topologies for the layer-2 mesh of directly connected smart NICs and compared the loads induced by their control plane traffic. We show that the hypercube topology is the most suitable for our PoP (point of presence) data center use case: its control plane load remains moderate, and it offers better resilience than the fat-tree topology while having a shorter average distance between nodes.
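For intuition on the average-distance claim, the standard derivation (a well-known result, not reproduced from the thesis) goes as follows: in an $n$-dimensional hypercube with $N = 2^n$ nodes, exactly $\binom{n}{k}$ nodes lie at Hamming distance $k$ from any given node, so the average distance is

\[
\bar{d}_{\text{hypercube}}
  = \frac{\sum_{k=1}^{n} k \binom{n}{k}}{2^{n}-1}
  = \frac{n \, 2^{n-1}}{2^{n}-1}
  \approx \frac{n}{2}
  = \frac{\log_{2} N}{2},
\]

which grows only logarithmically with the number of smart NICs while requiring only $n$ links per node, consistent with the trade-off between control plane load, resilience, and path length discussed above.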