Modern mobile communication networks and new service applications are deployed on cloud-native platforms. Kubernetes (K8s) is the de facto distributed operating system for container orchestration, and the extended version of the Berkeley Packet Filter (eBPF), in the Linux (and MS Windows) kernel, is fundamentally changing the approach to cloud-native networking, security, and observability. In this paper, we introduce eBPF and its potential for the Telco cloud, and review some of the most promising pricing and billing models applied to this revolutionary operating system (OS) technology. These models include schemes based on a data source usage model or on the number of eBPF agents deployed on the network, linked to specific eBPF modules. These modules encompass network observability, runtime security, and power dissipation monitoring. Next, we present our eBPF platform, named Sauron in this work, and demonstrate how eBPF allows us to write custom code and dynamically load eBPF programs into the kernel. These programs enable us to estimate the energy consumption of cloud-native functions and to derive performance counters and gauges for transport networks, 5G applications, and non-access stratum protocols. Additionally, we can detect and respond to unauthorized access to cloud-native resources in real time using eBPF. Our experimental results demonstrate the technical feasibility of eBPF in achieving highly performant monitoring, observability, and security tooling for current mobile networks (5G, 5G Advanced) as well as future networks (6G and beyond).
Today, more and more enterprises are embarking on a digital transformation in which most of their applications are hosted in the Cloud. As a result, a reliable Wide Area Network (WAN) has become a primary need to interconnect their distributed branch offices and the data centers that accommodate those applications. Software-Defined Wide Area Network (SD-WAN) represents the most promising technology solution for next-generation enterprise networks, being able to increase network agility and reduce costs. In this paper, we present an experimental SD-WAN solution capable of running and optimizing delay-sensitive high-priority services, such as real-time video streaming, while minimizing downtime caused by network failures. This solution comprises a monitoring system and a traffic engineering system for SD-WAN. The first is a Transport-layer Passive Monitoring (TPM) system based on extended Berkeley Packet Filter (eBPF) technology, with the goal of monitoring TCP flows; the second is an application, running inside the SD-WAN controller, that orchestrates the network traffic based on the monitoring measurements, ensuring rapid recovery and resilience in case of unexpected congestion events. We validate our solution over two SD-WAN testbeds: the first is hosted in our laboratory at Politecnico di Milano, while the second is deployed in a municipal network of an Italian city. Results show that our SD-WAN solution can increase the overall service availability while meeting the stringent QoS requirements of delay-sensitive services.
In this paper, we explain that container engines are strengthening their isolation mechanisms. Therefore, non-intrusive monitoring becomes a must-have for the performance analysis of containerized user-space applications in production environments. After a literature review and background on Linux subsystems and container isolation concepts, we present the lessons we learned from using the extended Berkeley Packet Filter (eBPF) to monitor and profile performance. We carry out the profiling and tracing of several Interledger connectors using two full-fledged implementations of the Interledger protocol specifications.
Open-source software and its components are widely used in various products, solutions, and applications, even in closed-source ones. The majority of them are built on Linux or Unix-based systems. The Netfilter framework is one example. It is used for packet filtering, load balancing, and many other manipulations of network traffic. The Netfilter-based packet filter iptables has been the most common firewall tool for Linux systems for more than two decades. Its successor, nftables, was introduced in 2014. It was designed to overcome various iptables limitations. However, it has not gained wide popularity, and the transition is still ongoing. In recent years, researchers and developers around the world have been searching for solutions to increase the performance of packet processing tools. For that purpose, many of them are trying to utilize eBPF (Extended Berkeley Packet Filter) with the XDP (Express Data Path) data path. This paper focuses on analyzing Linux OS packet filters and comparing their performance in different scenarios.
Modern systems generate a massive amount of logs to detect and diagnose system faults, which incurs expensive storage costs and runtime overhead. After investigating real-world production logs, we observe that most of the logging overhead is due to a small number of log templates, referred to as log hotspots. Therefore, we conduct a systematic study of log hotspots in an industrial system, WeChat, which motivates us to identify log hotspots and reduce them on the fly. In this paper, we propose LogReducer, a non-intrusive and language-independent log reduction framework based on eBPF (Extended Berkeley Packet Filter), consisting of both online and offline processes. After two months of running the offline process of LogReducer in WeChat, the log storage overhead has dropped from 19.7 PB per day to 12.0 PB (i.e., about a 39.08% decrease). Practical implementation and experimental evaluations in the test environment demonstrate that the online process of LogReducer can control the logging overhead of hotspots while preserving logging effectiveness. Moreover, the log hotspot handling time can be reduced from an average of 9 days in production to 10 minutes in the test with the help of LogReducer.
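The observation that a few log templates account for most of the logging volume can be illustrated with a minimal Python sketch. This is not LogReducer's algorithm: the record format (a template identifier paired with a size in bytes), the function name `find_hotspots`, and the 80% coverage threshold are all assumptions made for illustration.

```python
from collections import Counter


def find_hotspots(records, coverage=0.8):
    """Return the fewest templates, largest first, whose combined byte
    volume reaches `coverage` of the total volume.

    `records` is an iterable of (template_id, size_bytes) pairs; both the
    record shape and the default threshold are hypothetical.
    """
    volume = Counter()
    for template_id, size_bytes in records:
        volume[template_id] += size_bytes

    total = sum(volume.values())
    hotspots, covered = [], 0
    for template_id, size in volume.most_common():
        # Stop as soon as the templates picked so far cover enough volume.
        if total == 0 or covered / total >= coverage:
            break
        hotspots.append(template_id)
        covered += size
    return hotspots


if __name__ == "__main__":
    sample = [("A", 900), ("B", 50), ("C", 30), ("A", 100), ("D", 20)]
    print(find_hotspots(sample))
```

In this toy input, a single template dominates the volume, so it alone is reported as a hotspot at the default threshold.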
Technologies such as microservices, containerization, and Kubernetes in cloud-native environments make large-scale application delivery easier and easier, but troubleshooting and fault location across massive numbers of applications are becoming more and more complex. Currently, the data collected by mainstream sampling-based monitoring technologies can hardly cover all anomalies, and the kernel's lack of observability also makes it difficult to monitor more detailed data in container environments such as the Kubernetes platform. In addition, most current technology solutions use tracing and application performance monitoring tools (APMs), but these technologies limit the language used by the application and require invasive changes to the application code. Many scenarios require more general network performance detection and diagnostic methods that do not intrude on the user application. In this paper, we propose to introduce network monitoring at the kernel level, below the application, for the Kubernetes cluster in the Alibaba container service. By non-intrusively collecting L7/L4 network protocol interaction information of user applications based on eBPF, a data collection throughput of more than 10M per second can be achieved without modifying any kernel or application code, while the impact on the system application is less than 1%. The approach also uses machine learning methods to analyze and diagnose application network performance and problems, to analyze network performance bottlenecks and locate specific instance information for different applications, and to realize protocol-independent location and analysis of network performance problems.
Today, advances in virtualization technologies are dramatically changing network architecture as well as operation style. In particular, cloud-native network functions (CNFs) provide scalability, flexibility, and lightness to network services, which can be leveraged to satisfy vehicle application requirements in the 5G and beyond-5G era. However, their microservice architecture poses a new challenge in the operation of cloud-native networks due to the larger number of compositions and dynamic topology changes. To address this situation, artificial intelligence (AI)/machine learning (ML) technologies are expected to support operational tasks. One of these tasks is failure prediction, which is a key enabler for proactive network operation that minimizes the impact of failures. For reliable failure prediction, not only the ML methods but also the data used for ML training should be prepared with a fine granularity. In this paper, we utilize the extended Berkeley Packet Filter (eBPF) to collect fine-grained information from the virtual infrastructure where CNFs are deployed. By training a long short-term memory (LSTM) model on the collected data, we develop a failure prediction model that can provide the future transition of key performance indicators (KPIs) in a cloud-native 5G core network (5GC). Our experiment in a lab environment shows that the prediction model trained on eBPF data outperforms other models trained without them.
Cloud data centers are increasingly adopting Software-Defined Networking (SDN) technologies for their underlying connection and communications. However, as a critical part of the daily operations and management of such data centers, network measurement is essential but has often been constrained by the available resources in traditional network devices. Thus, how to properly balance resource consumption while maintaining timely and accurate measurement remains a challenge for data center systems. Recent advances in SDN have enabled flexible and programmable network measurement, which is referred to as Software Defined Measurement (SDM). A promising trend for SDM is to conduct network traffic measurement on the widely deployed Open vSwitches (OVS) in data centers. However, little attention has been paid to the design options for conducting traffic measurement on the OVS. In this study, we set out to explore different designs and investigate the corresponding trade-offs among resource consumption, measurement accuracy, implementation complexity, and impact on switching speed. Through extensive experiments and comparisons, we quantitatively show the trade-offs that the different schemes strike, and demonstrate the feasibility of instrumenting OVS with monitoring capabilities. These results provide valuable insights into which design will best serve different measurement and monitoring needs.
By decoupling network functions from dedicated, proprietary hardware network devices, Network Function Virtualization (NFV) allows building Virtual Network Functions (VNFs) that can run on standard, commodity servers to reduce cost and gain flexibility in network deployment, operation, and management. However, building VNFs with high throughput and low latency is a big challenge. In this paper, we propose eVNF, a hybrid fast-slow path architecture to build and accelerate VNFs with eXpress Data Path (XDP), which is a Linux kernel framework that enables high-performance and programmable network processing. The programmability of XDP is limited to ensure kernel safety, thus causing difficulties when using XDP to accelerate VNFs. eVNF solves this problem by taking a hybrid approach: leave the simple but critical tasks inside the kernel with XDP, and let complex tasks be processed outside XDP, e.g., in user space. With the hybrid architecture, eVNF allows building fast and flexible VNFs. We applied eVNF to build four prototype VNFs: Flow Monitoring (eFM), Firewall (eFW), Deep Packet Inspection (eDPI), and Load Balancer (eLB). These VNFs are evaluated individually and in service function chains (SFCs) using OpenStack. Our experiments showed that eVNF can significantly improve service throughput as well as reduce latency and CPU usage. eVNF-based VNFs can also scale out with the number of CPU cores and can be combined with Open vSwitch - Data Plane Development Kit (OvS-DPDK) for better performance.
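The hybrid fast-slow path idea can be illustrated with a small user-space simulation in Python. This is a sketch under stated assumptions, not the eVNF implementation: the `Packet` and `HybridFirewall` classes, the blocked-port rule, and the string verdicts are hypothetical, and the dictionary merely stands in for a verdict table that a real fast path would keep inside the kernel.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int


@dataclass
class HybridFirewall:
    # Verdict cache consulted by the fast path (hypothetical stand-in for
    # an in-kernel lookup table).
    flow_table: dict = field(default_factory=dict)
    # Hypothetical policy evaluated only on the slow path.
    blocked_ports: frozenset = frozenset({23})
    fast_hits: int = 0
    slow_hits: int = 0

    def slow_path(self, pkt: Packet) -> str:
        """Full policy evaluation; installs the verdict for later packets."""
        verdict = "DROP" if pkt.dport in self.blocked_ports else "PASS"
        self.flow_table[(pkt.src, pkt.dst, pkt.dport)] = verdict
        return verdict

    def process(self, pkt: Packet) -> str:
        """Fast path: a single table lookup; on a miss, punt to the slow path."""
        verdict = self.flow_table.get((pkt.src, pkt.dst, pkt.dport))
        if verdict is not None:
            self.fast_hits += 1
            return verdict
        self.slow_hits += 1
        return self.slow_path(pkt)


if __name__ == "__main__":
    fw = HybridFirewall()
    web = Packet("10.0.0.1", "10.0.0.2", 80)
    print(fw.process(web), fw.process(web), fw.fast_hits, fw.slow_hits)
```

Only the first packet of a flow pays for the full policy evaluation; later packets of the same flow are resolved by the lookup alone, which is the kind of simple but critical task the hybrid approach keeps on the fast path.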