Faa$T: A transparent auto-scaling cache for serverless applications Romero, Francisco; Chaudhry, Gohar Irfan; Goiri, Íñigo ...
Proceedings of the ACM Symposium on Cloud Computing,
11/2021
Conference Proceeding
Function-as-a-Service (FaaS) has become an increasingly popular way for users to deploy their applications without the burden of managing the underlying infrastructure. However, existing FaaS platforms rely on remote storage to maintain state, limiting the set of applications that can be run efficiently. Recent caching work for FaaS platforms has tried to address this problem, but has fallen short: it disregards the widely different characteristics of FaaS applications, does not scale the cache based on data access patterns, or requires changes to applications. To address these limitations, we present Faa$T, a transparent auto-scaling distributed cache for serverless applications. Each application gets its own cache. After a function executes and the application becomes inactive, the cache is unloaded from memory with the application. Upon reloading for the next invocation, Faa$T pre-warms the cache with objects likely to be accessed. In addition to traditional compute-based scaling, Faa$T scales based on working set and object sizes to manage cache space and I/O bandwidth. We motivate our design with a comprehensive study of data access patterns on Azure Functions. We implement Faa$T for Azure Functions, and show that Faa$T can improve performance by up to 92% (57% on average) for challenging applications, and reduce cost for most users compared to state-of-the-art caching systems, by avoiding the cost of standing up additional serverful resources.
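The per-application caching and scaling behavior described in this abstract can be pictured with a small sketch. The class, thresholds, and the access-frequency pre-warming heuristic below are illustrative assumptions only, not the Faa$T implementation.

    from collections import OrderedDict

    class AppCache:
        """Per-application in-memory cache, loaded and unloaded with the app (illustrative)."""

        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.objects = OrderedDict()        # key -> (size, data), kept in LRU order
            self.access_counts = {}             # key -> how often it has been read

        def get(self, key, fetch_from_remote):
            if key in self.objects:
                self.objects.move_to_end(key)    # refresh LRU position on a hit
            else:
                data = fetch_from_remote(key)    # miss: fall back to remote storage
                self._insert(key, data)
            self.access_counts[key] = self.access_counts.get(key, 0) + 1
            return self.objects[key][1]

        def _insert(self, key, data):
            while self.objects and sum(s for s, _ in self.objects.values()) + len(data) > self.capacity:
                self.objects.popitem(last=False)  # evict least recently used
            self.objects[key] = (len(data), data)

        def prewarm(self, fetch_from_remote, top_n=10):
            # On application reload, fetch the historically hottest objects before
            # the first invocation so it is likely to hit in the local cache.
            hottest = sorted(self.access_counts, key=self.access_counts.get, reverse=True)
            for key in hottest[:top_n]:
                if key not in self.objects:
                    self._insert(key, fetch_from_remote(key))

    def should_scale_out(cache, working_set_bytes, largest_object_bytes):
        # Scale out when the working set no longer fits, or when a single object
        # is so large that one node's I/O bandwidth would become the bottleneck.
        return working_set_bytes > cache.capacity or largest_object_bytes > cache.capacity // 2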
Photonics is a promising technology to accelerate Deep Neural Networks as it can use optical interconnects to reduce data movement energy and it enables low-energy, high-throughput optical-analog computations. To realize these benefits in a full system (accelerator + DRAM), designers must ensure that the benefits of using the electrical, optical, analog, and digital domains exceed the costs of converting data between domains. Designers must also consider system-level energy costs such as data fetch from DRAM. Converting data and accessing DRAM can consume significant energy, so to evaluate and explore the photonic system space, there is a need for a tool that can model these full-system considerations. In this work, we show that similarities between Compute-in-Memory (CiM) and photonics let us use CiM system modeling tools to accurately model photonics systems. Bringing modeling tools to photonics enables evaluation of photonic research in a full-system context, rapid design space exploration, co-design, and comparison between systems. Using our open-source model, we show that cross-domain conversion and DRAM can consume a significant portion of photonic system energy. We then demonstrate optimizations that reduce conversions and DRAM accesses to improve photonic system energy efficiency by up to 3x.
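As a rough illustration of the full-system accounting this abstract argues for, the sketch below totals compute, conversion (DAC/ADC), and DRAM-fetch energy for a hypothetical photonic layer. Every per-event energy value is a made-up placeholder, not a measurement from the paper or its open-source model.

    # Hypothetical per-event energies in picojoules (placeholders, not measured values).
    E_PHOTONIC_MAC_PJ = 0.05   # one optical-analog multiply-accumulate
    E_DAC_PJ          = 1.0    # digital -> analog conversion of one input value
    E_ADC_PJ          = 2.0    # analog -> digital conversion of one output value
    E_DRAM_BYTE_PJ    = 20.0   # fetching one byte from DRAM

    def layer_energy_pj(macs, inputs, outputs, bytes_from_dram):
        compute    = macs * E_PHOTONIC_MAC_PJ
        conversion = inputs * E_DAC_PJ + outputs * E_ADC_PJ
        dram       = bytes_from_dram * E_DRAM_BYTE_PJ
        return {"compute": compute, "conversion": conversion, "dram": dram,
                "total": compute + conversion + dram}

    # A single fully connected layer: 1024 x 1024 weights, 8-bit activations.
    baseline = layer_energy_pj(macs=1024 * 1024, inputs=1024, outputs=1024,
                               bytes_from_dram=1024 * 1024)
    # Optimization in the spirit of the abstract: keep weights on-chip and reuse
    # converted operands so fewer DRAM fetches and conversions are needed.
    optimized = layer_energy_pj(macs=1024 * 1024, inputs=1024, outputs=1024,
                                bytes_from_dram=64 * 1024)
    print(baseline["total"] / optimized["total"])   # energy improvement factor

With these placeholder numbers, DRAM fetches dominate the baseline total, which is the kind of observation the full-system model is meant to surface.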
This report explores the use of kernel-bypass networking in FaaS runtimes and demonstrates how using Junction, a novel kernel-bypass system, as the backend for executing components in faasd can enhance performance and isolation. Junction achieves this by reducing network and compute overheads and minimizing interactions with the host operating system. Junctiond, the integration of Junction with faasd, reduces median and P99 latency by 37.33% and 63.42%, respectively, and can handle 10x more throughput while decreasing latency by 2x at the median and 3.5x at the tail.
Function-as-a-Service (FaaS) serverless computing enables a simple programming model with almost unbounded elasticity. Unfortunately, current FaaS platforms achieve this flexibility at the cost of lower performance for data-intensive applications compared to a serverful deployment. The ability to have computation close to data is a key missing feature. We introduce Palette load balancing, which offers FaaS applications a simple mechanism to express locality to the platform, through hints we term "colors". Palette maintains the serverless nature of the service - users are still not allocating resources - while allowing the platform to place successive invocations related to each other on the same executing node. We compare a prototype of the Palette load balancer to a state-of-the-art locality-oblivious load balancer on representative examples of three applications. For a serverless web application with a local cache, Palette improves the hit ratio by 6x. For a serverless version of Dask, Palette improves run times by 46% and 40% on Task Bench and TPC-H, respectively. On a serverless version of NumS, Palette improves run times by 37%. These improvements largely bridge the gap to serverful implementations of the same systems.
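A minimal sketch of the "color" idea follows, assuming a consistent-hash ring of worker nodes. The class, hashing scheme, and fallback policy are illustrative assumptions, not Palette's actual implementation.

    import bisect
    import hashlib

    class ColorAwareBalancer:
        """Route invocations that share a color hint to the same worker (illustrative)."""

        def __init__(self, workers, vnodes=100):
            # Consistent-hash ring so adding or removing a worker only remaps a
            # small fraction of the colors.
            self.ring = sorted(
                (self._hash(f"{w}#{i}"), w) for w in workers for i in range(vnodes)
            )
            self.keys = [h for h, _ in self.ring]
            self.workers = list(workers)
            self.rr = 0

        @staticmethod
        def _hash(s):
            return int(hashlib.sha1(s.encode()).hexdigest(), 16)

        def pick_worker(self, color=None):
            if color is None:
                # No locality hint: plain round-robin, as a locality-oblivious
                # balancer would do.
                self.rr = (self.rr + 1) % len(self.workers)
                return self.workers[self.rr]
            idx = bisect.bisect(self.keys, self._hash(color)) % len(self.ring)
            return self.ring[idx][1]

    balancer = ColorAwareBalancer(["node-a", "node-b", "node-c"])
    # Successive invocations tagged with the same color land on the same node,
    # so a local cache or intermediate data can be reused there.
    assert balancer.pick_worker(color="user-42") == balancer.pick_worker(color="user-42")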
Memory-harvesting VMs in cloud platforms Fuerst, Alexander; Novaković, Stanko; Goiri, Íñigo ...
Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems,
02/2022
Conference Proceeding
Cloud platforms monetize their spare capacity by renting “Spot” virtual machines (VMs) that can be evicted in favor of higher-priority VMs. Recent work has shown that resource-harvesting VMs are more effective at exploiting spare capacity than Spot VMs, while also reducing the number of evictions. However, the prior work focused on harvesting CPU cores while keeping memory size fixed. This wastes a substantial monetization opportunity and may even limit the ability of harvesting VMs to leverage spare cores. Thus, in this paper, we explore memory harvesting and its challenges in real cloud platforms, namely its impact on VM creation time, NUMA spanning, and page fragmentation. We start by characterizing the amount and dynamics of the spare memory in Azure. We then design and implement memory-harvesting VMs (MHVMs), introducing new techniques for memory buffering, batching, and pre-reclamation. To demonstrate the use of MHVMs, we also extend a popular cluster scheduling framework (Hadoop) and a FaaS platform to adapt to them. Our main results show that (1) there is plenty of scope for memory harvesting in real platforms; (2) MHVMs are effective at mitigating the negative impacts of harvesting; and (3) our extensions of Hadoop and FaaS successfully hide the MHVMs’ varying memory size from the users’ data-processing jobs and functions. We conclude that memory harvesting has great potential for practical deployment and users can save up to 93% of their costs when running workloads on MHVMs.
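The following is a small sketch of how a data-processing framework might adapt to a VM whose memory grows and shrinks, as the abstract describes for the Hadoop and FaaS extensions. The function names, per-task sizes, and headroom threshold are assumptions for illustration, not the paper's design.

    def plan_tasks(available_memory_mb, pending_tasks, per_task_mb=512, reserve_mb=256):
        """Decide how many pending tasks to admit given the memory currently left to the MHVM."""
        usable = max(available_memory_mb - reserve_mb, 0)   # keep headroom for pre-reclamation
        slots = usable // per_task_mb
        return pending_tasks[:slots]

    def on_memory_change(new_memory_mb, running_tasks, per_task_mb=512, reserve_mb=256):
        """If the platform reclaims memory, pause the newest tasks until the rest fit again."""
        usable = max(new_memory_mb - reserve_mb, 0)
        max_running = usable // per_task_mb
        keep, pause = running_tasks[:max_running], running_tasks[max_running:]
        return keep, pause

    # Example: the harvested VM shrinks from 8 GB to 3 GB of available memory.
    running, paused = on_memory_change(3 * 1024, [f"task-{i}" for i in range(12)])
    print(len(running), "tasks keep running,", len(paused), "are paused")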
A CODEC, fabricated in 3.3-V CMOS, provides the low-voltage transmitter and receiver interfaces between the modem digital signal processing engine and the high-voltage hybrid circuitry for either the central office or the remote terminal ends of the subscriber loop, configurable by metal mask option. Extensive use of digital interpolation filters in the transmitter, decimation filters in the receiver, and oversampled data converters minimizes the complexity of analog filters. On-chip filtering and 14-bit data converters support echo-canceling modems without requiring external filters. With additional external filters, frequency-division duplexing is also supported. The die area is 67.5 mm². Power dissipation is 600 mW in the central office configuration and 760 mW in the remote terminal configuration.
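The filtering structure mentioned in this abstract (oversampled converters followed by digital decimation) can be illustrated with a simple software sketch. The filter design, tap count, and sample rates below are generic placeholders, not parameters of this CODEC.

    import numpy as np

    def fir_lowpass(num_taps, cutoff):
        # Windowed-sinc low-pass FIR design (Hamming window); cutoff is a
        # fraction of the input sample rate (0 < cutoff < 0.5).
        n = np.arange(num_taps) - (num_taps - 1) / 2.0
        h = np.sinc(2 * cutoff * n) * np.hamming(num_taps)
        return h / h.sum()

    def decimate(x, factor, num_taps=63):
        # Filter first to suppress aliases, then keep every `factor`-th sample,
        # which is the role of the receiver's digital decimation filters.
        h = fir_lowpass(num_taps, 0.5 / factor)
        y = np.convolve(x, h, mode="same")
        return y[::factor]

    # Example: a 1 kHz tone oversampled at 256 kHz, decimated by 16x to 16 kHz.
    fs = 256_000
    t = np.arange(0, 0.01, 1 / fs)
    x = np.sin(2 * np.pi * 1000 * t)
    y = decimate(x, 16)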
High-coverage testing of softwarized networks Prabhu, Santhosh; Chaudhry, Gohar Irfan; Godfrey, Brighten ...
Proceedings of the 2018 Workshop on Security in Softwarized Networks: Prospects and Challenges,
08/2018
Conference Proceeding
Open access
Network operators face a challenge of ensuring correctness as networks grow more complex, in terms of scale and increasingly in terms of diversity of software components. Network-wide verification approaches can spot errors, but assume a simplified abstraction of the functionality of individual network devices, which may deviate from the real implementation. In this paper, we propose a technique for high-coverage testing of end-to-end network correctness using the real software that is deployed in these networks. Our design is effectively a hybrid, using an explicit-state model checker to explore all network-wide execution paths and event orderings, but executing real software as subroutines for each device. We show that this approach can detect correctness issues that would be missed both by existing verification and testing approaches, and a prototype implementation suggests the technique can scale to larger networks with reasonable performance.
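The hybrid design can be pictured with a tiny explicit-state search: enumerate orderings of pending events and, for each ordering, call a per-device handler (here a stub standing in for the real router software) before checking an end-to-end invariant. Everything below is a simplified illustration, not the paper's prototype.

    from itertools import permutations

    def device_handler(device_state, event):
        """Stub for the real device software executed as a subroutine.
        Here a device simply learns a route when it receives an advertisement."""
        if event["type"] == "advertise":
            device_state = dict(device_state)
            device_state[event["prefix"]] = event["next_hop"]
        return device_state

    def reachable(network_state, prefix):
        # End-to-end invariant: every device holds a route for `prefix`.
        return all(prefix in routes for routes in network_state.values())

    def check_all_orderings(devices, events, prefix):
        """Explicit-state exploration: try every ordering of the pending events."""
        violations = []
        for order in permutations(events):
            state = {d: {} for d in devices}
            for ev in order:
                state[ev["device"]] = device_handler(state[ev["device"]], ev)
            if not reachable(state, prefix):
                violations.append(order)
        return violations

    events = [
        {"device": "r1", "type": "advertise", "prefix": "10.0.0.0/24", "next_hop": "r2"},
        {"device": "r2", "type": "advertise", "prefix": "10.0.0.0/24", "next_hop": "r3"},
    ]
    # r3 never hears the advertisement, so every ordering violates the invariant.
    print(len(check_all_orderings(["r1", "r2", "r3"], events, "10.0.0.0/24")))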