Showing papers on "Temporal isolation among virtual machines" published in 2023


Journal ArticleDOI
TL;DR: In this article, the authors propose a virtualization implementation method for a 100 Gbps high-speed Field Programmable Gate Array (FPGA) network accelerator card, which uses the FPGA accelerator to improve the performance of virtual network devices.
Abstract: Network Function Virtualization (NFV) is a high-performance network interconnection technology that allows access to traditional network transport devices through virtual network links. It is widely used in cloud computing and other highly concurrent access environments. However, software NFV solutions introduce long delays, and existing hardware I/O virtualization solutions do not scale well. Therefore, this paper proposes a virtualization implementation method on a 100 Gbps high-speed Field Programmable Gate Array (FPGA) network accelerator card, which uses the FPGA accelerator to improve the performance of virtual network devices. The method uses single root I/O virtualization (SR-IOV) technology to allow 256 virtual links to be created for a single Peripheral Component Interconnect express (PCIe) device, and it supports data transfer with virtual machines (VMs) via Peripheral Component Interconnect (PCI) passthrough. In addition, the design adopts a shared, extensible queue management mechanism that supports the flexible allocation of more than 10,000 queues across virtual machines and ensures good isolation in both the data path and the control path. The design provides high-bandwidth transmission performance of more than 90 Gbps for the entire network system, meeting the performance requirements of hyperscale cloud computing clusters.
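As an illustration of the SR-IOV mechanism this design relies on, below is a minimal C sketch of how a Linux host enables virtual functions for a PCIe device through the standard sysfs attribute. The PCI address is a hypothetical placeholder; a real deployment would additionally need the card's vendor driver loaded, and the paper's own control plane is not shown.

    /* Minimal sketch: enable SR-IOV virtual functions on Linux via the
     * standard sysfs attribute. The PCI address 0000:3b:00.0 is a
     * hypothetical placeholder for the FPGA accelerator card. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        const char *path =
            "/sys/bus/pci/devices/0000:3b:00.0/sriov_numvfs";
        FILE *f = fopen(path, "w");
        if (!f) {
            perror("open sriov_numvfs");
            return EXIT_FAILURE;
        }
        /* Request 256 virtual functions, matching the per-device maximum
         * described in the abstract; the kernel rejects values above the
         * device's sriov_totalvfs. */
        if (fprintf(f, "%d\n", 256) < 0) {
            perror("write sriov_numvfs");
            fclose(f);
            return EXIT_FAILURE;
        }
        fclose(f);
        return EXIT_SUCCESS;
    }

Each resulting virtual function appears as its own PCIe function that can be handed to a VM with PCI passthrough, which is what gives the per-link isolation the abstract describes.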

Journal ArticleDOI
TL;DR: In this paper, the authors present Memtrade, a marketplace for disaggregated memory clouds, which allows producer virtual machines (VMs) to lease both their unallocated memory and allocated-but-idle application memory to remote consumer VMs for a limited period of time.
Abstract: We present Memtrade, the first practical marketplace for disaggregated memory clouds. Clouds introduce a set of unique challenges for resource disaggregation across different tenants, including resource harvesting, isolation, and matching. Memtrade allows producer virtual machines (VMs) to lease both their unallocated memory and allocated-but-idle application memory to remote consumer VMs for a limited period of time. Memtrade does not require any modifications to host-level system software or support from the cloud provider. It harvests producer memory using an application-aware control loop to form a distributed transient remote memory pool with minimal performance impact; it employs a broker to match producers with consumers while satisfying performance constraints; and it exposes the matched memory to consumers through different abstractions. As a proof of concept, we propose two such memory access interfaces for Memtrade consumers: a transient KV cache for specified applications and a swap interface that is application-transparent. Our evaluation shows that Memtrade provides significant performance benefits for consumers (improving average read latency by up to 2.8×) while preserving confidentiality and integrity, with little impact on producer applications (degrading performance by less than 2.1%).
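The application-aware control loop is the core of the harvesting side. The paper's controller is not reproduced here, so the following C sketch only illustrates one plausible shape, an additive-increase/multiplicative-decrease loop; the hook names (sample_producer_slowdown_pct, lease_out_memory_mb, reclaim_memory_mb) and the thresholds are assumptions, not Memtrade's actual implementation.

    /* Schematic harvesting control loop in the spirit of Memtrade's
     * producer side: grow the harvested pool while a producer-side
     * performance signal stays healthy, shrink it quickly on degradation.
     * All names and thresholds are illustrative. */
    #include <stddef.h>

    #define HARVEST_STEP_MB  64    /* additive growth per interval  */
    #define MAX_SLOWDOWN_PCT 2.0   /* producer slowdown we tolerate */

    extern double sample_producer_slowdown_pct(void); /* e.g. latency probe   */
    extern void   lease_out_memory_mb(size_t mb);     /* move pages to pool   */
    extern void   reclaim_memory_mb(size_t mb);       /* return pages to app  */

    void harvest_tick(size_t *harvested_mb) {
        double slowdown = sample_producer_slowdown_pct();
        if (slowdown < MAX_SLOWDOWN_PCT) {
            /* Performance headroom: harvest a little more (additive increase). */
            lease_out_memory_mb(HARVEST_STEP_MB);
            *harvested_mb += HARVEST_STEP_MB;
        } else {
            /* Producer is suffering: give back half the pool
             * (multiplicative decrease). */
            size_t giveback = *harvested_mb / 2;
            reclaim_memory_mb(giveback);
            *harvested_mb -= giveback;
        }
    }

The asymmetry (grow slowly, shrink fast) is what keeps the producer-side impact within the small degradation bound the evaluation reports.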

Proceedings ArticleDOI
08 May 2023
TL;DR: In this paper, the authors propose vTMM, a tiered memory management system designed specifically for virtualization, which automatically determines page hotness and migrates pages between fast and slow memory.
Abstract: The memory demand of virtual machines (VMs) is increasing, while the traditional DRAM-only memory system has limited capacity and high power consumption. A tiered memory system can effectively expand memory capacity and increase cost efficiency. Virtualization introduces new challenges for memory tiering, specifically enforcing performance isolation, minimizing context switching, and providing resource overcommit. However, none of the state-of-the-art designs consider virtualization, and thus none address these challenges; we observe that a VM with tiered memory incurs up to a 2× slowdown compared to a DRAM-only VM. This paper proposes vTMM, a tiered memory management system specifically designed for virtualization. vTMM automatically determines page hotness and migrates pages between fast and slow memory to achieve better performance. A key insight in vTMM is to leverage the unique system characteristics of virtualization to meet the above challenges. Specifically, vTMM tracks memory accesses with page-modification logging (PML) and a multi-level queue design. Next, vTMM quantifies the page "temperature" and makes a fine-grained page classification with bucket-sorting. vTMM performs page migration with PML while providing resource overcommit by transparently resizing VM memory through the two-dimensional page tables. In combination, the above techniques minimize overhead, ensure performance isolation, and provide dynamic memory partitioning to improve overall system performance. We evaluate vTMM on a real DRAM+NVM system and a simulated CXL-Memory system. The results show that vTMM outperforms NUMA balancing, Intel Optane memory mode, and Nimble (an OS-level tiered memory management system) for VM tiered memory management. Multi-VM co-running results show that vTMM improves the performance of a DRAM+NVM system by 50%-140% and of a CXL-Memory system by 16%-40%.
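To make the bucket-sorting classification concrete, here is a hedged C sketch: pages are binned by an access-count "temperature" in linear time, and the hottest buckets are drained first until the fast tier's budget is exhausted. The struct layout and the migrate_to_fast hook are illustrative placeholders, not vTMM's actual data structures.

    /* Sketch of temperature-based bucket sorting: O(n) binning instead of
     * a comparison sort, then promote from hottest bucket downward. */
    #include <stdint.h>
    #include <stddef.h>

    #define NBUCKETS 16  /* coarse temperature levels */

    struct page_info { uint64_t pfn; uint32_t temperature; };

    void classify_and_promote(struct page_info *pages, size_t n,
                              size_t fast_budget,
                              void (*migrate_to_fast)(uint64_t pfn)) {
        size_t counts[NBUCKETS] = {0};
        for (size_t i = 0; i < n; i++) {
            uint32_t b = pages[i].temperature;
            if (b >= NBUCKETS) b = NBUCKETS - 1;
            counts[b]++;
        }
        /* Walk buckets from hottest down; promote pages while budget lasts. */
        for (int b = NBUCKETS - 1; b >= 0 && fast_budget > 0; b--) {
            if (counts[b] == 0) continue;  /* skip empty buckets */
            for (size_t i = 0; i < n && fast_budget > 0; i++) {
                uint32_t t = pages[i].temperature;
                if (t >= NBUCKETS) t = NBUCKETS - 1;
                if ((int)t == b) {
                    migrate_to_fast(pages[i].pfn);
                    fast_budget--;
                }
            }
        }
    }

Bucket sorting matters here because classification runs on every migration epoch over millions of pages, so avoiding an O(n log n) sort keeps the tracking overhead the abstract emphasizes low.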

Proceedings ArticleDOI
05 Jun 2023
TL;DR: In this paper, the authors propose separate caching and slicing execution to prevent bandwidth-sensitive tenants from affecting latency-sensitive tenants in the transmit-side (TX) RNIC architecture, and add isolated backpressure and adaptive weighted round-robin scheduling to ensure that bandwidth-sensitive tenants share bandwidth equally.
Abstract: Remote Direct Memory Access (RDMA) is a promising technology for achieving low-latency and high-bandwidth access to remote memory. However, performance interference arises when multiple tenants share an RDMA Network Interface Card (RNIC) in a cloud environment. Although some initial studies have investigated the root cause of and possible solutions to RDMA performance interference, no prior work analyzes and solves the interference at the level of the RNIC architecture. Compared with existing software approaches, optimizing the RNIC architecture can introduce less performance and CPU overhead. This paper addresses performance isolation by modeling, analyzing, and optimizing the transmit-side (TX) RNIC architecture. First, we introduce a baseline TX RNIC architecture to explain the existing performance interference. Then, we propose separate caching and slicing execution to prevent bandwidth-sensitive tenants from affecting latency-sensitive tenants. Finally, we add isolated backpressure and adaptive weighted round-robin scheduling to ensure that bandwidth-sensitive tenants share bandwidth equally. Our experiments show that these optimizations achieve near-optimal performance isolation.
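The weighted round-robin stage can be sketched as a credit-based scheduler over per-tenant slice queues: each backlogged tenant banks credit in proportion to its weight per round and transmits fixed-size slices while credit remains, so equally weighted tenants converge to equal bandwidth shares. The tenant struct, the hooks, and the slice size below are assumptions for illustration, not the paper's hardware design.

    /* Credit-based weighted round-robin over per-tenant slice queues. */
    #include <stdbool.h>
    #include <stddef.h>

    #define SLICE_BYTES 4096  /* fixed slice size from the slicing stage */

    struct tenant {
        unsigned weight;      /* relative bandwidth share    */
        long     credit;      /* accumulated transmit credit */
        bool   (*has_slice)(void);
        void   (*send_slice)(void);
    };

    void wrr_round(struct tenant *tenants, size_t n, unsigned quantum) {
        for (size_t i = 0; i < n; i++) {
            struct tenant *t = &tenants[i];
            if (!t->has_slice()) {
                t->credit = 0;  /* idle tenants do not bank credit */
                continue;
            }
            /* Accrue credit proportional to weight, then drain slices. */
            t->credit += (long)(t->weight * quantum);
            while (t->credit >= SLICE_BYTES && t->has_slice()) {
                t->send_slice();
                t->credit -= SLICE_BYTES;
            }
        }
    }

Scheduling fixed-size slices rather than whole messages is what stops one tenant's large transfers from head-of-line blocking the others, which is the interference the baseline architecture exhibits.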

Proceedings ArticleDOI
22 Jun 2023
TL;DR: In this paper, the authors present NeuCloud, an NPU virtualization solution built around a flexible NPU abstraction named vNPU that allows fine-grained NPU virtualization and resource management, and leverage this abstraction to maximize resource utilization while achieving both performance and security isolation for vNPU instances at runtime.
Abstract: Modern cloud platforms have been employing hardware accelerators such as neural processing units (NPUs) to meet the increasing demand for computing resources for AI-based application services. However, due to the lack of system virtualization support, the current way of using NPUs in cloud platforms suffers from either low resource utilization or poor isolation between multi-tenant application services. In this paper, we investigate the system virtualization techniques for NPUs across the entire software and hardware stack, and present our NPU virtualization solution named NeuCloud. We propose a flexible NPU abstraction named vNPU that allows fine-grained NPU virtualization and resource management. We leverage this abstraction and design the vNPU allocation, mapping, and scheduling policies to maximize the resource utilization, while achieving both performance and security isolation for vNPU instances at runtime.
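As a rough software analogy to the vNPU abstraction, the C sketch below partitions a physical NPU's compute cores into exclusive slices with a per-instance bandwidth cap, using first-fit allocation. Every name, field, and policy here is hypothetical; NeuCloud's real interface spans the hardware and software stack and is not published in this abstract.

    /* Illustrative vNPU-style allocator: each instance gets an exclusive
     * contiguous core slice plus a memory-bandwidth cap for isolation. */
    #include <stdbool.h>

    #define PHYS_CORES 64

    struct vnpu {
        unsigned first_core, num_cores;  /* exclusive core slice        */
        unsigned mem_bw_mbps;            /* bandwidth cap for isolation */
        bool     in_use;
    };

    static bool core_busy[PHYS_CORES];

    /* First-fit allocation of a contiguous core slice; returns false when
     * the requested capacity is unavailable (caller may queue the request). */
    bool vnpu_alloc(struct vnpu *v, unsigned cores, unsigned bw_mbps) {
        for (unsigned start = 0; start + cores <= PHYS_CORES; start++) {
            bool free = true;
            for (unsigned c = start; c < start + cores; c++)
                if (core_busy[c]) { free = false; break; }
            if (!free) continue;
            for (unsigned c = start; c < start + cores; c++)
                core_busy[c] = true;
            v->first_core  = start;
            v->num_cores   = cores;
            v->mem_bw_mbps = bw_mbps;
            v->in_use      = true;
            return true;
        }
        return false;
    }

Sizing vNPUs in units smaller than a whole device is what lets the allocator pack multiple tenants onto one NPU (raising utilization) while exclusive core slices and bandwidth caps preserve the isolation the paper targets.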