Device Lending Enables Composable Architecture



Creating a composable infrastructure by leveraging the latest PCIe standard equates to something like…using pencils in space. Sometimes it makes sense to think up a simple solution that’s merely crafty rather than succumb to the hype.

Keeping data center infrastructure ahead of rapidly increasing demands can get expensive. Real-time analytics, 5G connectivity, IoT, and Artificial Intelligence (AI) drive growth, but all this innovation also pushes data centers to improve at a similar pace. For data centers, innovation brings a massive influx of data, shifting requirements, and insatiable business demands. Data centers must flex while also meeting workload expectations, staying within an operating budget, maintaining efficiency, and leveraging innovation for a competitive edge within the data services market. Hyperscale data centers, whose rise is driven by big data, IoT, and AI, carry massive networking loads in support of a large and diverse set of external clients. They illustrate the need for more efficient and flexible use of massive amounts of resources.

The volume and spectrum of cloud workloads add pressure that makes inflexibility a non-option. Traditional data architectures made up of servers, storage media, switches, and the like have been available in a large variety of form factors and sizes. The various pieces come together to serve a particular data workload in the data center. However, the workloads of today are changing rapidly, and traditional data center infrastructures cannot flex as fast as needed without adding many hours of labor. The complexity of traditional infrastructures has been mitigated somewhat by converged infrastructures, whereby the compute, storage, and networking fabric converge into a single solution to meet a particular workload. While converged infrastructures relieved hardware-centric challenges, they created another issue, as managing became workload-centric. Having started on the left, then swung far to the right, data center technology has found a sweet spot in composable architecture.

What is Composable Infrastructure?
Taking an application-centric approach, composable infrastructure is the answer to data center flexibility. Composable architecture is the next-generation data center design, able to support rapidly changing system configurations, facilitate maximum sharing of both real and virtual infrastructure, and support new hardware technology. Like a converged infrastructure, composable infrastructure integrates compute, storage, and networking into a single platform, but it does so by using software-defined intelligence that maintains a liquid pool of resources. The application-centric composable infrastructure provides a new approach with which to provision and manage assets (both real and virtual). By treating disaggregated, programmable infrastructure as code, composable infrastructure seamlessly bridges software and hardware while eliminating management silos. The result is lower operating costs through “right-sizing,” and a higher level of flexibility.
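
To make the idea of a liquid pool of resources concrete, here is a minimal sketch in Python. It is only a toy model: the `ResourcePool` class, its `compose` and `release` methods, and the resource names are all hypothetical and do not correspond to any vendor's software.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Toy model of a pool of disaggregated resources, counted per kind."""
    free: Counter = field(default_factory=Counter)

    def compose(self, request: dict) -> dict:
        """Reserve the requested counts, or raise if the pool cannot supply them."""
        short = {kind: n for kind, n in request.items() if self.free[kind] < n}
        if short:
            raise ValueError(f"insufficient resources: {short}")
        for kind, n in request.items():
            self.free[kind] -= n
        return dict(request)

    def release(self, allocation: dict) -> None:
        """Return a previously composed allocation to the pool."""
        for kind, n in allocation.items():
            self.free[kind] += n


# Hypothetical inventory: 4 GPUs and 8 NVMe drives shared by all workloads.
pool = ResourcePool(Counter(gpu=4, nvme=8))
job = pool.compose({"gpu": 2, "nvme": 3})   # right-size a system for one workload
pool.release(job)                           # hand everything back when done
```

The point of the sketch is the right-sizing: a workload reserves only what it asks for, and whatever it releases is immediately available to the next workload.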

Figure 1: Hyperscale data center networks support many different external clients. The growth of big data with IoT and machine learning/AI are pushing data center infrastructure. As of March 2017, the majority of hyperscale data centers were operated in the U.S.

Several major companies already provide what is also referred to as composable disaggregated infrastructure or composable architecture. For instance, the Intel® Rack Scale Design (RSD) architecture is a composable disaggregated architecture where “hardware resources, such as compute modules, nonvolatile memory modules, hard disk storage modules, FPGA modules, and networking modules, can be installed individually within a rack. These can be packaged as blades, sleds, chassis, drawers or larger physical configurations.”[i]  Resources include a high-bandwidth data connection through an Ethernet link to a dedicated management network, and at least one Ethernet or other high-speed fabric, such as PCI Express (PCIe). Often two networks, including the management network, are connected to top-of-rack (ToR) switches. Historically, ToR switching has been adopted for rack-at-a-time flexibility, through modular data centers. A rack can connect through ToR to other racks to create a management domain referred to as a pod. Even with composable architecture, pods can be linked together in any network topology that best suits the data center.

Figure 2: Composable architecture creates a pool of resources that can be deployed on demand in data centers.

A Lower-Cost Way to Create Composable Infrastructure
But there’s another way to compose resources on the fly to meet changing workloads, one that does not require an Intel RSD compatible rack. Comprehensive rack scale solutions are not always possible due to budget constraints. For machines with access to resources via PCIe, a lower-cost approach can extend composable architecture to existing data centers. A Norwegian company called Dolphin Interconnect Solutions offers an elegant one called device lending. Device lending is a simple software solution that allows one to reconfigure systems and reallocate resources within a PCIe Fabric. Accelerators (GPUs and FPGAs), NVMe drives, network cards or other network fabric “can be added or removed without having to be physically installed in a particular system on the network.”[ii] Dolphin’s eXpressWare SmartIO software enables device lending, which creates seamless management of a pool of devices while maximizing resources. Device lending achieves both extremely low computing overhead and low latency without requiring any application-specific distribution mechanisms. With device lending software deployed in the PCIe Fabric, a remote IO resource appears to applications as if it were local, which puts a low-cost composable infrastructure within reach. Dolphin has been involved with industry standards (including PCI, ASI, and PCIe) since the 1990s.

Device lending works transparently across PCIe connected racks and between servers and modules with no modifications to drivers, operating systems, or software applications. Device lending enables temporary access to a PCIe device located remotely over a PCIe network. Furthermore, performance in accessing a remote device is similar to accessing a local device, since there is no software overhead in the data transfers themselves. Devices can be temporarily borrowed by any system within the fabric, for as long as necessary. When a device is no longer needed, it can be returned to local use or allocated to another system. One can control the Dolphin device lending software using a set of command line tools and options, which can be used directly or integrated into any other higher-level resource management system, such as one that might be used with an Intel RSD or different architecture. Dolphin’s device lending software does not require any particular boot order or power-on sequence. PCIe devices borrowed from a remote system can be used as if they were local devices until returned. Furthermore, Dolphin’s device lending strategy does not require explicit integration into a unified Application Programming Interface (API), since it works by taking advantage of the inherent properties of PCIe to accomplish a composable infrastructure. For more information about how device lending works with hot-adding (or hot-plugging), virtualization, non-transparent bridges (NTBs), IO Memory Management Units (IOMMUs), and DMA remapping, refer to the whitepaper, Device Lending in PCI Express Networks by Lars Kristiansen, et al. (PDF).
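
The borrow-and-return lifecycle described above can be sketched as a small state model. This is an illustration of the bookkeeping only, not Dolphin’s software or its command line tools: the `LendableDevice` class, its methods, and the host and device names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LendableDevice:
    """Toy model of one PCIe device that its owner can lend to another host."""
    name: str                       # hypothetical label, e.g. "gpu0"
    owner: str                      # host where the device is physically installed
    borrower: Optional[str] = None  # host currently borrowing it, if any

    def borrow(self, host: str) -> None:
        """Lend the device to a remote host; one borrower at a time in this model."""
        if self.borrower is not None:
            raise RuntimeError(f"{self.name} is already lent to {self.borrower}")
        if host == self.owner:
            raise ValueError("the owner already has local access")
        self.borrower = host

    def give_back(self) -> None:
        """Return the device to local use so it can be lent again."""
        self.borrower = None

    def user(self) -> str:
        """Host that currently sees the device as if it were local."""
        return self.borrower or self.owner


gpu = LendableDevice(name="gpu0", owner="hostA")
gpu.borrow("hostB")     # hostB now uses gpu0 as if it were local
gpu.give_back()         # gpu0 returns to hostA
gpu.borrow("hostC")     # and can then be allocated to another system
```

A higher-level resource manager could keep one such record per device in the fabric and drive the real lending tools from it.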

Device lending is an advanced application of the PCIe standard. PCIe is a stable, standard technology that is widely implemented. With Gen 5, PCIe is also set to reach 128 GBps of aggregate full-duplex bandwidth over 16 lanes, or roughly 64 GBps in each direction. PCIe Gen 5 will be backward-compatible with prior generations and meet increasing performance needs. PCIe has latencies as low as 300ns end-to-end, dominates I/O bus technology, and has been prolific in the server, storage, mobile, and other markets. PCIe is also a significant player in connecting cloud-based devices that demand the highest performance in interconnects, such as GPU and FPGA accelerators for machine learning and AI. Hyperscale data centers number in the hundreds, with some of the world’s most massive run by Google, Facebook, Amazon, and China’s Baidu.
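
As a rough check on the bandwidth figure, the short Python calculation below derives per-direction throughput from the per-lane signaling rate and the line encoding of each PCIe generation. The function and table names are illustrative, and the result ignores packet and protocol overhead, so real-world throughput will be lower.

```python
# Per-lane signaling rate in GT/s and line-encoding efficiency, per PCIe generation.
# Gen 1 and 2 use 8b/10b encoding; Gen 3 and later use 128b/130b.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def pcie_bandwidth_gbytes(gen: int, lanes: int = 16) -> float:
    """Approximate one-direction bandwidth in GB/s, before protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[gen]
    return rate_gt * efficiency * lanes / 8  # 8 bits per byte


one_way = pcie_bandwidth_gbytes(5)   # about 63 GB/s for Gen 5 x16
both_ways = 2 * one_way              # about 126 GB/s full duplex
print(f"Gen 5 x16: {one_way:.1f} GB/s per direction, {both_ways:.1f} GB/s full duplex")
```

The headline 128 GBps corresponds to the raw rate (32 GT/s × 16 lanes ÷ 8 bits, counted in both directions); after 128b/130b encoding the figure is closer to 126 GBps.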

Figure 3: Device lending works at the PCIe level of the stack, so integration with other APIs, special bootloaders, and power sequencing is not needed. NTB = non-transparent bridge. (Source: dolphinics.com)

Performance of Device Lending for Composable Architectures
Device lending leverages the PCIe standard to achieve low latency and high bandwidth. Performance accessing a remote device will be very similar to a local device, limited only by the speed of PCIe over longer distances, if any. Dolphin’s eXpressWare SmartIO device lending software does not require personnel to make changes to transparent devices or to a Linux kernel. Borrowed devices get inserted into the local device tree and the transparent device driver receives a “hot-plug” event signaling that a new resource is available. According to Dolphin, “If the transparent driver needs to re-map a DMA window, the re-map will be performed locally at the borrowing side, very similar to what happens in a virtualized system. The actual performance is system and device dependent.”[ii]
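
One way to observe a hot-added device from the borrowing side is to compare snapshots of the PCI device tree that Linux exposes through sysfs. The Python sketch below assumes that a borrowed device shows up under `/sys/bus/pci/devices` like any other hot-plugged PCIe device; that assumption, and the helper names `list_pci_devices` and `newly_added`, are illustrative and not taken from Dolphin’s documentation.

```python
from pathlib import Path


def list_pci_devices(root: str = "/sys/bus/pci/devices") -> dict:
    """Map each PCI address (e.g. '0000:01:00.0') to its (vendor, device) IDs."""
    devices = {}
    base = Path(root)
    if not base.is_dir():          # not Linux, or sysfs is not mounted
        return devices
    for entry in sorted(base.iterdir()):
        try:
            vendor = (entry / "vendor").read_text().strip()
            device = (entry / "device").read_text().strip()
        except OSError:            # entry vanished or is unreadable; skip it
            continue
        devices[entry.name] = (vendor, device)
    return devices


def newly_added(before: dict, after: dict) -> dict:
    """Devices present in the second snapshot but not in the first."""
    return {addr: ids for addr, ids in after.items() if addr not in before}


before = list_pci_devices()
# ... a device is borrowed here, using whatever lending tools are in place ...
after = list_pci_devices()
for addr, (vendor, device) in newly_added(before, after).items():
    print(f"new PCI device {addr}: vendor={vendor} device={device}")
```

Polling snapshots is the simplest approach to show; a production tool would more likely subscribe to kernel hot-plug events than poll.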

Figure 4: Comparison of bandwidth performance using device lending for a borrowed device (Borrowed) over a physically local device (Local). (Source: Dolphin Interconnect Solutions)

The Cisco, Intel, and Hewlett Packard Enterprise (HPE) strategies for accomplishing composable architectures are not that different from device lending software in that they achieve the same goals. HPE promises “a hybrid IT engine for your digital transformation,” but its approach is also software-defined.[iii] Legend has it that it makes more sense to use a pencil in space than to design a pen that works without gravity-fed ink. Urban legends aside, Dolphin’s software is definitely a clever way to use the PCIe standard to a low-cost advantage for creating or complementing a composable architecture. The ability to break down fixed compute, storage, and networking fabric into a liquid pool of resources is more than desirable. Composing workloads on demand is making headlines in the IT world, and a solution doesn’t need a fancy title to get the job done. Add device lending to the buildup of excitement about composable architecture for meeting the next level of flexibility in data centers.

[i] “Intel® Rack Scale Design (Intel® RSD) Architecture White Paper.” Intel, www.intel.com/content/www/us/en/architecture-and-technology/rack-scale-design/rack-scale-design-architecture-white-paper.html.

[ii] Kristiansen, Lars, et al. “Device Lending in PCI Express Networks.” 13 May 2016, pp. 1–6, www.dolphinics.com/download/WHITEPAPERS/PCI_Express_device_lending_may_2016.pdf.

[iii] https://www.hpe.com/us/en/solutions/infrastructure/composable-infrastructure.html, accessed June 4, 2018.


Lynnette Reese is Editor-in-Chief, Embedded Intel Solutions and Embedded Systems Engineering, and has been working in various roles as an electrical engineer for over two decades. She is interested in open source software and hardware, the maker movement, and in increasing the number of women working in STEM so she has a greater chance of talking about something other than football at the water cooler.
