PCI Express Roadmap Is More Than Speed Bumps

Not content with just cranking up the bus speed, the PCI-SIG® is driving new protocol extensions that should increase the usage and adoption of PCI Express®. In addition to more than tripling the transfer rate, the PCIe® roadmap addresses virtualization, latency, and power management. These enhancements tighten the coupling between I/O and compute subsystems and allow data from peripherals to be processed faster. Furthermore, PCIe is becoming more capable of serving other high-performance system needs, such as backplane applications and links between central processing units (CPUs) and special-function accelerators.
During the Intel Developer Forum in September, Intel Fellow Ajay Bhatt provided a chronology of PCI Express (see the Figure). This year, PCIe 2.0 and two virtualization specifications were approved. Early details of PCIe 3.0 extensions also were released.
PCIe 2.0 Ready for the Holidays
It’s a great time to be a PC gamer, as systems supporting PCIe 2.0 are starting to hit retail shelves. Ever since PCIe 1.0 began replacing AGP back in 2003, high-end graphics cards have been early adopters of the latest PCIe technology. Dell and ASUS launched workstation and high-performance desktop PC motherboards with PCIe 2.0 x16 slots supported by the Intel® X38 Express chipset. Video enthusiasts can add graphics cards based on the NVIDIA GeForce 8800 GT graphics processing unit (GPU) or the ATI Radeon HD 3800 series of GPUs.
The most notable PCIe 2.0 enhancement is the doubling of the transfer rate over PCIe 1.x, from 2.5 giga-transfers per second (GT/s) to 5 GT/s. Other new features include dynamic link speed management and improved control and alert services.
Virtualization Reduces Cost and Complexity
Seemingly every industry publication is espousing the benefits of virtualization in data centers. Virtualization deployed on server clusters (i.e., pools of independent servers working together as a single system to provide high availability of services) facilitates load-balancing and the re-allocation of resources to services in high demand. The PCI-SIG is working on bus enhancements that address virtualization and other I/O demands of high-performance infrastructure, such as storage and networking.
In the past, a PCIe link established a one-to-one relationship between an endpoint (peripheral) and a specific root complex. That link was associated with a single software image running on the CPU at the root complex. As a result, hypervisors servicing multiple virtual machines (VMs) were responsible for address translation when multiple VMs shared network and storage adapters. In addition to increasing latency, this scheme raised hypervisor complexity.
The PCI-SIG developed several specifications that provide two levels of I/O virtualization (IOV). First, the Address Translation Services (ATS) specification provides a set of transactions for PCI Express components to exchange and use translated addresses in support of I/O virtualization. Second, Single Root IOV allows multiple operating systems running simultaneously within a single computer (a single-root topology) to natively share PCIe devices.
Currently under review, the Multi-Root IOV specification extends virtualization support to multi-root topologies, such as blade servers. This extension will simplify the sharing of I/O devices between software applications and server boards. Applications will therefore be able to access I/O and storage devices throughout the network. In addition, network and storage adapters will be able to reside on switches rather than on every server blade, which reduces component count and system complexity. Today, most blades with root complexes have their own network adapters. This duplication adds cost and redundancy as peripherals and ports proliferate across the network infrastructure. In the future, shared network adapters should simplify I/O load balancing and bandwidth management within a virtualized environment.
PCIe 3.0 on the Drawing Board
In August, the PCI-SIG released some details about the next-generation PCIe architecture, PCIe 3.0, including its 8 GT/s bit rate and backward compatibility with prior generations. Completion of the specification is expected in late 2009, targeting products for 2010 and beyond. The PCIe 3.0 specification is expected to make significant improvements in throughput, latency, and power management while incorporating the virtualization enhancements discussed previously. Vendors backing PCIe see opportunities to develop higher-performance accelerators that speed up specific tasks like video, encryption, XML, and data-mining functions.
One challenge to preserving backward compatibility will be the transition away from 8b/10b encoding to a scrambling technique, in which a known binary polynomial is applied to the data stream. The 8b/10b code maintains DC balance on differential signal lines. It adds 2 bits for every 8 data bits, providing enough bus state changes to prevent common-mode voltage shifts and allow clock recovery. Scrambling introduces more DC wander than 8b/10b. As a result, the receiver (Rx) circuit must either tolerate the DC wander as a reduction in signal margin or implement a DC-wander correction capability. The choice of scrambling polynomial is currently under study.
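To make the scrambling idea concrete, here is a minimal Python sketch of an additive scrambler driven by a 16-bit linear-feedback shift register (LFSR). It is an illustration only: the polynomial mask, the seed, and the bit ordering are assumptions chosen for the example, not the PCIe 3.0 scrambler, whose polynomial is still under study. The function name scramble is likewise hypothetical.

```python
# Illustrative additive scrambler. POLY_MASK and SEED are assumptions made
# for this sketch; they are not taken from the PCIe 3.0 specification.
POLY_MASK = 0x0039  # low-order terms x^5 + x^4 + x^3 + 1 of a 16-bit polynomial
SEED = 0xFFFF       # any nonzero starting state works for the illustration


def scramble(data: bytes, seed: int = SEED) -> bytes:
    """XOR each data bit with the output bit of a Galois LFSR.

    Because XOR is its own inverse, running the same function again with
    the same seed recovers the original data (descrambling).
    """
    state = seed
    out = bytearray()
    for byte in data:
        scrambled = 0
        for bit in range(8):                 # process bits LSB first
            key_bit = (state >> 15) & 1      # LFSR output bit
            state = (state << 1) & 0xFFFF    # advance the register
            if key_bit:
                state ^= POLY_MASK           # apply the feedback polynomial
            data_bit = (byte >> bit) & 1
            scrambled |= (data_bit ^ key_bit) << bit
        out.append(scrambled)
    return bytes(out)


if __name__ == "__main__":
    idle = bytes(16)                 # a long run of zeros on the wire
    on_wire = scramble(idle)
    print(on_wire.hex())             # the run of zeros is broken up
    print(scramble(on_wire) == idle)
```

Unlike 8b/10b, the scrambled stream is the same length as the input, so no bandwidth is spent on extra bits. The trade-off is that the result is only statistically balanced, which is why the receiver must cope with DC wander.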
By adding non-data bits to the data stream, 8b/10b encoding imposes a 20% overhead on the raw bit rate. By transitioning to scrambling, PCIe 3.0 supports twice the throughput of PCIe 2.0 even though the bit rate increases by just 60%, from 5 GT/s to 8 GT/s. This is illustrated in the figure, where PCIe 2.0 and 3.0 achieve approximately 16 GB/s and 32 GB/s, respectively, for a x16 link when both directions are counted.
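The arithmetic behind these figures can be checked with a short Python sketch. It assumes that scrambling adds no encoding overhead at all and that the x16 figures count both directions of the link. The function name link_throughput_gbytes is hypothetical.

```python
def link_throughput_gbytes(rate_gt_s, lanes, payload_bits, total_bits, directions=2):
    """Usable link bandwidth in GB/s.

    rate_gt_s     raw transfer rate per lane in GT/s
    payload_bits  data bits per encoded group (8 for 8b/10b)
    total_bits    bits on the wire per group (10 for 8b/10b)
    directions    2 counts transmit and receive together
    """
    usable_gbits_per_lane = rate_gt_s * payload_bits / total_bits
    return usable_gbits_per_lane * lanes * directions / 8  # 8 bits per byte


# PCIe 2.0: 5 GT/s with 8b/10b encoding (20% of the raw bits are overhead)
pcie2 = link_throughput_gbytes(5, 16, payload_bits=8, total_bits=10)

# PCIe 3.0: 8 GT/s, assuming scrambling costs no extra bits
pcie3 = link_throughput_gbytes(8, 16, payload_bits=1, total_bits=1)

print(pcie2)          # 16.0
print(pcie3)          # 32.0
print(pcie3 / pcie2)  # 2.0, from a raw bit-rate increase of only 8 / 5 = 1.6
```

Any real encoding is likely to carry some small overhead, which is consistent with the article describing the 3.0 figure as approximate.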
The increase in throughput will simplify board designs by reducing lane count. For example, applications deploying 10 Gigabit Ethernet typically require four PCIe 2.0 lanes. In the future, that requirement can be reduced to two lanes with PCIe 3.0. This fanout improvement will ease the design of dense switching boards.
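The lane-count claim follows from the per-lane data rates. The Python sketch below rounds the requirement up to the next standard link width. It assumes a 10 Gbit/s payload in one direction, ignores packet-header overhead, and again treats PCIe 3.0 scrambling as free. The names lanes_needed and LINK_WIDTHS are hypothetical.

```python
import math

# Link widths defined for PCI Express
LINK_WIDTHS = (1, 2, 4, 8, 12, 16, 32)


def lanes_needed(required_gbit_s, rate_gt_s, encoding_efficiency):
    """Smallest standard link width that carries the required data rate.

    Assumes one direction and ignores packet-level overhead.
    """
    per_lane_gbit_s = rate_gt_s * encoding_efficiency
    minimum = math.ceil(required_gbit_s / per_lane_gbit_s)
    return next(width for width in LINK_WIDTHS if width >= minimum)


# 10 Gigabit Ethernet adapter
print(lanes_needed(10, rate_gt_s=5, encoding_efficiency=0.8))  # 4 (PCIe 2.0)
print(lanes_needed(10, rate_gt_s=8, encoding_efficiency=1.0))  # 2 (PCIe 3.0)
```

With PCIe 2.0, three lanes would be enough on paper, but x3 is not a standard width, so designs land on x4.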
Latency, Software, and Power Enhancements
PCIe 3.0 developers are seeking a tighter coupling between I/O and compute subsystems. They’re investigating multiple protocol extensions and enhancements that help server CPUs access priority I/O data more quickly. There will be mechanisms that provide “data re-use hints,” which improve the caching of reusable data in system memory. In doing so, they reduce data latency. Another enhancement supports both transaction attributes and hints that optimize the ordering of transactions within the root complex and memory subsystem. Also under consideration are Pause and Resume operations. They control the interrupts of low-priority transmissions and allow higher-priority transactions to take precedence.
As PCIe evolves beyond standard I/O interconnect to hardware accelerators, the specification requires enhancements to help developers maintain memory coherency within a system. Unlike most I/O adapters, hardware accelerators have their own local memory that must remain coherent with other memory subsystems on the network. PCIe 3.0 is investigating software model enhancements, such as atomic read-modify-write mechanisms, that prevent network elements from accessing stale or corrupt data. These mechanisms help to avoid having a CPU access data while a special-function accelerator is working on it.
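The value of an atomic read-modify-write can be sketched in a few lines of Python. This is a software analogy rather than the mechanism under consideration for PCIe 3.0: a lock stands in for the atomicity that hardware would provide, and the names SharedWord, try_claim, and release are hypothetical. The idea is that a CPU and an accelerator both try to claim a buffer by swapping an ownership word, and only one of them can succeed.

```python
import threading

FREE, CPU, ACCELERATOR = 0, 1, 2


class SharedWord:
    """One word of shared memory; the lock models hardware atomicity."""

    def __init__(self, value=FREE):
        self._value = value
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Atomically store `new` if the word equals `expected`.

        Returns the value seen before the operation, so the caller can
        tell whether the swap took place.
        """
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def try_claim(owner, who):
    """Claim the buffer for `who`; succeeds only if it was free."""
    return owner.compare_and_swap(FREE, who) == FREE


def release(owner, who):
    """Release the buffer; succeeds only if `who` currently owns it."""
    return owner.compare_and_swap(who, FREE) == who


if __name__ == "__main__":
    owner = SharedWord()
    print(try_claim(owner, ACCELERATOR))  # accelerator takes the buffer
    print(try_claim(owner, CPU))          # CPU is refused while it is busy
    print(release(owner, ACCELERATOR))    # accelerator finishes
    print(try_claim(owner, CPU))          # CPU can now read fresh data
```

Without atomicity, the read of the ownership word and the write that updates it could interleave with another agent's access, and both sides might believe they own the buffer.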
Power-management features will be added to support dynamic performance/power operation modes. System software will be able to dynamically adjust the power consumption of endpoints in accordance with I/O throughput requirements.
PCI Express in Backplane Applications?
Today, Ethernet is dominant in the backplane space. Its large installed base, supported by a large software investment, has proven to be very dependable. Yet some developers are asking whether PCI Express may make inroads as bus speeds increase to 10 Gbits/s. Like Ethernet, PCI Express is becoming ubiquitous, with many chipsets supporting it natively. It also leverages years of legacy software dating back to the 1990s.
PCI Express offers some advantages over Ethernet. It is far more scalable, with options to utilize links of different widths, from x1 all the way up to x32. PCI Express also has lower latency and less overhead (smaller packet headers) than Ethernet, making it faster and more efficient. These differences are magnified when packet sizes are small, as in control-plane applications. PCI Express also has several quality-of-service (QoS) features, such as flow control and guaranteed error-free packet delivery. In contrast, Ethernet is less proactive and requires receivers to notify the sender of dropped packets.
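A rough calculation shows why small packets magnify the overhead difference. The per-packet byte counts in the Python sketch below are assumptions for illustration: 20 bytes for a PCIe packet (framing, sequence number, a three-doubleword header, and link CRC) and 38 bytes for an Ethernet frame (preamble, MAC header, frame check sequence, and inter-frame gap), with Ethernet payloads padded to 46 bytes. The sketch ignores line encoding, higher-layer headers, and flow-control traffic, and the function names are hypothetical.

```python
# Assumed per-packet overhead in bytes (illustrative figures, see text)
PCIE_OVERHEAD = 20      # framing 2 + sequence 2 + header 12 + link CRC 4
ETHERNET_OVERHEAD = 38  # preamble 8 + MAC header 14 + FCS 4 + inter-frame gap 12
ETHERNET_MIN_PAYLOAD = 46


def pcie_efficiency(payload_bytes):
    """Fraction of link bytes that carry payload in one PCIe packet."""
    return payload_bytes / (payload_bytes + PCIE_OVERHEAD)


def ethernet_efficiency(payload_bytes):
    """Fraction of link bytes that carry payload in one Ethernet frame."""
    padded = max(payload_bytes, ETHERNET_MIN_PAYLOAD)
    return payload_bytes / (padded + ETHERNET_OVERHEAD)


if __name__ == "__main__":
    for size in (16, 64, 256, 1024):
        print(size, round(pcie_efficiency(size), 3), round(ethernet_efficiency(size), 3))
```

Under these assumptions the gap between the two is widest for the smallest payloads and narrows as packets grow, which matches the point about control-plane traffic.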
During the transition from 1 to 10 Gbits/s, some developers who are utilizing smaller payloads with heavy processing requirements may give PCI Express consideration for the backplane. Applications like communications, military, and high-end medical equipment may benefit from the greater scalability, lower overhead, lower latency, and cost efficiency of PCI Express.
PCI Express technology has moved well beyond graphics applications to expand into communications, embedded systems, and home entertainment. With the addition of I/O virtualization and system-level enhancements, those who saw PCIe as a bridging or transitional technology may have a change of heart. The combination of a large installed base, extensive ecosystem, and long-term view may broaden the adoption of PCIe into emerging applications and new usage models.








