Expressing PCIe Confidence: Q&A with Dolphin Interconnect Solutions

Committed to going the distance (nine meters over copper according to the latest tests) with a standard that gives high-performance applications the low latency and high speed they demand.

Some of the systems tackling electronic warfare, radar and other demanding tasks are moving closer to the fast lane. That’s according to Dolphin Interconnect Solutions and Curtiss-Wright Defense Solutions. In July the two companies announced they were working together to make it easier for OpenVPX-based High Performance Embedded Computing systems to achieve low latency and high speed using the PCI Express (PCIe) fabric.

Founded in 1992, Oslo, Norway-based Dolphin specializes in interconnect solutions for processor-to-processor and processor-to-I/O communication.

Dolphin CEO Hugo Kohmann and the company’s VP Sales and Marketing, Herman Paraison, recently spoke with EECatalog about Dolphin’s ability to build upon PCI Express’s strengths, the growing number of people looking to use PCI Express (PCIe) as a network, and the company’s new concept for “loaning” efficiency. Edited excerpts follow.

EECatalog: What sets Dolphin’s solutions apart?

Figure 1: The Solid State Phased Array Radar System at Clear Air Force Station, Alaska. (Courtesy, Public Domain)

Herman Paraison, Dolphin Interconnect Solutions: Our key differentiators center on the advanced features available through our software. It enables embedded customers to connect multiple processors together, to connect processors to GPUs, and to connect FPGAs and processors into a unified network. Users can mix and match and take advantage of all the low-latency and performance characteristics of PCI Express.

EECatalog: What are some of the features within the PCI Express DNA that Dolphin is leveraging?

Paraison: PCI Express already exists within most systems—so most chipsets have PCI Express support. This gives companies like Curtiss-Wright Defense Solutions, which has partnered with us to bring our eXpressWare PCI Express Software Suite to its SBCs and DSPs, a time-to-market advantage. And it also gives them a performance advantage. PCI Express allows them to communicate at latencies in the 500-nanosecond range from processor to processor.

The PCI Express standard lets you add various devices to the system through the process of enumeration. With some of our host adapter cards, we add the ability to do that across a cable, and over longer distances. We've already demonstrated that our new PXH830 Host Adapter card enables distances of up to nine meters over copper.

Hugo Kohmann, Dolphin Interconnect Solutions: This opens up new opportunities for PCI Express to cover more of the data center. You can have nine meters from the host to the switch, and another nine meters on the other side, which makes it possible to cover a fairly large area with copper cables alone. Using standard fiber optics, the distance can be extended to 100 meters.

Figure 2: With device lending, multiple computers could use PCI Express devices, even if those devices resided on separate machines.

Paraison: Another way in which we help customers fully capitalize on PCI Express's strengths stems from the unusual breadth of our APIs. Customers gain access to the entire PCI Express roadmap without making major changes to their software. You can move from Gen2 to Gen3 PCI Express, and you will also be able to support Gen4 when it arrives, with no need to change the upper-level software you have already written against our Gen3 APIs.

Our APIs support standard TCP/IP as well as socket-based applications. We also provide a shared memory API for which customers can write their own custom applications, and we support this across various operating systems as well—ranging from Windows to Linux to VxWorks.
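Dolphin's shared memory API is proprietary and not shown in this interview, but the programming model it describes can be illustrated with Python's standard-library `multiprocessing.shared_memory`: two endpoints map the same memory segment and read or write it directly, with no send/receive calls. This is a conceptual analogue only, not Dolphin's actual interface.

```python
# Conceptual analogue of the shared-memory programming model; this is NOT
# Dolphin's eXpressWare API. Python's stdlib shared_memory shows the idea:
# two endpoints attach to one segment and access it as plain memory.
from multiprocessing import shared_memory

# "Node A" creates a named segment and writes into it directly.
seg = shared_memory.SharedMemory(create=True, size=64)
seg.buf[:5] = b"radar"

# "Node B" attaches to the same segment by name and reads the bytes --
# no send/receive call, just a memory access.
peer = shared_memory.SharedMemory(name=seg.name)
data = bytes(peer.buf[:5])
print(data.decode())  # prints "radar"

peer.close()
seg.close()
seg.unlink()
```

In Dolphin's case the segment would span two physical machines over the PCIe fabric rather than one host's RAM, which is what makes the sub-microsecond latencies discussed below possible.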

EECatalog: One of the things your implementation of PCI Express supports is reflective memory. Can you offer an example of how that might be used?

Paraison: Yes. With reflective memory, you can have a single source. Perhaps you’ve got a single FPGA, and now you want to distribute data to a number of nodes downstream, which can be FPGAs, storage or processors. That same data can be simultaneously sent to a number of end nodes, for processing or storage for example.

Let’s say you have a lot of sensor data and want to process that data quickly and translate it out in real time. For example, you might have a windowless vehicle that has sensors all around it to capture what surrounds the vehicle. You could have a case where that sensor feed gets sent into a node that then distributes it to a number of other nodes, and each of those nodes processes it—it could be for an image—it processes a piece of it very quickly, and then sends it back to a control node that supports viewing the whole screen.

This is applicable anywhere you need real-time processed images. It could be for semiconductor test equipment, where the need is to know in real time whether wafers are good or bad based on processing very high-resolution images of them. You may want to distribute the sensor data on wafer quality to a number of nodes, have each node process its share, and then send the results back. Because of its extremely low latency, PCI Express can speed up that process.
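The fan-out pattern described above can be sketched as a toy simulation: one source "reflects" a sensor frame to several worker nodes, each processes only its assigned slice, and a control node gathers the partial results. The node names, slice boundaries, and the summing "processing" step are all illustrative stand-ins, not Dolphin's software.

```python
# Toy simulation of the reflective-memory fan-out described above: every
# worker sees the whole frame (the reflected copy) but processes only its
# own slice, and a control node reassembles the results. Illustrative only.
from concurrent.futures import ThreadPoolExecutor

def worker(node_id, frame, lo, hi):
    # Each node processes one strip of the frame, e.g. part of a wafer image.
    return node_id, sum(frame[lo:hi])

frame = list(range(1000))          # stand-in for one high-resolution frame
slices = [(0, 250), (250, 500), (500, 750), (750, 1000)]

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(worker, i, frame, lo, hi)
               for i, (lo, hi) in enumerate(slices)]
    # Control node: gather the per-node partial results.
    results = dict(f.result() for f in futures)

total = sum(results.values())
print(total)  # prints 499500
```

In a real reflective-memory deployment the copy to each node happens in hardware over the PCIe fabric, so the distribution step adds essentially no CPU overhead.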

Kohmann: Although super-low-latency Ethernet cards can get latency down to around six microseconds, our latency in the same stack is down to one microsecond, so we are typically five to seven times faster than Ethernet and have significantly less processing overhead.

That is on top of the networking stack. At the shared memory API level we really don't have anything to compare against, because no other API or hardware transport enables CPUs to directly access remote memory; you can't do that with Ethernet, and you can't do that with InfiniBand.

PCI Express also brings new unique features for clustering of compute resources and I/O devices. A kind of global address space between host and I/O devices enables data to flow between units without consuming valuable system memory bandwidth or compute resources. For example, data can flow between an NVMe drive and GPGPUs while the CPUs are doing other tasks.

EECatalog: What are you noticing about how PCI Express deployment has changed or grown?

Kohmann: Rather than using PCI Express hardware and software just for transport, for moving data from System A to System B or from System A to a remote I/O device, more and more people are looking to use PCI Express as a general network running various standard communication protocols.

EECatalog: What’s ahead for Dolphin?

Paraison: We are working on a concept called device lending that builds upon the strengths of PCI Express. With device lending, remote nodes can borrow devices from other nodes (Figure 2). For example, say you have one system with three GPUs and another system with only one GPU, and an application that needs four GPUs to run efficiently. With our software you could borrow the three GPUs from the remote system, and it would be as if all four GPUs were owned by the single-GPU system. As a result, the application can take full advantage of all four GPUs without any changes to the application at all.

This device-lending concept will work with any PCIe device. That could be an NVMe drive, or a Gigabit, 10 Gigabit, or 40 Gigabit Ethernet card; all of these devices could be loaned from one system to another and then, when the borrowing system is finished using them, released back to the lending system.
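The borrow-and-release flow just described can be modeled with a small sketch. Every class and method name here is hypothetical and exists only to make the bookkeeping concrete; Dolphin's actual software operates on PCIe device functions over the fabric, not Python objects.

```python
# Toy model of device lending: a node's visible device list is its local
# devices plus any borrowed from a lender. All names are illustrative;
# this is NOT Dolphin's software interface.

class Node:
    def __init__(self, name, gpus):
        self.name = name
        self.local = list(gpus)     # devices physically in this chassis
        self.borrowed = []          # devices lent in from remote nodes

    def visible_gpus(self):
        # What an unmodified application would see: local + borrowed.
        return self.local + self.borrowed

    def borrow_from(self, lender):
        self.borrowed.extend(lender.local)

    def release_to(self, lender):
        self.borrowed = [g for g in self.borrowed if g not in lender.local]

a = Node("A", ["gpu0", "gpu1", "gpu2"])   # system with three GPUs
b = Node("B", ["gpu3"])                   # system with one GPU

b.borrow_from(a)
print(len(b.visible_gpus()))  # prints 4: the app sees four GPUs

b.release_to(a)
print(len(b.visible_gpus()))  # prints 1: back to the local GPU only
```

The point of the real mechanism is that the "borrow" step changes only what the borrowing host's PCIe view enumerates; the application's GPU API calls are unchanged, as the interview notes.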

Kohmann: One use case I would like to mention is that of users who during the daytime have distributed use for a farm of GPUs, but during the night the system automatically gathers all the GPUs into a larger GPU farm and uses all the compute power to run one big job. The benefit is you don’t have to move the GPUs—the GPUs stay with each computer and the PCI Express network between computers is used to access the GPUs without any software overhead or changes to applications.
