Combining the FPGA with the CPU: Q&A with Altera



FPG-Yay? Will OpenCL inspire cheers from software engineers?

At Supercomputing 2015, where he remarked that “each time I come here it seems that FPGA usage in the data center just keeps growing exponentially,” Altera’s Mario Maccariello took time away from the show floor to field EECatalog questions. Maccariello, business development manager in the company’s Computer Data Center business unit, spoke about OpenCL, how FPGAs and CPUs will play together, and Intel’s acquisition of Altera, among other topics.

EECatalog: Is OpenCL a game changer?

Mario Maccariello

Mario Maccariello, Altera: OpenCL is a game changer because traditionally FPGAs have been great for offloading acceleration, but they have only been easily usable by engineers who understand hardware. If you look at the real world, for every one hardware engineer there are many thousands of software engineers.

We had to find a way of enabling all those software engineers to write code that will run on FPGAs. OpenCL allows them to do that. If they have knowledge of very standard languages like C++ and CUDA, they can write code that can be used by the Altera OpenCL compiler. The compiler then automatically generates the file that is used for FPGA programming.

That’s why we developed our OpenCL environment. It was done by our world class engineering team, largely based in Toronto.

OpenCL enables you to offload portions of the code from the CPU to an FPGA for acceleration through parallelization and pipelining. This happens in a way that is very software programmer friendly. The memory management and other system hardware management tasks, which in the past have been more challenging for a software programmer to handle, are taken care of in a more automatic way.
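The offload model described here can be pictured with a small emulation. This is only a sketch in plain Python, not the Altera OpenCL API: `vector_add_kernel`, `run_nd_range`, and `offloaded_vector_add` are hypothetical names chosen for illustration. The idea it shows is that a kernel is written as the work for a single index, and the runtime, not the programmer, decides how those indices are spread over the hardware.

```python
from typing import Callable, List


def vector_add_kernel(gid: int, a: List[float], b: List[float], out: List[float]) -> None:
    """Body of one work-item: handles only the element at index gid."""
    out[gid] = a[gid] + b[gid]


def run_nd_range(kernel: Callable[..., None], global_size: int, *args) -> None:
    """Hypothetical stand-in for launching a kernel over global_size work-items.

    A real device may run the work-items in parallel or as a pipeline. This
    emulation runs them one after another, which gives the same result here
    because no work-item depends on another.
    """
    for gid in range(global_size):
        kernel(gid, *args)


def offloaded_vector_add(a: List[float], b: List[float]) -> List[float]:
    """Host-side wrapper: allocate the output buffer, launch, return the result."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0.0] * len(a)
    run_nd_range(vector_add_kernel, len(a), a, b, out)
    return out
```

In a real OpenCL flow the kernel would be written in OpenCL C and the buffer handling would go through the vendor's runtime; the Python loop above only stands in for that launch step.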

EECatalog: How about the floating-point operators built into the hardened DSP blocks which Altera now offers in its FPGAs? What is the value of hard floating point to high performance computing?

Maccariello, Altera: Data center applications can really benefit from having hard floating-point DSP on the FPGA. Our customers tell us that time and time again. Our previous-generation FPGA, the Stratix® V series, had some floating-point capability, but it was relatively modest. Our current generation, the Arria® 10 family, is built on a 20nm process and has much more on-chip hard floating-point capability, something we developed in response to customer demand in the data center and other markets. Our newest family, the Stratix 10 FPGAs, is being fabricated on Intel’s 14nm silicon process and brings our hard floating-point capability to 10 teraflops. That becomes a very powerful engine for HPC algorithm acceleration in the data center.

Figure 1: An engineering team based largely in Toronto developed Altera’s OpenCL environment. Courtesy en.wikipedia.org

EECatalog: What is important to you to get across to embedded engineers about CPU-only solutions versus CPU plus FPGA solutions? What would you say to them?

Maccariello, Altera: I would say, try it out. Compare the CPU plus FPGA solution to the CPU-only solution, and you will see significant acceleration benefits in certain applications. Intel, by the way, gets this, and that’s one of the reasons the company is buying Altera.

A lot of software for compute-intensive applications such as convolutional neural networks, analytics, and search can be ported over to an FPGA in a very software engineer friendly way to achieve great benchmark results, again using the OpenCL environment.

Certain code runs very well on CPUs, but applications such as encryption, compression, and other security-related functions, as well as networking and virtualization functions, are actually quite inefficient on a CPU. Many cores on typical multi-core CPUs are being hogged by these functions, which can operate much more efficiently on an FPGA. With an FPGA, you can use pipelining and parallelization, and put some very wide, massively parallel structures in place. So, for example, if you have a data stream coming off high-bandwidth memory, instead of having a CPU deal with it sequentially, you can distribute it across the whole FPGA chip and potentially have hundreds or thousands of computational units handling it in parallel.
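To make the two ideas concrete, here is a rough Python sketch. It is a software model only, with hypothetical function names, and it assumes an idealized pipeline in which every stage takes one cycle and never stalls. The first part deals a stream across several lanes, standing in for parallel computational units; the second part compares cycle counts for sequential and pipelined processing.

```python
from typing import Callable, List, Sequence


def split_into_lanes(stream: Sequence[int], lanes: int) -> List[List[int]]:
    """Deal the stream round-robin across `lanes` compute units."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    return [list(stream[i::lanes]) for i in range(lanes)]


def process_in_lanes(stream: Sequence[int], lanes: int, op: Callable[[int], int]) -> List[int]:
    """Apply op to every item lane by lane, then restore the original order."""
    lane_inputs = split_into_lanes(stream, lanes)
    # Each inner list models the work of one independent compute unit.
    lane_outputs = [[op(x) for x in lane] for lane in lane_inputs]
    result: List[int] = [0] * len(stream)
    for i, lane in enumerate(lane_outputs):
        result[i::lanes] = lane  # put lane i's results back at its positions
    return result


def sequential_cycles(items: int, stages: int) -> int:
    """Each item passes through all stages before the next one starts."""
    return items * stages


def pipelined_cycles(items: int, stages: int) -> int:
    """Once the pipeline is full, one item completes every cycle."""
    if items == 0:
        return 0
    return stages + items - 1
```

Under these assumptions, 1,000 items through a 5-stage datapath take 5,000 cycles one at a time but 1,004 cycles when pipelined, and splitting the stream over several lanes divides the per-lane item count further. Real designs will differ with memory bandwidth and stalls.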

Today, a lot of the work that is moving toward FPGAs is performed on GPUs, but GPUs are very power-intensive, and FPGAs use far less electricity. Benchmarks are application-dependent, but the typical numbers I have seen average about one fifth of the power of a GPU. Many applications, such as reverse time migration algorithms for oil and gas seismic exploration, will migrate to FPGAs going forward. At a recent conference, Microsoft described FPGAs as its strategy for dealing with the end of Moore’s Law.

EECatalog: You mentioned earlier one reason Intel is buying Altera. What are some of the other reasons?

Maccariello, Altera: Intel sees a lot of applicability for FPGAs in a number of different markets—first and foremost in the data center. In a few years, Intel sees maybe a third of the cloud service provider nodes really being able to benefit from FPGA acceleration.

We have seen this mirrored in many other customers, including IBM and Microsoft.

Intel also sees FPGAs benefitting the IoT segment, running the gamut from consumer to automotive to industrial, as well as in applications where lots of data needs to be stored and analyzed in data centers. Intel sees a massive advantage in coupling its technology with FPGAs.

You may have seen a public roadmap from Intel that shows a CPU sitting next to an FPGA as two discrete devices, followed by a roadmap showing a multi-chip module with a CPU die and an FPGA die in a single package, allowing for very low latency and very high bandwidth interconnection between the two devices. The third step of that roadmap would be a monolithic piece of silicon—one chip that is part CPU and part FPGA. The FPGA can perform tasks such as compression, filtering, encryption and networking, and also accelerate workloads. That’s one reason why Intel bought us.

Another reason is that Intel wants to grow and help to expand Altera’s traditional FPGA business. Altera’s 12,000 plus customers can look forward to benefitting from Intel’s vast engineering and manufacturing expertise in generations of devices to come.

EECatalog: What should our readers understand about Altera, both as it is today and as part of Intel?

Maccariello, Altera: I am going to boil it down into two messages. The first message is that in the standard FPGA business—where Altera has been extremely successful since we were founded—it’s business as usual. We are going to continue with that business—the low end, the mid range, the high performance. And we will have access to some of Intel’s advanced processes and technologies that, quite frankly, nobody else has.

The second message is really for the IoT and the data center. It is: watch this space. Combining the FPGA with the CPU will unleash some very cool features. We are about to witness a very exciting time in the industry.

Contact Information

Altera

101 Innovation Drive
San Jose, CA, 95134
USA

tele: (408) 544-7000
http://www.altera.com/
