DSP-Core Evolution Powers Advancements in Communications and Multimedia Technologies
Market-specific DSPs will form the basis of next-generation SoCs
Digital signal processors (DSPs) can be found in nearly all of our day-to-day consumer devices, including mobile handsets, digital cameras, navigation devices, TVs, DVD players, and game consoles. They also are ubiquitous in multimedia, telecommunications, and networking applications. To understand why and when a DSP is necessary, a good starting point is the application itself. For example, mobile handsets usually require at least two DSP engines: one for communication tasks and the other for application processing. On the communications side, the voice signal on one end of the call needs to be digitized and compressed (a typical DSP function), modulated onto a wireless signal (another DSP task), transferred through the wireless infrastructure to the other end of the call, and then demodulated and decompressed (DSP again). On the application-processing side of a mobile handset, data files containing video, pictures, and audio need to be decoded and sent to the device’s screen, speakers, and headsets. All of these are very typical DSP tasks, although they’re very different in nature from the DSP tasks required for the communication channel.
With wireless communications able to reach 100 Mbits/s on fourth-generation (4G) wireless channels, the tasks carried by DSPs have evolved quite significantly. The modulation and demodulation of such high bit rates require some advanced algorithms including multiple antennas (also known as multiple-input multiple-output or MIMO), multicarrier modulation, and quadrature-amplitude-modulation (QAM) schemes. Correspondingly, DSP engines have evolved to meet these requirements in various ways, thereby turning into application-specific (or communication-specific) DSPs.
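As a rough illustration of the QAM schemes mentioned above, the following minimal sketch maps groups of four bits onto 16-QAM constellation points and recovers them with a nearest-level decision. The Gray-coded level table and the function names are illustrative assumptions and are not taken from any particular wireless standard.

```python
# Hypothetical Gray-coded 16-QAM mapper (illustrative assumption, not a
# standard-specific table): two bits select the level on each axis.
GRAY_2BIT_TO_LEVEL = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def qam16_modulate(bits):
    """Map a bit sequence (length a multiple of 4) to complex 16-QAM symbols."""
    if len(bits) % 4 != 0:
        raise ValueError("bit count must be a multiple of 4")
    symbols = []
    for i in range(0, len(bits), 4):
        in_phase = GRAY_2BIT_TO_LEVEL[(bits[i], bits[i + 1])]
        quadrature = GRAY_2BIT_TO_LEVEL[(bits[i + 2], bits[i + 3])]
        symbols.append(complex(in_phase, quadrature))
    return symbols


def qam16_demodulate(symbols):
    """Hard-decision demapper: choose the nearest level on each axis."""
    level_to_bits = {level: pair for pair, level in GRAY_2BIT_TO_LEVEL.items()}
    bits = []
    for symbol in symbols:
        for component in (symbol.real, symbol.imag):
            nearest = min(level_to_bits, key=lambda level: abs(level - component))
            bits.extend(level_to_bits[nearest])
    return bits
```

Each symbol carries four bits, which is how denser constellations raise the bit rate for a given symbol rate; the per-symbol arithmetic is the kind of work a communication-specific DSP is built to accelerate.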
Similar progression has been witnessed in other DSP fields, such as video processing. Screen resolution has increased steadily from VGA to full HD (known as 1080p) in recent years. In addition, complex video standards were defined in order to deal with these higher resolutions while maintaining bit rates that were as low as possible. Traditional DSP engines find it difficult to tackle these requirements. As a result, various architectures and design concepts were introduced including multicore designs and single-instruction, multiple-data (SIMD) processing. Here, the DSPs also have evolved into application-specific processors that specialize in multimedia tasks while maintaining their universality.
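To make the SIMD idea concrete, the following minimal sketch applies one addition to four 8-bit lanes packed into a single 32-bit word, a software trick sometimes called SIMD within a register. The function names and lane layout are illustrative assumptions; SIMD hardware performs the same kind of operation with dedicated parallel lanes rather than with masking.

```python
HIGH_BITS = 0x80808080  # top bit of each 8-bit lane
LOW_BITS = 0x7F7F7F7F   # lower seven bits of each lane


def packed_add_u8(a, b):
    """Add four unsigned 8-bit lanes packed in 32-bit words, wrapping per lane."""
    # Add the low 7 bits of every lane; no carry can cross a lane boundary.
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    # Fold each lane's top bits back in with XOR, which drops the carry-out.
    return low_sum ^ ((a ^ b) & HIGH_BITS)


def unpack_u8(word):
    """Split a 32-bit word into its four 8-bit lanes, most significant first."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]
```

A single call processes four pixel-sized values at once, which is the property that makes SIMD attractive for video workloads.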
Methods For Tackling DSP Applications
There are several ways to obtain DSP functionality in a system-on-a-chip (SoC). The hardwired method, whereby algorithms are implemented completely in hardware, will not be covered in this article. Instead, we will focus on programmable and semi-programmable approaches to DSP implementations including central processing units (CPUs), microcontroller units (MCUs), and native DSPs.
Typical examples of CPUs include the Pentium and PowerPC processors. These extremely high-performance processors are general purpose in nature and run at high clock speeds of 2 GHz or more. Usually, they aren’t suitable for embedded applications. CPUs are used in systems like personal computers, workstations, and notebooks. They can handle signal processing. Because they run at extremely high frequencies, however, their high power consumption creates an issue with battery life. Also, such CPUs are usually too expensive for most consumer devices.
MCUs are at the other end of the scale from CPUs. You may think of MCUs as slower and smaller versions of CPUs like the Pentium. Companies like ARM Holdings and MIPS Technologies offer cores that target embedded applications. As such, they’re mainly suitable for control tasks and are capable of running operating systems (OSs) like Windows Mobile and Linux. Generally, MCUs lack typical DSP support, as their main focus is on the control plane. However, some MCUs offer specific extensions with DSP functionality. This will be covered later in this article.
The third way of obtaining DSP functionality is the native DSP. These DSPs are well suited for math-intensive and data-centric processing tasks. Because they’re designed for embedded applications, their power consumption and circuit size are suitable for use in an SoC. Typical examples of such native DSPs are TI’s C55 and C64, ADI’s Blackfin, and CEVA’s TeakLite and CEVA-X families of DSP cores. Figure 1 maps these various DSP solutions on an X-Y space, thereby describing DSP performance and flexibility. Hardwired accelerators also are mapped. In addition, the power consumption of each solution is indicated by its color.

Figure 1: Here, various DSP solutions are mapped on an X-Y space to detail both DSP performance and flexibility. The power consumption of each solution is signified by its color.
For their part, MCUs are fully programmable and general purpose in nature. Thus, they can perform various functions and are suitable for different applications in an SoC. But MCUs don’t achieve high performance in DSP functions. After all, they aren’t designed for this purpose.
Compared to MCUs, CPUs are faster at performing DSP functions if they are run in the 1-to-2-GHz range. Unless there’s sufficient battery capacity, however, it’s highly unlikely that a CPU can be used for DSP functions in a handheld product. Hardwired engines obviously provide the highest DSP performance at the lowest power consumption, but their flexibility is significantly impaired.
Thus far, native DSPs have provided a good tradeoff between all three aspects: flexibility, performance, and power consumption. With recent developments in various DSP applications, the embedded-processor landscape has evolved with new sub-categories.
As DSP Tasks Evolve, So Do DSP Processors
In recent years, DSP requirements in various applications have evolved significantly. An analysis of audio requirements, for example, shows how several advancements taking place simultaneously have increased the DSP workload more than five-fold in a very short period of time. Here is a quick list of some of these advancements in home-audio applications:
- Increasing the number of channels from two (stereo) to 5.1 (typical Dolby Digital devices) to 7.1 (typical Blu-ray Disc players) per stream
- Increasing the number of streams that need to be decoded and mixed together from one (typical A/V receiver) to two (typical DVD use case) and three (Blu-ray Disc use case)
- Increasing the bit rate of each decoded stream from 48 kbits/s (MP3) to 24 Mbits/s (DTS-HD MA)
- Increasing the complexity of audio codecs and post-processing functions
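The individual growth factors in the list above can be worked out with simple arithmetic, as in the minimal sketch below. It assumes that 7.1 is counted as eight discrete channels, and the helper name is hypothetical. How these factors combine into an actual processing load depends on the codec, so they are reported separately rather than multiplied together.

```python
def growth_factor(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


# Figures taken from the list above; 7.1 is counted as 8 discrete channels.
channel_growth = growth_factor(2, 8)                  # stereo to 7.1
stream_growth = growth_factor(1, 3)                   # one stream to three
bit_rate_growth = growth_factor(48_000, 24_000_000)   # 48 kbits/s to 24 Mbits/s

print(channel_growth, stream_growth, bit_rate_growth)
```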
A similar pattern can be found in video applications and the requirements they pose on today’s DSPs. There, higher resolutions, higher bit rates, and more complex video standards and tool-boxes have been introduced. When it comes to wireless-communications applications, analyzing the bit-rate evolution from 3G to 4G wireless standards clearly illustrates this trend. The increase is more exponential than linear. Figure 2 depicts these advancements in all three DSP fields.

Figure 2: New audio, video, and communications applications are placing more stringent demands on today’s DSPs.
In order to meet the mounting requirements for DSP horsepower, each of the three processor-based alternatives has evolved in somewhat different directions. MCU vendors have tried to approach the rising requirements by adding some DSP capabilities into their architecture and extending the ISA of these processors with dedicated DSP instructions. The ARM Cortex-A8 is a typical example, where a SIMD accelerator (the ARM NEON) is attached to a CPU. Some vendors take a slightly different direction, allowing their programmers to extend the MCUs with their own instructions. In both cases, such DSP extensions could provide a good system tradeoff for basic DSP functions. For mid- to high-level processing requirements, however, such MCUs would be far from viable solutions. They also present severe pitfalls in the product roadmap and reuse.
CPU vendors have come to understand that simply running the processor faster won’t be enough to meet rising requirements. Multi-processor architectures provide an alternative solution by enabling a higher degree of parallelism. In doing so, however, they become even less able to meet the stringent power budgets of portable devices. For other, more demanding use cases, CPU vendors have simply decided to focus on their core expertise and leave the heavy-duty DSP applications to other solutions. For example, no CPU (or multicore CPU) would be capable of effectively decoding video at 1080p resolution or running LTE baseband. Such tasks are left to market-specific DSPs.
These market-specific DSPs have become the alternative of choice for DSP vendors that are trying to meet high-level digital-signal-processing requirements. In the past, general-purpose DSPs have flourished by offering a high level of processing that was useful for various end markets. The reusability and universality of such catalog DSPs have made them very common in the market and widely adopted by developers and partners. While such general-purpose DSPs still serve some markets, a more market-specific approach is required in the higher tier—one that would use an ISA that supports market-specific features (e.g., 4×4 matrix calculations required for the latest wireless-communications standards). Such market-specific DSPs can serve a complete market more precisely, providing the much-required horsepower without limiting the reusability of such an architecture for various solutions in this market (e.g., a wireless-communications DSP that can be efficiently used for LTE, WiMAX, HSPA+, EV-DO, etc.). Figure 3 maps these evolving DSP solutions onto the same X-Y space shown earlier:

Figure 3: Market-specific DSPs can more easily serve a complete market by providing the required horsepower without limiting the architecture’s potential to be reused for other solutions in that market.
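As an illustration of the 4×4 matrix calculations mentioned above, the following minimal sketch multiplies a 4×4 complex matrix by a four-element vector, the kind of operation a four-antenna MIMO receiver repeats for every received symbol vector. The function name is hypothetical, and treating the matrix as a channel estimate is an assumption made for illustration only.

```python
def matvec4(matrix, vector):
    """Multiply a 4x4 complex matrix by a length-4 complex vector."""
    if len(matrix) != 4 or any(len(row) != 4 for row in matrix) or len(vector) != 4:
        raise ValueError("expected a 4x4 matrix and a length-4 vector")
    # Each output element needs four complex multiply-accumulates,
    # so one product costs 16 complex MACs in total.
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]
```

A general-purpose core executes those 16 complex multiply-accumulates as a long sequence of scalar operations, whereas an ISA with dedicated matrix support can issue them in far fewer instructions.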
For MCU vendors, one way to move up the performance ladder is to offer DSP instructions that extend their architectures. Some vendors offer MCUs with a pre-defined DSP ISA extension (e.g., ARM NEON). Others offer a set of various possible DSP ISA extensions and let the user choose. Yet another set of vendors allows the licensee to come up with its own DSP instructions. In all cases, these DSP ISA extensions aren’t inherently embedded into the processor. As a result, they could be better regarded as slave accelerators to a central microcontroller.
One needs to be aware of certain pitfalls when considering the use of an extended MCU for advanced DSP chores:
- Adding functional units, such as MAC units and adders, is effective only if memory accesses can keep those units supplied with data. Typically, MCUs support only a single memory access at a time, use large flat memories, and are subject to various data-hazard restrictions. This is very different from even the most basic DSP architectures and would quickly erode the added value of any DSP extension.
- Due to the nature of the control functions that they need to support, MCUs usually rely heavily on cached memory architectures with privilege modes and virtual memory support. Direct-memory-access (DMA) controllers, on the other hand, are less commonly used in MCUs. Hence, a basic DMA controller could be offered as an add-on to an MCU. In highly data-intensive DSP applications, advanced DMA is a prerequisite due to the real-time nature of the application and its requirement for deterministic processing. As opposed to extended MCUs, advanced DSPs inherently support DMA mechanisms within the processor architecture.
- Extending an MCU with DSP instructions usually doesn’t involve any major changes in the addressing mechanism. Thus, the unique addressing modes that are very typical in DSP applications due to their predictable data access patterns (e.g., cyclic buffers and bit reverse) are usually not supported by such extended MCUs.
- Because such DSP extensions aren’t an inherent part of the MCU, they would require the usage of compiler intrinsics in order to make good use of these DSP capabilities. That is, straightforward C code could not be easily compiled into optimized assembly code running on such DSP extensions. For native DSP processors, such compiler support is essentially embedded in the compiler-architecture mutual design.
- MCUs typically don’t support the unique DSP data types required by some algorithms (e.g., the 10-bit and 12-bit elements common in advanced wireless applications). In addition, numeric precision in an MCU is usually limited to 32 bits. For some applications, a larger dynamic range is required: up to 72-bit data including guard bits, along with saturation hardware and rounding mechanisms. These are usually supported by market-specific DSPs according to the requirements that are unique to an application.
- In such extended MCUs, the limited parallelism usually prevents the programmer from using the MCU and its DSP extensions at the same time. Even though the processing horsepower might be available, it cannot be utilized in an optimized way.
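Several of the DSP features named in the list above (multiply-accumulate, cyclic buffers, bit-reversed addressing, guard bits, saturation, and rounding) can be modeled in software. The following is a minimal sketch under stated assumptions: the 40-bit accumulator width, the Q15 sample format, and all names are illustrative choices and do not describe any specific DSP core. A native DSP performs the modulo address update and the saturation in hardware at no extra cycle cost, while a plain MCU must spend instructions on each.

```python
ACC_BITS = 40  # assumed accumulator width: 32 data bits plus 8 guard bits


def saturate(value, bits):
    """Clamp a signed integer to the range of a `bits`-wide two's-complement word."""
    highest = (1 << (bits - 1)) - 1
    lowest = -(1 << (bits - 1))
    return max(lowest, min(highest, value))


class CircularFir:
    """FIR filter over Q15 samples using a cyclic (modulo-addressed) delay line."""

    def __init__(self, coeffs_q15):
        self.coeffs = list(coeffs_q15)
        self.delay = [0] * len(self.coeffs)
        self.head = 0  # index of the most recent sample

    def step(self, sample_q15):
        n = len(self.delay)
        self.head = (self.head + 1) % n  # modulo address update
        self.delay[self.head] = sample_q15
        acc = 0
        for k, coeff in enumerate(self.coeffs):
            # One multiply-accumulate per tap, clamped to the accumulator width.
            acc = saturate(acc + coeff * self.delay[(self.head - k) % n], ACC_BITS)
        # Round, scale the Q30 sum back to Q15, and saturate to 16 bits.
        return saturate((acc + (1 << 14)) >> 15, 16)


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`, as used for FFT data ordering."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result
```

For example, a two-tap filter with both coefficients set to 16384 (0.5 in Q15) averages consecutive samples, and `bit_reverse` over indices 0 to 7 with three bits yields the access order an 8-point FFT would use.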
For extended MCUs that offer the programmer a set of DSP extensions to choose from, or even the option to extend the ISA with his or her own set of DSP instructions, another issue is that the resulting processor is “non-standard.” Such an “a la carte” processor design could be appealing at first glance. Yet software maintenance will be extremely difficult because the designer will need to port his or her code over and over again from one architecture to another without being able to reuse it. In addition, code that was developed by a third-party software vendor won’t necessarily run on the designer’s configuration because some instructions and mechanisms could be different. Because this becomes a proprietary processor architecture, the MCU vendor won’t be able to maintain its roadmap. Instead, the user (and creator) of this specific MCU configuration will need to keep maintaining his or her own processor roadmap in order to sustain a complete product roadmap.
With continuous advancements in wireless communications and multimedia technologies, the design challenges facing engineers are becoming increasingly complex. Traditional DSP engines are still viable solutions for many applications where general-purpose DSP horsepower is required for future enhancements. Examples include portable multimedia devices. For more complex tasks, such as 4G wireless communications and HD video/audio processing, a new market-specific approach is needed. Multicore designs, which originated from the CPU domain and propagated into the MCU domain as well, are one method of increasing performance levels. Yet they also raise power consumption. Another method proposed by some MCU vendors is to offer various DSP ISA extensions. DSP vendors, on the other hand, are turning their new DSP designs into market-specific DSPs. The key issue remains how well each of these alternatives delivers this functionality in terms of performance, flexibility, and efficiency.
All proposed DSP solutions—including native DSPs and CPUs—can and should co-exist in products that require the optimized performance and functionality provided by each type of device. As multicore platforms and SoC approaches continue to proliferate, designers should look for ways to incorporate DSP functionality that efficiently complements their CPU of choice, thereby reducing the system’s overall power consumption.
As discussed, extended MCUs treat DSP capabilities as slave accelerators instead of an inherent part of the architecture—as is typical for market-specific DSPs. Furthermore, extended MCUs—whereby the customer can define his or her own set of DSP instructions—can be considered a single-point solution. They cannot be used and reused for various products in the same product line. Such MCUs also lack roadmap and code compatibility. Ultimately, they lack the universality of market-specific DSPs, whereby a single DSP architecture can serve a complete product line for a specific market. In essence, these are the critical advantages delivered by market-specific DSPs. Such issues also explain why market-specific DSPs will form the basis of next-generation SoCs.

Eran Briman serves as Vice President of Corporate Marketing for CEVA. Previously, Mr. Briman served as Senior Director of Marketing, specializing in wireless communications and multimedia applications. Prior to that, he was the Chief Architect for CEVA, with overall responsibility for the research and development of next-generation DSP cores. Mr. Briman holds a B.Sc. in Electronic Engineering from Tel-Aviv University and an MBA from the Kellogg School of Management at Northwestern University, and he holds several patents on DSP technology.