Virtual Video Transcoding in the Cloud

An x86-based way to solve the problem of video processing in the cloud


Figure 1: Research has recently examined how traditional TV viewing is on the wane.

User habits are changing from traditional broadcast television consumption (see top “Linear” segment, Figure 1) to dynamic, on-demand and mobile video viewing (see bottom “Multiscreen” segment, Figure 1). Three recent studies underlined the rapid adoption of “non-traditional” viewing and the decline of the “TV-centric living room.” At the same time, the research underscores how second-screen usage, especially during special events such as the Super Bowl, is attracting more viewers.

In the U.K., for example, only 50 percent of adults online “consider the TV set as the focal point of their living rooms,” and 70 percent of adults routinely use a connected device while watching TV, the Internet Advertising Bureau UK (IAB) found in a recent study.

Among British viewers aged 16 to 34, 87 percent are using a second screen while in front of a TV set. Although the data represent the current British experience, the numbers are comparable to North American usage.

Two reports on U.S. Super Bowl viewing attested to the non-traditional viewing patterns. Think with Google, an advertising research unit within the media giant, monitored the ways in which U.S. TV viewers interacted with Super Bowl 50 and found that 82 percent of ad-related searches conducted during the game telecast were done on mobile phones, compared with 7 percent on tablets and 11 percent on desktop computers.

Separately, Localytics tracked the average number of apps launched during the Super Bowl by mobile users. It found that app usage ran high throughout the game, with multitasking viewers using an average of 3.2 social networks and 1.9 sports apps.

Streaming video to devices strains the networks, which are rapidly expanding their capability to handle both traditional linear broadcast video distribution and the more dynamic multiscreen video. The landscape of the carrier business is also changing for video. Multi-service operators (MSOs) and telecom service providers show a growing preference for cloud technology to provision adequate processing, streaming and storage resources for the media market. However, the current performance level of video in the cloud is far from optimal.

Accelerated Video Cloud

Typical cloud implementations are based on multiple identical servers that can be used as needed for varying tasks, with different scale and footprint requirements as specified by the application. The processors are Network Function Virtualization Infrastructure (NFVI) nodes in a sea of cloud computing resources.

For video, however, a server tasked with video streaming supports far fewer users than a server used for web browsing, making the addition of video processing and transcoding resources to cloud servers a must.

An accelerated video cloud can address operators’ growing preference for standard servers in the cloud, without the need for special appliances. It can also meet video applications’ increasing density and high-quality processing demands. The accelerated video cloud offers the ubiquity of standard server-based resources, with the added benefit of the higher-performance, higher-density video processing needed to support today’s users.

Using Intel Processor Graphics

In 2011, Intel® introduced Quick Sync Video (QSV) on its integrated processor graphics with the Intel 2nd Generation Core™ product line. QSV builds on the decoder hardware already available in previous generations of Intel HD processor graphics. It features a flexible architecture for H.264 encoding, with video quality and performance improvements in each subsequent generation.

In 2013, with its Intel 4th Generation Core product line, the company introduced the Iris™ Pro processor graphics, which added MPEG2 encoder capability and increased video quality with the addition of more motion estimation engines. It also added 128MB of embedded DRAM, which gives the GPU a high-bandwidth memory path (70 GBps compared to 25 GBps for the dual-channel DDR3 memory interface).

Figure 2: Comparing approaches to performance, power and flexibility challenges.

In 2015, with the Intel 5th Generation Core product line, Intel increased the number of execution units in the Intel Iris Pro processor graphics from 40 to 48, provided 50 percent more motion estimation engine capability per slice and added a second multi-format codec engine to increase the decode and entropy coding capacity.

For transcoding, Quick Sync Video offers advantages over other architectures (Figure 2). For example, traditional fixed-function hardware features low power and high performance, but cannot adapt quickly when the code needs an update. Software codecs, on the other hand, put flexibility first, but do so at the expense of power and performance.

By using hardware for the functions of a codec that don’t change and software running on compute elements in the GPU for functions that can, Quick Sync Video balances high performance and low power demands. At the same time, it preserves much of the flexibility required to improve the codec over time.

Figure 3: The QSV ENC and PAK elements

Two main elements make up the Quick Sync Video solution. One is the “ENC,” which combines hardware acceleration called the Media Sampler, which provides efficient motion search, with software running on the programmable Execution Unit array. The second element, the “PAK,” reuses logic from the Multi-Format Codec Engine. It is a complete hardware unit that performs pixel reconstruction, quantization, entropy encoding and the like.
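
To make the term “motion search” concrete, the following is a minimal software sketch of exhaustive block matching using the sum of absolute differences (SAD). It is a textbook illustration only and is not a description of how the Media Sampler is implemented; the function names, block size, search range and test frames are all assumptions chosen for the example.

```python
# Toy exhaustive block-matching motion search (illustrative assumption,
# not the Media Sampler's actual algorithm).

def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, size=4, search_range=2):
    """Return ((dx, dy), cost) for the lowest-SAD displacement in the window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block would leave the frame.
            if bx + dx < 0 or bx + dx + size > width:
                continue
            if by + dy < 0 or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# Hypothetical 12x12 test frames: `cur` is `ref` moved one pixel to the right.
ref = [[(7 * x + 13 * y) % 256 for x in range(12)] for y in range(12)]
cur = [[(7 * (x - 1) + 13 * y) % 256 for x in range(12)] for y in range(12)]

print(motion_search(cur, ref, bx=4, by=4))  # ((-1, 0), 0)
```

The search finds that the block in the current frame matches the reference frame one pixel to the left with zero error. The nested loops show why this step is expensive in pure software and is a natural candidate for hardware assistance.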

The combination of hardware and software allows Quick Sync Video to greatly reduce the power requirements for transcode on Intel Architecture while simultaneously increasing the performance capability.

Conclusion: Benefits of This Approach

Video acceleration in a virtual network offers several benefits.

Benefit #1: Reduced Capital Equipment Spending

The benefits of an accelerated approach stem mainly from the reduction in server footprint in the datacenter and the reduced complexity of managing those resources. Network Function Virtualization enables providers to change the type and level of resources needed dynamically, and this applies to video transcoding as the VNF in the use cases above.

Figure 4: The savings to the service providers in terms of CAPEX equate to spending 74-83 percent less on equipment alone.

Benefit #2: Power Savings and Reduced Overhead Cost

Figure 5: The savings to service providers in terms of OPEX equate to spending $925 per year versus $6,661 or $9,991 on an annual basis, a savings of 86-91 percent in power costs.
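
The percentage range in the Figure 5 caption can be checked from the three dollar amounts it quotes. The short sketch below does only that arithmetic; the function name is an assumption introduced for the example, and the dollar figures are taken directly from the caption.

```python
def percent_savings(accelerated_cost, baseline_cost):
    """Percentage saved when spending accelerated_cost instead of baseline_cost."""
    return 100.0 * (1.0 - accelerated_cost / baseline_cost)


accelerated = 925          # annual power cost quoted for the accelerated approach
baselines = [6661, 9991]   # annual power costs quoted for the two alternatives

for baseline in baselines:
    saved = percent_savings(accelerated, baseline)
    print(f"${baseline:,} vs ${accelerated:,}: {saved:.1f} percent saved")
```

The two results, roughly 86.1 and 90.7 percent, round to the 86-91 percent range stated in the caption.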

Benefit #3: Scalability

When network demand for video transcoding increases or decreases, an accelerated approach also allows resources to be scaled up and down at lower cost, because the number of video transcodes can be addressed by adding cards to a smaller population of servers. Having fewer servers in the network contributes meaningful operating cost reductions, as noted above. So, as service providers expand into premium OTT video services, add-on cards can gradually increase density to the levels required without capital equipment expenditures as significant as those of traditional methods.
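
The scaling argument can be sketched as a simple capacity calculation. Every number below is a hypothetical placeholder and not a figure from this article or from any product datasheet; the function name and parameters are likewise assumptions for illustration.

```python
def servers_needed(concurrent_streams, streams_per_card, cards_per_server):
    """Smallest number of servers able to carry the requested transcodes."""
    if min(concurrent_streams, streams_per_card, cards_per_server) <= 0:
        raise ValueError("all arguments must be positive")
    streams_per_server = streams_per_card * cards_per_server
    # Integer ceiling division: round up to a whole server.
    return -(-concurrent_streams // streams_per_server)


# Hypothetical example: 1,000 concurrent transcodes, 20 per add-on card.
for cards in (1, 2, 4):
    print(cards, "card(s) per server ->", servers_needed(1000, 20, cards), "servers")
```

Under these assumed numbers, moving from one card to four cards per server reduces the server count from 50 to 13, which is the kind of footprint reduction the paragraph above describes.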

Benefit #4: Ease of Use through Ubiquity of x86 processing in the Cloud

An x86-based way to solve the problem of video processing in the cloud has an important benefit for equipment vendors: Intel technology delivers a familiar, easy-to-use API that speeds development and time to market. The Intel Media Server Studio enables the transition from a pure software model to a media-offloaded acceleration model with the same capability to run Windows, Linux, Quick Sync Video and API libraries, at a higher density that delivers the maximum number of streams per rack unit for video applications.

Artesyn offers a range of voice and video acceleration products with its SharpStreamer™ family of PCI Express cards, as well as 1U, 2U NEBS and 3U NEBS servers with third-party enabling and application layer software tested through an ecosystem verification program.


Linsey Miller is vice president of marketing for embedded computing at Artesyn Embedded Technologies. She has previously held senior sales and marketing positions with Emerson Network Power, Interphase and Verizon.
