Edge Computing Brings Performance to the Heterogeneous IoT



The Internet of Things (IoT) will transform the way we interact with each other and with our devices, and the way our devices interact with one another. But processing, powering and storing the enormous quantity of IoT data requires that we evolve the architecture, from the data center out beyond the cloud and right to the edge where the sensors live.

To meet the real-time processing, data and analysis demands of the IoT, two fundamental changes to the Internet and its compute resources are essential. First, the network architecture needs to be re-envisioned, moving from a cloud-centric (one-to-many) architecture to a topology where substantial compute resources reside at the edge. There will still be cloud computing when needed, but edge-based processing in the nodes themselves or via localized intelligent gateways could better handle future IoT demands.
Second, although quite powerful, today’s multicore processors won’t be nearly powerful enough for tomorrow’s computing needs. Instead, heterogeneous processing resources, the kind that combine a CPU with a GPU and other specialized accelerators, will become increasingly popular as a means of meeting the growing performance demands of devices ranging from the edge to the data center.

Figure 1: Moving from a cloud-centric one-to-many architecture to edge computing can reduce traffic and latency and improve power performance. (Source: NTT EDGE COMPUTING PLATFORM.)

Innovation of Things
What is the IoT? Not to coin a new buzzword, but the IoT is really the Innovation of Things. If we want to see the IoT become a reality, it will take great technology and market demand. There is no value in connecting things (appliances, devices) to the Internet simply for the sake of connecting them. The value comes when the data is used to generate ideas, analytics or capabilities that have not existed before. Ultimately, the IoT will influence nearly every market. [Editor’s note: For an example of the IoT’s promise, see the Extension Media Intelligent Vending Machines blog.]

The industry is moving to develop a seamlessly connected embedded world where, for example, a heart monitor embedded in a shirt provides real-time data to the doctor, who sends an updated prescription to our pharmacist, who in turn sends an alert to our smart watch as we drive home to say the medication is ready to collect, and our car automatically updates the GPS route to the pharmacy, where we arrive and pay for the prescription using our smartphone.

The key here is that the IoT is not just about connecting things—it is about innovating things. To do this, we need to look at the process a little differently and identify the processing requirements at multiple IoT nodes from the data source through the cloud to the back-end Web serving and processing.

There are a lot of foundational technologies that work well with the IoT. We’ve all seen the emergence of several popular architectures responsible for powering the different waves of the computing timeline: Power, x86, MIPS, and ARM®, to name a few. They have all been, or still are, responsible for enabling key advances in computing.

It’s notable that AMD is the only company to embrace and invest in both the 64-bit x86 and ARM architectures. By 2016, x86 and ARM architectures are expected to account for more than 80 percent of the embedded total addressable market (TAM). Supporting both architectures gives AMD customers the ability to leverage their software investments and to balance processing needs according to each IoT node’s requirements. If IoT breakthroughs are to happen, different kinds of architectures, including x86, ARM and others, will be required.

One exemplary success story is Nest, which created 802.11-connected thermostats. A thermostat is not new, but following the success of Nest, the connected thermostat and the “smart home”, generally thought of as a home whose lighting, heating and electronic devices can be controlled remotely by phone or computer, now form a market shaped by the IoT. Rather than promoting the IoT as a solution in search of a problem, we want to look at vertical markets and make the IoT valuable to consumers.

Back to the Future
It is clear from conferences like Embedded World and thousands of media headlines that the IoT is at the front of our minds. One would think the IoT is an overnight sensation. But it’s not. It’s been decades (and depending on the application—take agriculture for example—even centuries or millennia) in the making. The IoT encompasses a broad range of topics with completely new ideas, like wearables, and things that have been around for a long time, such as sensors, or industrial plant and agricultural machinery. In addition to avenues for new value, the IoT is shaping a lot of what is already there.

Back in the 1990s, during the foundation of the IoT, an early milestone was connecting one billion people to the Internet. The next phase came in the 2000s, when mobile computing, smartphones and tablets took off quickly, with numbers doubling to two billion. Now we’re working on the next phase, with an expectation of connecting at least 33 billion units by 2020, of which up to 26 billion are expected to be IoT devices.

While the demands on the network scale exponentially with all of those devices, the processing resources available to them do not. That means the way we architect the infrastructure for the IoT needs to be very different from what came before. The first challenge for the IoT is re-envisioning the network architecture itself. One possibility is shown in Figure 1.

Living on the Edge
Let’s consider how the IoT is “architected”—if that’s the correct term since it actually grew in an ad hoc manner—from the nodes at the edge to the cloud. It’s a misnomer to say everything is done in the cloud and that client-to-cloud is the model for the IoT. Recall the 26 billion nodes mentioned above, and consider what it would mean if all that data, whether from a thermostat, traffic flow sensor, automated factory, or a surveillance camera, were to go up into the cloud.

All of these devices have dramatically different needs: some require low latency, some high bandwidth, and some communicate only infrequently. To be specific, in smart home automation, whether a garage door is open or closed is a binary signal; you can afford to sample it at f = 1 Hz or slower. Contrast this with autonomous vehicles, which have very different requirements in terms of data, latency and reliability (see Figure 1). I’d expect them to send fault-tolerant IoT telemetry in milliseconds or faster. Add data from lots of vehicles along with all of the other Internet traffic, whether from Skype, Netflix, YouTube or other sources, and suddenly sending all that data to the cloud is a massive problem, especially if your autonomous vehicle is expecting a rapid response.
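To make the contrast concrete, a quick back-of-the-envelope calculation shows how far apart these two classes of node sit. The sampling rates and payload sizes below are illustrative assumptions, not measured figures from any real deployment.

```python
# Rough comparison of daily data volume for two very different IoT nodes
# (all rates and payload sizes are illustrative assumptions).

def daily_bytes(sample_rate_hz: float, payload_bytes: int) -> int:
    """Bytes generated per day by one node sampling at the given rate."""
    return int(sample_rate_hz * payload_bytes * 60 * 60 * 24)

# A garage-door sensor: one small binary reading per second.
garage = daily_bytes(sample_rate_hz=1.0, payload_bytes=16)

# An autonomous vehicle: millisecond-scale telemetry with a richer payload.
vehicle = daily_bytes(sample_rate_hz=1000.0, payload_bytes=512)

print(f"garage door: {garage / 1e6:.2f} MB/day")   # ~1.38 MB/day
print(f"vehicle:     {vehicle / 1e9:.2f} GB/day")  # ~44.24 GB/day
print(f"ratio:       {vehicle // garage}x")        # 32000x
```

Even with these modest assumptions, a single vehicle generates tens of thousands of times more data than the garage-door sensor, which is why hauling every byte to the cloud does not scale.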

The network topology and the distribution of compute resources will have to be different with the IoT. Whether in factory, in home or in the infrastructure, we will start to see new types of technology showing up closer to the edge of the network to make local decisions and avoid the backhaul to the cloud.

Figure 2: An evolved IoT topology doesn’t connect every node to the cloud; rather, connections are made and torn down based upon resource and communication requirements.

Max Headroom for Video
The data center is a good example. If you have a different IoT solution at the edge of the network from the one in your core network and from the one in the mainstream data center, you need three different software stacks, application stacks and development teams. We need to build scalability and interoperability to deploy these applications as they are needed in a much more flexible fashion.

When you look at what goes into the different networking applications you quickly find that it’s in flux. It used to be about packets or switches, but now it is all about signals, data and video. The amount of video content going through networks has grown dramatically. Today two-thirds (66 percent) of Internet traffic is video; forecasts predict it will reach 80 percent within three years. Internet video feeding TVs doubled from 2012 to 2013 and is set to grow fourfold by 2018. Consumer video-on-demand traffic will also double by 2018. (Source: Cisco Visual Networking Index, February 2015.)

With these kinds of numbers, it’s essential to have local video processing close to the edge, especially for an IoT node such as a security camera with facial recognition, or an intelligent vending machine with gesture control. This can include dedicated video transcoding, image processing and enhancement, or graphics rendering. The more heavy lifting involved—especially if response time is crucial—the more CPU, GPU or DSP processing is required at or near the source.

Evolving the IoT Network
Moving intelligence to the edge is key to evolving the network and making it efficient for IoT applications, nodes and sensors. But it’s all very nascent. We need to build a networking world that is extremely scalable in terms of architecture and investments. The new IoT network topology will rely on local resources and as-needed connections to other resources and the cloud (Figure 2). And not every node requires its own processing; some nodes can be grouped.

While local, edge-based computing makes sense embedded within some IoT nodes such as point-of-sale (POS) terminals or multimedia gaming machines, other IoT devices don’t need such heavy lifting. For example, remote pipeline monitors or valves report data only occasionally, so these groups of nodes can be controlled by a single IoT gateway, which itself provides connectivity to a cloud data center (Figure 3). Other IoT nodes such as surveillance cameras could benefit from a gateway since they need only report data when an event occurs such as motion, a recognized face, or other trigger.
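The gateway’s filtering role can be sketched in a few lines. Everything here, the `Gateway` class, the trigger predicate, the reading format, is a hypothetical illustration of the event-triggered reporting pattern described above, not a real product API.

```python
# Minimal sketch of an IoT gateway that forwards readings to the cloud
# only when a trigger condition fires (all names are illustrative).

class Gateway:
    def __init__(self, trigger):
        self.trigger = trigger   # predicate deciding what is worth sending
        self.uplink = []         # stands in for the cloud connection

    def ingest(self, node_id, reading):
        """Called for every local reading; forwards only on a trigger."""
        if self.trigger(reading):
            self.uplink.append((node_id, reading))

# Trigger for surveillance cameras: report only frames with motion.
gw = Gateway(trigger=lambda r: r.get("motion", False))

gw.ingest("cam-1", {"motion": False})   # routine frame, stays local
gw.ingest("cam-1", {"motion": True})    # event: forwarded to the cloud
gw.ingest("cam-2", {"motion": False})   # routine frame, stays local

print(len(gw.uplink))  # 1
```

The same pattern covers the pipeline-valve case: the trigger predicate simply becomes a threshold or schedule check instead of a motion flag.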

Figure 3: An IoT gateway provides edge computing resources where they’re needed, responds faster, and can be updated to add connections and services as required. For example, decisions can be made locally from surveillance camera video (inset) instead of streaming it all back to the cloud for analysis.

Moving to the edge computing model as the foundation of the evolved IoT network architecture positions processing closer to where it is relevant. It provides better performance and service, reduces traffic and lowers latency. But at the IoT node itself, a change in the type of processing resource(s) is also required. That is, we need more powerful compute resources.

Heterogeneous Computing Coming of Age
Today, the divide between CPUs and GPUs has largely disappeared. Heterogeneous computing brings together many different types of processors into one accelerated processing unit (APU). Developers can tap into vast amounts of compute power to increase application performance and enable new user experiences. Heterogeneous computing, specifically the use of the GPU as a co-processor to tackle complex, parallel workloads, is intrinsic to the IoT and a number of other applications.

In 2011, AMD released the world’s first APU, combining a powerful CPU and GPU on a single device. It allowed general-purpose GPUs to provide parallel compute capabilities that were, until then, limited to supercomputers.

This was the first step, delivering better performance for a class of applications along with bill-of-materials cost savings and power advantages. The progression of the heterogeneous architecture is shown in Figure 4.

Next, physical integration moved devices that were connected off-chip by PCIe® onto the die for use in smartphones, tablets, laptop and desktop computers, embedded systems and game consoles.

Figure 4: The progression of integration within the APU model.

We are now moving into the heterogeneous systems era, and not a moment too soon, as the IoT looms large. Looking at all the connected devices and the reams of, and composition of, the data they create, we are seeing the need for more data parallelism, so more things are computing in parallel at the same time. An example might be video surveillance, where the video stream is first acted upon by a GPU to determine if faces are detected in the frame. This heavy-lifting facial recognition task is non-trivial, but once a face is detected, the relevant data set is then sent to a traditional high-performance CPU to match the data against stored look-up table data sets. In this example, two different kinds of processors worked on the same problem and in parallel, adding efficiency.
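The division of labor in that surveillance example can be sketched as a two-stage pipeline: a detection stage (standing in for the GPU) filters frames, and a matching stage (standing in for the CPU) checks detected faces against a stored set, with the two stages running concurrently. The "detection" and "matching" logic below is placeholder string handling, not real computer vision.

```python
# Sketch of the two-stage surveillance pipeline: a detection stage
# (the GPU's role) and a matching stage (the CPU's role) run
# concurrently, connected by a queue.
import queue
import threading

faces_db = {"alice", "bob"}    # stored look-up data set
detected = queue.Queue()       # hand-off between the two stages
matches = []

def gpu_stage(frames):
    """Parallel heavy lifting: find faces in each frame (stubbed)."""
    for frame in frames:
        if frame["face"] is not None:   # placeholder for real detection
            detected.put(frame["face"])
    detected.put(None)                  # end-of-stream marker

def cpu_stage():
    """Match each detected face against the stored data set."""
    while (face := detected.get()) is not None:
        if face in faces_db:
            matches.append(face)

frames = [{"face": None}, {"face": "alice"}, {"face": "mallory"}]
t1 = threading.Thread(target=gpu_stage, args=(frames,))
t2 = threading.Thread(target=cpu_stage)
t1.start(); t2.start()
t1.join(); t2.join()

print(matches)  # ['alice']
```

The point is the structure, not the stubs: two different kinds of processors each take the part of the workload they are best at, and the queue keeps them working in parallel rather than in lockstep.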

Better harnessing GPU technology leads to greater compute efficiency. Compare the FLOPS a GPU can deliver with those of a CPU and the difference is an order of magnitude or more. Incorporating the GPU more effectively through a heterogeneous systems architecture allows software developers to integrate that very strong computational capability more completely into their applications.

GPUs offer far more FLOPS, even TFLOPS, of compute capability than CPUs, with extremely fast and efficient parallel floating-point multiprocessors, but they have a different memory space. Until now, writing software that bridges the two processor types has required difficult programming languages and methods. AMD solved one problem by putting the two processors together on the same device, the APU. This avoids sending traffic over PCIe, and integrating on silicon helps drive down costs, increase performance and reduce latency. But the problem of two memory subsystems remained.

The latest APUs provide hardware integration, including memory that is coherent across CPU and GPU addressing. That alone is inherently far more efficient: you do not have to move memory, you just move pointers. And with the support provided through the HSA Foundation 1.0 specification, mainstream tools such as C++ can now be used to treat the CPU and GPU as peers. This is the full implementation of the heterogeneous systems architecture and a huge advance. It enables easier software development harnessing the massively parallel GPU; it can stimulate new applications across a variety of fields; it allows “write once, run many” across platforms ranging from smartphones to edge gateways to embedded systems to the data center; and it will help reduce microprocessor energy use while accelerating performance. For some categories of applications that acceleration could be very significant. All of this will foster IoT development and execution.
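The "move pointers, not memory" idea can be illustrated, by analogy only, in a few lines of Python: a `memoryview` stands in for a pointer into a coherent shared buffer, while `bytes()` stands in for the copy you would otherwise push across PCIe. This is an analogy for the programming model, not HSA code.

```python
# Analogy for coherent CPU/GPU memory: hand the other processor a
# reference into the same buffer instead of a duplicate of the data.

data = bytearray(b"\x00" * 1024)   # buffer "produced" by the CPU

# Copy model: the GPU side gets its own duplicate of the data.
copy_for_gpu = bytes(data)         # the full 1 KB is moved

# Coherent model: the GPU side gets a view into the same memory.
view_for_gpu = memoryview(data)    # no bytes moved, just a reference

data[0] = 0xFF                     # the producer updates the buffer
print(copy_for_gpu[0])             # 0   -- the copy is stale
print(view_for_gpu[0])             # 255 -- the view sees the update
```

Beyond avoiding the copy itself, the coherent model means both sides always observe the same data, which is what lets C++ code treat the two processors as peers.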

The Payoff: A Faster, More Efficient IoT
The IoT is already bringing many changes today and has the potential for even more innovation. As discussed here, the changes that the IoT brings require a paradigm shift towards more intelligence being placed at the edge of the network. Combined with the increase in compute performance that can be achieved by parallel computing through heterogeneous processor architecture advances, this shift will bring an Innovation of Things.

The sooner we realize that we must work together, the sooner we will overcome the myriad hurdles that stand in the way of the IoT. By working across open standards, alliances and diverse markets, and by continuing to innovate in the network and employ heterogeneous solutions, we will realize the vision of the Internet of Things. The market is moving quickly, so we need to act fast.
