From ARM TechCon: Two Companies Proclaim IoT “Firsts” in mbed Zone

UPDATE: Blog updated 14 Dec 2015 to correct typos in ARM nomenclature.

At ARM’s mbed Zone, Silicon Labs and Zebra Technologies show off two IoT “Firsts”.

ARM’s mbed Zone—a huge dedicated section on the ARM TechCon 2015 exhibit floor—is the place where the hottest things for ARM’s new mbed OS are shown. ARM’s mbed is designed to make it easy to securely connect IoT devices and their data to the cloud. When it was introduced at TechCon 2014 a year ago, mbed was just a concept; now it’s several steps closer to reality.

Watch This!

The wearables market is one of three focus areas for ARM’s development efforts, along with Smart Cities and Smart Home. ARM’s first wearable dev platform is a smart watch worn by ARM IoT Marketing VP Zach Shelby and shown in Figure 1. It’s based on ARM’s wearables reference platform featuring mbed OS integration—with a key feature being power management APIs.

Figure 1: ARM’s smart watch development proof-of-concept, worn by ARM IoT Marketing VP Zach Shelby at ARM TechCon 2015.

According to IoT MCU and sensor supplier Silicon Labs, which co-developed the APIs with ARM, they “provide a foundation for all peripheral interactions in mbed OS” and are designed with low power and long battery life in mind. No one wants to charge a smart watch during the day: that’s a non-starter. The APIs encourage minimal polling, interrupt-driven I/O, and placing peripherals in deep sleep modes, basically wringing every bit of power efficiency out of systems designed for long battery life. Mbed OS clearly continues ARM’s focus on low power while emphasizing IoT ease-of-design.
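To make the idea concrete, here is a minimal C++ sketch of the interrupt-driven, sleep-by-default pattern such power APIs encourage. The HAL calls (enter_deep_sleep, on_sensor_interrupt, read_sensor_sample) are hypothetical stand-ins for illustration, not actual mbed OS or Silicon Labs APIs.

```cpp
// Minimal sketch: interrupt-driven sensing instead of busy polling.
// The "HAL" below is a hypothetical stand-in, not the actual mbed OS API.
#include <atomic>
#include <cstdio>

static std::atomic<bool> sample_ready{false};

// Hypothetical ISR hook: a real sensor driver would register this with the MCU.
void on_sensor_interrupt() { sample_ready = true; }

// Hypothetical deep-sleep stub: on real hardware this would halt the core
// (e.g., WFI or a stop mode) until an interrupt fires.
void enter_deep_sleep() {}

int read_sensor_sample() { return 72; }  // placeholder sensor read

int main() {
    for (int wakeups = 0; wakeups < 3; ++wakeups) {
        enter_deep_sleep();        // sleep by default; burn no cycles polling
        on_sensor_interrupt();     // simulate the sensor IRQ that wakes us
        if (sample_ready.exchange(false)) {
            std::printf("sample: %d\n", read_sensor_sample());
        }
    }
    return 0;
}
```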

In the mbed Zone, Silicon Labs was showing off their version of ARM’s smart watch, which they call Thunderboard Wear. It’s blown up into demo board size and complete with Silicon Labs’ custom-designed blood pressure and ambient light sensors (Figure 2). The board is based on the company’s ARM Cortex-M3-based EFM32 Giant Gecko SoC. Silicon Labs’ main ARM TechCon announcement—and the reason they’re in the mbed OS Zone—is that all Gecko MCUs now support mbed OS. We’ll dig into what this means technically in a future post.

Figure 2: Silicon Labs’ version of ARM’s smart watch—blown up into demo board size and complete with Cortex-M3 Giant Gecko MCU and BP sensor. The rubber straps remind that this is still “wearable”, though only sort-of.

“Hello Chris”

Further proof of the growing maturity of mbed OS and its ecosystem is the Zebra Technologies “wireless mbed to cloud” demo shown during Atmel’s evening reception at ARM TechCon (Zebra is also in the mbed Zone). Starting with Atmel’s ATSAMW25-PRO demo board plus display add-on (Figure 4), which contains ARM Cortex-M3 and Cortex-M4 Atmel SoCs, Zebra demonstrated communicating directly from a console to the Wi-Fi-equipped demo.

Figure 4: Zebra Technologies demonstrates easy wireless connectivity to IoT devices using Atmel’s SAMW25 MCU board and OLED1 expansion board.

Typing “Hello Chris” into Zebra’s Zatar browser-based software console (Figure 5) made the sentence appear on the tiny display almost immediately. More than a parlor trick, the demo shows the promise of the IoT, ARM cores, and the interoperability of mbed OS connected all the way back to the cloud and the Zatar device portal.

Figure 5: Zebra’s Zatar IoT cloud console dashboard.

Zebra’s Zatar cloud service works with Renesas’s Synergy IoT platform, Freescale’s Kinetis MCUs, and of course Atmel’s SoCs (will Atmel also create their own end-to-end ecosystem?). The Zebra “IoT Kit” demoed at TechCon is “the first mbed 3.0 Wi-Fi kit that offers developers a prototype to quickly test drive IoT,” said Zebra Technologies. If you’re familiar with ARM’s mbed OS connectivity/protocol stack diagram, Zebra uses the CoAP protocol to connect devices to the cloud. The company was a CoAP co-developer.
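For readers who haven’t looked at CoAP, the protocol is deliberately tiny: a 4-byte header, a method code, options such as Uri-Path, and an optional payload, all carried over UDP (port 5683). The C++ sketch below builds a confirmable PUT carrying “Hello Chris”, roughly what such a console-to-device update looks like on the wire; it is an illustration of RFC 7252 framing, not Zebra’s or mbed’s actual client code.

```cpp
// Illustrative CoAP (RFC 7252) framing: a confirmable PUT carrying a short
// text payload, similar in spirit to a console-to-device "Hello Chris" update.
// Not Zebra's or mbed's actual client code.
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

std::vector<uint8_t> build_coap_put(uint16_t message_id,
                                    const char* uri_path,   // assumed < 13 bytes long
                                    const char* payload) {
    std::vector<uint8_t> msg;
    msg.push_back(0x40);                                   // Ver=1, Type=CON, TKL=0
    msg.push_back(0x03);                                   // Code 0.03 = PUT
    msg.push_back(static_cast<uint8_t>(message_id >> 8));  // Message ID (big-endian)
    msg.push_back(static_cast<uint8_t>(message_id & 0xFF));

    const size_t path_len = std::strlen(uri_path);
    msg.push_back(static_cast<uint8_t>((11u << 4) | path_len));  // Option 11 = Uri-Path
    msg.insert(msg.end(), uri_path, uri_path + path_len);

    msg.push_back(0xFF);                                   // payload marker
    msg.insert(msg.end(), payload, payload + std::strlen(payload));
    return msg;
}

int main() {
    auto msg = build_coap_put(0x1234, "display", "Hello Chris");
    for (uint8_t b : msg) std::printf("%02X ", b);
    std::printf("\n");  // this datagram would normally go out over UDP port 5683
    return 0;
}
```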

The significance of the demo is multifold: quick development time using established Atmel hardware, cloud connectivity over Wi-Fi, an open-standard IoT protocol, and compliance with ARM’s latest mbed OS 3.0.

The fact that the Zatar console easily connects to multiple vendors’ processors means thousands or tens of thousands of IoT nodes can be quickly controlled, updated, and queried for data with minimal effort. In short: creating wireless IoT products and using them just got a whole lot easier.

Zebra will be selling the Zebra ARM mbed IoT Kit for Zatar via distributors and more information is available on their website at www.zatar.com/IoTKit.

Quiz question: I’m an embedded system, but I’m not a smartphone. What am I?

In the embedded market, there are smartphones, automotive, consumer…and everything else. I’ve figured out why AMD’s G-Series SoCs fit perfectly into the “everything else”.

Since late 2013 AMD has been talking about their G-Series of Accelerated Processing Unit (APU) x86 devices that mix an Intel-compatible CPU with a discrete-class GPU and a whole pile of peripherals like USB, serial, VGA/DVI/HDMI and even ECC memory. The devices sounded pretty nifty—in either SoC flavor (“Steppe Eagle”) or without the GPU (“Crowned Eagle”). But it was a head-scratcher where they would fit. After all, we’ve been conditioned by the smartphone market to think that any processor “SoC” that didn’t contain an ARM core wasn’t an SoC.

AMD’s Stephen Turnbull, Director of Marketing, Thin Client markets.

Yes, ARM dominates the smartphone market; no surprise there.

But there are plenty of other professional embedded markets that need CPU/GPU/peripherals where the value proposition is “Performance per dollar per Watt,” says AMD’s Stephen Turnbull, Director of Marketing, Thin Clients. In fact, AMD isn’t even targeting the smartphone market, according to General Manager Scott Aylor in his many presentations to analysts and the financial community.

AMD instead targets systems that need “visual compute”: any business-class embedded system that mixes computation with single- or multi-display capabilities at a “value price”. What this really means is: x86-class processing—and all the goodness associated with the Intel ecosystem—plus one or more LCDs. Even better if those LCDs are high-def, need 3D graphics or other fancy rendering, and if industry-standard software such as OpenCL, OpenGL, or DirectX is being run. AMD G-Series SoCs run from 6W up to 25W; the low end of this range is considered very power thrifty.

What AMD’s G-Series does best is cram an entire desktop motherboard and peripheral I/O, plus graphics card onto a single 28nm geometry SoC. Who needs this? Digital signs—where up to four LCDs make up the whole image—thin clients, casino gaming, avionics displays, point-of-sale terminals, network-attached-storage, security appliances, and oh so much more.

G-Series SoC on the top with peripheral IC for I/O on the bottom.

According to AMD’s Turnbull, the market for thin client computers is growing at 6 to 8 percent CAGR (per IDC), and “AMD commands over 50 percent share of market in thin clients.” Recent design wins with Samsung, HP and Fujitsu validate that using a G-Series SoC in the local box provides more-than-ample horsepower for data movement, encryption/decryption of central server data, and even local on-the-fly video encode/decode for Skype or multimedia streaming.

Typical use cases include government offices where all data is server-based, bank branch offices, and “even classroom learning environments, where learning labs standardize content, monitor students and centralize control of the STEM experience,” says AMD’s Turnbull.

Samsung LFDs (large format displays) use AMD R-Series APUs for flexible display features, like sending content to multiple displays via a network. (Courtesy: Samsung.)

But what about other x86 processors in these spaces? I’m thinking about various SKUs from Intel such as their recent Celeron and Pentium offerings (which are legacy names but based on modern versions of Ivy Bridge and Haswell architectures) and various Atom flavors in both dual- and quad-core colors. According to AMD’s published literature, G-Series SoCs outperform dual-core Atoms by 2x (multi-display) or 3x (overall performance) running industry-standard benchmarks for standard and graphics computation.

And then there’s that on-board GPU. If AMD’s Jaguar-based CPU core isn’t enough muscle, the system can load-balance (in performance and power) to move algorithm-heavy loads to the GPU for General Purpose GPU (GPGPU) number crunching. This is the basis for AMD’s efforts to bring the Heterogeneous System Architecture (HSA) spec to the world. Even companies like TI and ARM have jumped onto this one for their own heterogeneous processors.
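What the GPGPU hand-off looks like to a programmer: below is a minimal OpenCL host-side sketch that pushes a data-parallel kernel onto the on-chip GPU. It is a stripped-down illustration (a simple scaling kernel, no error checking), not AMD-specific code; the same C API calls work against any OpenCL-capable device.

```cpp
// Minimal OpenCL host sketch: offloading a data-parallel kernel to the GPU.
// Error handling is stripped for brevity; real code checks every cl* return.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

static const char* kSrc = R"(
__kernel void scale(__global float* data, float gain) {
    size_t i = get_global_id(0);
    data[i] = data[i] * gain;
})";

int main() {
    cl_platform_id platform; cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, nullptr);

    std::vector<float> host(1024, 1.0f);
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                host.size() * sizeof(float), host.data(), nullptr);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &kSrc, nullptr, nullptr);
    clBuildProgram(prog, 1, &device, nullptr, nullptr, nullptr);
    cl_kernel kern = clCreateKernel(prog, "scale", nullptr);

    float gain = 2.0f;
    clSetKernelArg(kern, 0, sizeof(cl_mem), &buf);
    clSetKernelArg(kern, 1, sizeof(float), &gain);

    size_t global = host.size();
    clEnqueueNDRangeKernel(q, kern, 1, nullptr, &global, nullptr, 0, nullptr, nullptr);
    clEnqueueReadBuffer(q, buf, CL_TRUE, 0, host.size() * sizeof(float),
                        host.data(), 0, nullptr, nullptr);

    std::printf("host[0] = %.1f\n", host[0]);  // expect 2.0 after the GPU pass

    clReleaseMemObject(buf); clReleaseKernel(kern); clReleaseProgram(prog);
    clReleaseCommandQueue(q); clReleaseContext(ctx);
    return 0;
}
```

The shape of the workflow is the point: keep decision-making on the CPU, queue the embarrassingly parallel math for the GPU, and read back only the results.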

G-Series: more software than hardware.

In a nutshell, after two years of reading about (and writing about) AMD’s G-Series SoCs, I’m beginning to “get religion” that the market isn’t all about smartphone processors. Countless business-class embedded systems need Intel-compatible processing, multiple high-res displays, lots of I/O, myriad industry-standard software specs…and all for a price/Watt that doesn’t break the bank.

So the answer to the question posed in the title above is simply this: I’m a visually-oriented embedded system. And I’m everywhere.

This blog was sponsored by AMD.

 

 

Move Over Arduino, AMD and GizmoSphere Have a “Jump” On You with Graphics

The UK’s National Videogame Arcade relies on CPU, graphics, I/O and openness to power interactive exhibits.

Editor’s note: This blog is sponsored by AMD.

When I was a kid I was constantly fascinated with how things worked. What happens when I stick this screwdriver in the wall socket? (Really.) How come the dinner plate falls down and not up?

We humans have to try things for ourselves in order to fully understand them; this sparks our creativity and, for many of us, becomes a life calling.

Attempting to catalyze visitors’ curiosity, the UK’s National Videogame Arcade (NVA) opened in March 2015 with the sole intention of getting children and adults interested in videogames through the use of interactive exhibits, most of which are hands-on. The hope is that young people will first be stimulated by the games, and second, that they will someday unleash their creativity on the videogame and tech industries.

The UK’s National Videogame Arcade promotes gaming through hands-on exhibits powered by GizmoSphere embedded hardware.

Might As Well “Jump!”

The NVA is located in a corner building with lots of curbside windows—imagine a fancy New York City department store but without the mannequins in the street-side windows. Spread across five floors and a total of 33,000 square feet, the place is a cooperative effort between GameCity (a nice bunch of gamers), the Nottingham City Council, and local Nottingham Trent University.

The goal of pulling in 60,000 visitors a year is partly achieved by the NVA’s signature exhibit “Jump!”, which allows visitors to experience gravity (without the plate) and see how it affects videogame characters like those in Donkey Kong or Angry Birds. Visitors actually get to jump on the Jump-o-tron, a physics-based sensor that’s controlled by GizmoSphere’s Gizmo 2 development board.

The Jumpotron uses AMD’s G-Series SoC combining an x86 and Radeon GPU.

The heart of Gizmo 2 is AMD’s G-Series APU, combining a 64-bit x86 CPU and Radeon graphics processor. Gizmo 2 is the latest creation from the GizmoSphere nonprofit open source community which seeks to “bring the power of a supercomputer and the I/O capabilities of a microcontroller to the x86 open source community,” according to www.gizmosphere.org.

The open source Gizmo 2 runs Windows and Linux, bridging PC games to the embedded world.

“Jump!” allows visitors to experience—and tweak—gravity while examining the effect upon on-screen characters. The combination requires extensive processing—up to 85 GFLOPS worth—plus video manipulation and display. What’s amazing is that “Jump!”, along with many other NVA exhibits, isn’t powered by rackmount servers but rather by the tiny 4 x 4 inch Gizmo 2 that supports DirectX 11.1, OpenGL 4.2x, and OpenCL 1.2. It also runs Windows and Linux.
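For a feel of what “tweaking gravity” does to a jumping character, here is a toy C++ calculation using basic projectile kinematics (peak height v0^2/2g, airtime 2·v0/g). It is purely illustrative and has nothing to do with the NVA’s actual exhibit code.

```cpp
// Toy illustration of the "tweak gravity" idea behind Jump!: the same take-off
// speed produces very different arcs as g changes. Not the NVA's actual code.
#include <cstdio>

int main() {
    const double v0 = 5.0;                        // take-off speed, m/s
    const double gravities[] = {9.81, 3.7, 1.62}; // Earth, Mars, Moon
    for (double g : gravities) {
        double peak = v0 * v0 / (2.0 * g);        // peak height, m
        double airtime = 2.0 * v0 / g;            // total time in the air, s
        std::printf("g = %5.2f m/s^2: peak %.2f m, airtime %.2f s\n",
                    g, peak, airtime);
    }
    return 0;
}
```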

AMD’s “G” Powers Gizmo 2

Gizmo 2 is a dense little package, sporting HDMI, Ethernet, PCIe, USB (2.0 and 3.0), plus myriad other A/V and I/O such as A/D/A—all of them essential for NVA exhibits like “Jump!” Says Ian Simons of the NVA, “Gizmo 2 is used in many of our games…and there are plans for even more games embedded into the building,” including furniture and even street-facing window displays.

Gizmo 2’s small size and support for open source software and hardware—plus the ability to develop on the Unity game engine—make Gizmo 2 the preferred choice. Yet the market contains ample platforms from which to choose. Arduino comes to mind.

Gizmo 2’s schematic. The x86 G-Series SoC is loaded with I/O.

Compared to Arduino, the AMD G-Series SoC (GX-210HA) powering Gizmo 2 is orders of magnitude more powerful; plus, it’s x86-based and runs at 1.0 GHz (the integrated GPU runs at 300 MHz). This makes the world’s cache of Intel-oriented, Windows-based software and drivers available to Gizmo 2—including some server-side programs. “NVA can create projects with Gizmo 2, including 3D graphics and full motion video, with plenty of horsepower,” says Simons. He’s referring to some big projects already installed at the NVA, plus others in the planning stages.

“One of the things we’d like to do,” Simons says, “is continue to integrate Gizmo 2 into more of the building to create additional interactive exhibits and displays.” The small size of Gizmo 2, plus the wickedly awesome performance/graphics rendering/size/Watt of the AMD G-Series APU, allows Gizmo 2 to be embedded all over the building.

See Me, Feel Me

With a nod to The Who’s rock opera Tommy, the NVA building will soon have more Gizmo 2 modules wired into the infrastructure, mixing images and sound. There are at least three projects in the concept stage:

  • DMX addressable logic in the central stairway. With exposed cables and beams, visitors would be able to control the audio, video, and possibly LED lighting of the stairwell area using a series of switches (a minimal sketch of a DMX frame follows this list). The author wonders if voice or tactile feedback would create all manner of immersive “psychedelic” A/V in the stairwell central hall.
  • Controllable audio zones in the rooftop garden. The NVA’s Yamaha-based sound system already includes 40 zones. Adding AMD G-Series horsepower to these zones would allow visitors to create individually customized light/sound shows, possibly around botanical themes. Has there ever been a Little Shop of Horrors videogame where the plants eat the gardener? I wonder.
  • Sidewalk animation that uses all those street-facing windows to animate the building, possibly changing the building’s façade (Star Trek cloak, anyone?) or even individually controlling games inside the building from outside (or presenting inside activities to the outside). Either way, all those windows, future LCDs, and reams of I/O will require lots more Gizmo 2 embedded boards.
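Since DMX is mentioned above, here is a minimal C++ sketch of what a DMX512 lighting universe amounts to: a zero start code followed by up to 512 one-byte channel levels, clocked out serially at 250 kbit/s. The uart_send_break/uart_write stubs are hypothetical placeholders for whatever RS-485 interface a Gizmo 2 installation would actually use.

```cpp
// Minimal sketch of a DMX512 universe: slot 0 is the start code (0x00),
// followed by up to 512 one-byte channel levels sent at 250 kbit/s.
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct DmxUniverse {
    std::array<uint8_t, 513> slots{};              // slot 0 = start code (0x00)
    void set_channel(int channel, uint8_t level) { // channels are 1-based
        if (channel >= 1 && channel <= 512) slots[channel] = level;
    }
};

// Hypothetical transport stubs; a real fixture link sends a break, a
// mark-after-break, then the 513 slots over an RS-485 UART.
void uart_send_break() {}
void uart_write(const uint8_t* data, size_t len) {
    std::printf("sent %zu slots (start code 0x%02X)\n", len,
                static_cast<unsigned>(data[0]));
}

int main() {
    DmxUniverse stairwell;
    stairwell.set_channel(1, 255);   // e.g. stairwell LED red, full on
    stairwell.set_channel(2, 64);    // green, dimmed
    uart_send_break();
    uart_write(stairwell.slots.data(), stairwell.slots.size());
    return 0;
}
```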

The Gizmo 2 costs $199 and is available from several retailers such as Element14. With the Gerbers, schematics, and all the board-focused software open source, it’s no wonder this x86 embedded board is attractive to gamers. With AMD’s G-Series APU onboard, the all-in-one HDK/SDK is an ideal choice for embedded designs—and those future gamers playing with the Gizmo 2 at the UK’s NVA.

BTW: The Who hailed from London, not Nottingham.

New HSA Spec Legitimizes AMD’s CPU+GPU Approach

Nearly three years after the formation of the Heterogeneous System Architecture (HSA) Foundation, the consortium has released version 1.0 of its Architecture Spec, Programmer’s Reference Manual, Runtime Specification, and a Conformance Plan.

Note: This blog is sponsored by AMD.

UPDATE 3/17/15: Added Imagination Technologies as one of the HSA founders.

No one doubts the wisdom of AMD’s Accelerated Processing Unit (APU) approach that combines an x86 CPU with a Radeon GPU. After all, one SoC does it all—makes CPU decisions and drives multiple screens, right?

True. Both AMD’s G-Series and the AMD R-Series do all that, and more. But that misses the point.

In laptops this is how one uses the APU, but in embedded applications—like the IoT of the future that’s increasingly relying on high performance embedded computing (HPEC) at the network’s edge—the GPU functions as a coprocessor. CPU + GPGPU (general purpose graphics processor unit) is a powerful combination of decision-making plus parallel/algorithm processing that does local, at-the-node processing, reducing the burden on the cloud. This, according to AMD, is how the IoT will reach tens of billions of units so quickly.

Trouble is, HPEC programming is difficult. Coding the GPU requires a “ninja programmer,” as AMD’s VP of embedded Scott Aylor quipped during his keynote at this year’s Embedded World Conference in Germany. (Video of the keynote is here.) Worse still, capitalizing on the CPU + GPGPU combination requires passing data between the two architectures, which don’t share a unified memory architecture. (It’s not that AMD’s APU couldn’t be designed that way; rather, the processors require different memory architectures for maximum performance. In short: they’re different for a reason.)

AMD’s Scott Aylor giving keynote speech at Embedded World, 2015. His message: some IoT nodes demand high-performance heterogeneous computing at the edge.

AMD realized this limitation years ago and in 2012 catalyzed the HSA Foundation with several companies including ARM, Texas Instruments, Imagination Technologies, MediaTek, Qualcomm, Samsung and others. The goal was to create a set of specifications that not only define heterogeneous hardware architectures but also create an HPEC programming paradigm for CPU, GPU, DSP and other compute elements. Collectively, the goal was to make designing, programming, and power optimizing easy for heterogeneous SoCs (Figure).

The HSA Foundation’s goals are realized by making the coder’s job easier using tools—such as an HSA version of the LLVM open source compiler—that integrate multiple cores’ ISAs. Heterogeneous System Architecture (HSA) specifications version 1.0, HSA Foundation, March 2015. (Courtesy: HSA Foundation; all rights reserved.)

After three years of work, the HSA Foundation just released their specifications at version 1.0:

  • HSA System Architecture Spec: defines H/W, OS requirements, memory model (important!), signaling paradigm, and fault handling.
  • Programmer’s Reference Manual: essentially a virtual ISA for parallel computing; defines an output format for HSA language compilers.
  • HSA Runtime Spec: an application library for running HSA applications; defines initialization, user queues, and memory management.

With HSA, the magic really does happen under the hood, where the devil’s in the details. For example, the HSA version of the LLVM open source compiler creates a vendor-agnostic HSA intermediate language (HSAIL) that’s essentially a low-level VM. From there, “finalizers” compile into vendor-specific ISAs such as AMD or Qualcomm Snapdragon. It’s at this point that low-level libraries can be added for specific silicon implementations (such as VSIPL for vector math). This programming model uses vendor-specific tools but allows novice programmers to start in C++ and end up with optimized, performance-oriented, power-efficient code for the heterogeneous combination of CPU+GPU or DSP.
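At the application end, the HSA Runtime Spec boils down to a small C API. The sketch below (assuming an HSA 1.0 runtime and its hsa.h header, with exact packaging varying by vendor) simply initializes the runtime and lists the agents (CPU, GPU, DSP) that it exposes; real code would go on to create user-mode queues and dispatch kernels.

```cpp
// Minimal sketch against the HSA runtime C API (per the 1.0 Runtime Spec):
// initialize the runtime and list the agents it exposes.
// Header name and link options depend on the vendor's runtime package.
#include <hsa.h>
#include <cstdio>

static hsa_status_t print_agent(hsa_agent_t agent, void*) {
    char name[64] = {0};
    hsa_device_type_t type;
    hsa_agent_get_info(agent, HSA_AGENT_INFO_NAME, name);
    hsa_agent_get_info(agent, HSA_AGENT_INFO_DEVICE, &type);
    std::printf("agent: %-24s type: %d\n", name, static_cast<int>(type));
    return HSA_STATUS_SUCCESS;   // keep iterating
}

int main() {
    if (hsa_init() != HSA_STATUS_SUCCESS) {
        std::printf("no HSA runtime available\n");
        return 1;
    }
    hsa_iterate_agents(print_agent, nullptr);  // visits every HSA agent
    hsa_shut_down();
    return 0;
}
```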

There are currently 43 companies involved with HSA, 16 universities, and three working groups (and they’re already working on version 1.1). Look at the participants, think of their market positions, and you’ll see they have a vested interest in making this a success.

In AMD’s case, as the only x86 and ARM + GPU APU supplier to the embedded market, the company sees even bigger successes as more embedded applications leverage heterogeneous parallel processing.

One example where HSA could be leveraged, said Phil Rogers, President of the HSA Foundation, is for multi-party video chatting. An HSA-compliant heterogeneous architecture would allow the processors to work in a single (virtual) memory pool and avoid the multiple data set copies—and processor churn—prevalent in current programming models.

With key industry players supporting HSA including AMD, ARM, Imagination Technologies, Samsung, Qualcomm, MediaTek and others, a lot of x86, ARM, and MIPS-based SoCs are likely to be compliant with the specification. That should kick off a bunch of interesting software development leading to a new wave of high performance applications.

PCI Express Switch: the “Power Strip” of IC Design

Need more PCIe channels in your next board design? Add a PCIe switch for more fanout.

Editor’s notes:

1. Despite the fact that Pericom Semiconductor sponsors this particular blog post, your author learned that he actually knows very little about the complexities of PCIe.

2. Blog updated 3-27-14 to correct the link to Pericom P/N PI7C9X2G303EL.

Perhaps you’re like me: power cords everywhere. Anyone who has more than one mobile doodad—from smartphone to iPad to Kindle and beyond—is familiar with the ever-present power strip.

An actual power strip from under my desk. Scary...

The power strip is a modern version of the age-old extension cord: it expands one wall socket into three, five or more. Assuming there’s enough juice (AC amperage) to power it all, the power strip meets our growing hunger for more consumer devices (or rather: their chargers).

And so it is with IC design. PCI Express Gen 2 has become the most common interoperable, on-board way to add peripherals such as SATA ports, CODECs, GPUs, WiFi chipsets, USB hubs and even legacy peripherals like UARTs. The wall socket analogy applies here too: most new CPUs, SoCs, MCUs or system controllers lack sufficient PCI Express (PCIe) ports for all the peripheral devices designers need. Plus, as IC geometries shrink, system controllers also have lower drive capability per PCIe port and signals degrade rather quickly.

The solution to these host controller problems is a PCIe switch to increase fanout by adding two, three, or even eight additional PCIe ports with ample per-lane current sourcing capability.

Any Port in a Storm?

While our computers and laptops strangle everything in sight with USB cables, inside those same embedded boxes it’s PCIe as the routing mechanism of choice. Just about any standalone peripheral a system designer could want is available with a PCIe interface. Even esoteric peripherals—such as 4K complex FFT, range-finding, or OFDM algorithm IP blocks—usually come with a PCIe 2.0 interface.

Too bad then that modern device/host controllers are painfully short on PCIe ports. I did a little Googling and found that if you choose an Intel or AMD CPU, you’re in good shape. A 4th Gen Intel Core i7 with an Intel 8 Series chipset has six PCIe 2.0 ports spread across 12 lanes. Wow. Similarly, an AMD A10 APU has four PCIe lanes (configurable as one x4 port or four x1 ports). But these are desktop/laptop processors and they’re not so common in embedded.

AMD’s new G-Series SoC for embedded is an APU with a boatload of peripherals, and it’s got only one PCIe Gen 2 port (x4). As for Intel’s new Bay Trail-based Atom processors running the latest red-hot laptop/tablet 2-in-1s: I couldn’t find an external PCIe port on the block diagram.

Similarly…Qualcomm Snapdragon 800? Nvidia Tegra 4 or even the new K1? Datasheets on these devices are closely held for customers only, but I found developer references that point to at best one PCIe port. ARM-based Freescale processors such as the i.MX6, popular in set-top boxes from Comcast and others, have one lone PCIe 2.0 port (Figure 1).

What to do if a designer wants to add more PCIe-based stuff?

Figure 1: Freescale i.MX ARM-based CPU is loaded with peripheral I/O, yet has only one PCIe 2.0 port. (Courtesy: Freescale Semiconductor.)

‘Mo Fanout

A PCIe switch solves the one-to-many dilemma. Add in a redriver at the Tx and Rx end, and signal integrity problems over long traces and connectors all but disappear. Switches from companies like Pericom come in many flavors, from simple lane switches that are essentially PCIe muxes, to packet switches with intelligent routing functions.

One simple example of a Pericom PCIe switch is the PI7C9X2G303EL. This PCIe 2.0 three-port/three-lane switch has one x1 upstream port and two x1 downstream ports, and would add two ports to the i.MX6 shown in Figure 1. This particular device, aimed at those low-power consumer doodads I mentioned earlier, boasts some advanced power-saving modes and consumes under 0.7W.
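One thing the power-strip analogy makes clear: a switch adds ports, not bandwidth, since every downstream device ultimately shares the upstream link. A quick back-of-the-envelope helper in C++ (using the usual ~250 MB/s per Gen 1 lane and ~500 MB/s per Gen 2 lane, per direction, after 8b/10b encoding) shows the oversubscription for a three-port x1 topology like the one above.

```cpp
// Back-of-the-envelope helper: a switch adds downstream ports, but they share
// the upstream link. Gen 1/Gen 2 use 8b/10b encoding, so usable bandwidth is
// roughly 250 MB/s (Gen 1) or 500 MB/s (Gen 2) per lane, per direction.
#include <cstdio>

double lane_mbps(int gen)            { return gen == 1 ? 250.0 : 500.0; }
double link_mbps(int gen, int lanes) { return lane_mbps(gen) * lanes; }

int main() {
    // PI7C9X2G303EL-style topology: one x1 Gen 2 upstream, two x1 Gen 2 downstream.
    double upstream   = link_mbps(2, 1);
    double downstream = 2 * link_mbps(2, 1);
    std::printf("upstream capacity: %.0f MB/s\n", upstream);
    std::printf("downstream demand: %.0f MB/s (if both ports run flat out)\n", downstream);
    std::printf("oversubscription:  %.1fx\n", downstream / upstream);
    return 0;
}
```

Running it shows a 2:1 oversubscription, which is fine for bursty peripherals but worth knowing before hanging two bandwidth-hungry devices off one x1 upstream link.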

Hook Me Up

Upon researching this for Pericom, I was surprised to learn of all the nuances and variables to consider with PCIe switches. I won’t cover them here, other than mentioning some of the designer’s challenges: PCIe Gen 1 vs Gen 2, data packet routing, latency, CRC verification (for QoS), TLP layer inspection, auto re-send, and so on.

PCIe switches come in all flavors, from the simplest “power strip” to what is essentially an intelligent router-on-a-chip. And for maximum interoperability, all of them need to be compliant with the PCI-SIG specs, as verified at a plugfest.

So if you’re an embedded designer, the solution to your PCIe fanout problem is adding a PCI Express switch. 

Intel’s Atom Roadmap Makes Smartphone Headway

After being blasted by users and pundits over the lack of “low power” in the Atom product line, Intel’s new architecture and design wins show the company is making progress.

Intel EVP Dadi Perlmutter revealing an early convertible tablet computer at IDF 2012.

A 10-second Google search on “Intel AND smartphone” reveals endless pundit comments on how Intel hasn’t been winning enough in the low power, smartphone and tablet markets. Business publications wax endlessly on the need for Intel’s new CEO Brian Krzanich to make major changes in company strategy, direction, and executive management in order to decisively win in the portable market. Indications are that Krzanich is shaking things up, and pronto.

Forecasts by IDC (June 2013) and reported by CNET.com (http://news.cnet.com/8301-1035_3-57588471-94/shipments-of-smartphones-tablets-and-oh-yes-pcs-to-top-1.7b/) peg the PC+smartphone+tablet TAM at 1.7B units by 2014, of which 82 percent (1.4B units, $500M USD) are low power tablets and smartphones. And until recently, I’ve counted only six or so public wins for Intel devices in this market (all based upon the Atom Medfield SoC with the Saltwell microarchitecture I wrote about at IDF 2012). Not nearly enough for the company to remain the market leader while capitalizing on its world-leading tri-gate 3D fab technology.

Behold the Atom, Again

Fortunately, things are starting to change quickly. In June, Samsung announced that the Galaxy Tab 3 10.1-inch SKU would be powered by Intel’s Z2560 “Clover Trail+” Atom SoC running at 1.2GHz. According to PC Magazine, “it’ll be the first Intel Android device released in the U.S.” (http://www.pcmag.com/article2/0,2817,2420726,00.asp) and it complements other Galaxy Tab 3 offerings with competing processors. The 7-inch SKU uses a dual-core Marvell chip running Android 4.1, while the 8-inch SKU uses Samsung’s own Exynos dual-core Cortex-A9 ARM chip running Android 4.2. The Atom Z2560 also runs Android 4.2 on the 10.1-incher. Too bad Intel couldn’t have won all three sockets, especially since Intel’s previous lack of LTE cellular support has been solved by the company’s new XMM 7160 4G LTE chip, and supplemented by new GPS/GNSS silicon and IP from Intel’s ST-Ericsson navigation chip acquisition.

The Z2560 Samsung chose is one of three “Clover Trail+” platform SKUs (Z2520, Z2560, Z2580) formerly known merely as “Cloverview” when the dual-core, Saltwell-based, 32-nm Atom SoCs were leaked in Fall 2012. The Intel alphabet soup starts getting confusing because the Atom roadmap looks like rush hour traffic feeding out of Boston’s Sumner tunnel. It’s being pushed into netbooks (for maybe another quarter or two); value laptops and convertible tablets as standalone CPUs; smartphones and tablets as SoCs; and soon into the data center to compete against ARM’s onslaught there, too.

Clover Trail+ replaces Intel’s Medfield smartphone offering and was announced at February’s MWC 2013. According to Anandtech.com (thank you, guys!), Intel’s aforementioned design wins with Atom used the 32nm Medfield SoC for smartphones. Clover Trail is still at 32nm using the Saltwell microarchitecture but has targeted Windows 8 tablets, while Clover Trail+ targets only smartphones and non-Windows tablets. That explains the Samsung Galaxy Tab 3 10.1-inch design win. The datasheet for Clover Trail+ is here, and shows a dual-core SoC with multiple video CODECs, integrated 2D/3D graphics, on-board crypto, and multiple multimedia engines such as Intel Smart Sound, and it’s optimized for Android and, presumably, Intel/Samsung’s very own HTML5-based Tizen OS (Figure 1).

Figure 1: Intel Clover Trail+ block diagram used in the Atom Z2580, Z2560, and Z2520 smartphone SoCs. This is 32nm geometry based upon the Saltwell microarchitecture and replaces the previous Medfield single core SoC. (Courtesy: Intel.)

I was unable to find meaningful power consumption numbers for Clover Trail+, but its 32nm geometry compares favorably to the 28nm geometry of ARM’s Cortex-A15 parts, so Intel should be in the ballpark. Still, the market wonders if Intel finally has the chops to compete. At least it’s getting much, much closer, especially once the on-board graphics performance gets factored into the picture compared to ARM’s lack thereof (for now).

Silvermont and Bay Trail and…Many More Too Hard to Remember

But Intel knows it has more work to do to compete against Qualcomm’s home-grown, ARM-based Krait microarchitecture, some Nvidia offerings, and Samsung’s own in-house designs. Atom will soon be moving to 22nm, and the next microarchitecture is called Silvermont. Intel is finally putting power curves up on the screen, and at product launch I’m hopeful there will be actual Watt numbers shown, too.

For example, Intel is showing off Silvermont’s “industry-leading performance-per-Watt efficiency” (Figure 2). Press data from Intel says the architecture will offer 3x the peak performance, or 5x lower power, compared to the Clover Trail+ Saltwell microarchitecture. More code names to track: the quad-core Bay Trail SoC for 2013 holiday tablets; Merrifield, with increased performance and battery life; and finally Avoton, which provides 64-bit energy efficiency for microservers and boasts ECC, Intel VT, and possibly vPro and other security features. Avoton will go head-to-head with ARM in the data center, where Intel can’t afford to lose any ground.

Figure 2: The 22nm Atom microarchitecture called Silvermont will appear in Bay Trail, Avoton and other future Atom SoCs from “Device to Data Center”, says Intel. (Courtesy: Intel.)

Oh Yeah? Who’s Faster Now?

As Intel steps up its game because it has to win or else, the competition is not sitting still. ARM licensees have begun shipping big.LITTLE SoCs, and ARM has announced new graphics, DSP, and mid-range cores. (Read Jeff Bier and BDTI’s excellent recent ARM roadmap overview here.)

A recent report by ABI Research (June 2013) tantalized (or more appropriately galvanized) the embedded and smartphone markets with the headline “Intel Apps Processor Outperforms NVIDIA, Qualcomm, Samsung”. In comparison tests, ABI Research VP of engineering Jim Mielke noted that the Intel Atom Z2580 “not only outperformed the competition in performance but it did so with up to half the current drain.”

The embedded market didn’t necessarily agree with the results, and UBM Tech/EETimes published extensive readers’ comments with colorful opinions. On a more objective note, Qualcomm launched its own salvo as we went to press: “You’ll see a whole bunch of tablets based upon the Snapdragon 800 in the market this year,” predicted Raj Talluri, SVP at Qualcomm, as reported by Bloomberg Businessweek.

Qualcomm has made its Snapdragon product line more user-friendly and appears to be readying the line for general embedded market sales in Snapdragon 200, 400, 600, and “premium” 800 SKU versions. The company has made development tools available (mydragonboard.org/dev-tools) and is selling COM-like Dragonboard modules through partners such as Intrinsyc.

Intel Still Inside

It’s looking like a sure thing that Intel will finally have competitive silicon to challenge ARM-based SoCs in the market that really matters: mobile, portable, and handheld. 22nm Atom offerings are getting power-competitive, and the game will change to an overall system integration and software efficiency exercise.

Intel has for the past five years been emphasizing a holistic all-system view of power and performance. Their work with Microsoft has wrung out inefficiencies in Windows and capitalizes on microarchitecture advantages in desktop Ivy Bridge and Haswell CPUs. Security is becoming important in all markets, and Intel is already there with built-in hardware, firmware, and software (through McAfee and Wind River) advantages. So too has the company radically improved graphics performance in Haswell and Clover Trail+ Atom SoCs…maybe not to the level of AMD’s APUs, but absolutely competitive with most ARM-based competitors.

And finally, Intel has hedged its bets in Android and HTML5. They are on record as writing more Android code (for and with Google) than any other company, and they’ve migrated past the MeeGo failures to the might-be-successful HTML5-based Tizen OS, which Samsung is using in select handsets.

As I’ve said many times, Intel may be slow to get it…but it’s never good to bet against them in the long run. We’ll have to see how this plays out.