ReRAM and AI: Q&A with Sylvain Dubois, Crossbar
Autonomous driving is just one of the applications hungry for processing at the edge, giving embedded memory growing strategic importance.
Editor’s Note: “The boundary between data and compute is really blurring now,” contends Sylvain Dubois. The vice president of strategic marketing and business development at ReRAM technology company Crossbar also explains why putting data and computing on the same chip is making more and more sense. I spoke with Dubois in May, shortly before Crossbar unveiled its collaboration with Microsemi. Microsemi products manufactured at the 1x nm process node will integrate Crossbar’s embedded ReRAM technology.
EECatalog: Across AI, networking, and computing, we’re seeing an increasing demand for embedded nonvolatile memory [NVM].
Sylvain Dubois, Crossbar: Yes, embedded memory is of strategic importance for CMOS foundries. If you go to the technology symposiums of the top foundries, such as TSMC, GlobalFoundries, UMC, Samsung, and SMIC, they are all looking for ways to have access to embedded NVM (nonvolatile memory) technologies: flash all the way down to 40 nm, and then MRAM and ReRAM for 2x nm and 1x nm.
EECatalog: How is what Crossbar and Microsemi will be doing—integrating embedded ReRAM at 1x nm—going to make a difference for OEMs and developers? Can you describe how a use case would change?
Dubois, Crossbar: A typical use case would involve bringing more computing power to the edge, with more processing done locally. This includes wearables and hand-held devices, surveillance cameras, and autonomous driving, for example. That brings up the whole topic of AI [Artificial Intelligence] inference at the edge, where you are not necessarily training the AI algorithms in the field but instead using the trained model so that devices at the edge can recognize a face or a traffic sign. Crossbar’s ReRAM technology will make a difference in any pattern recognition task, such as object or face detection. It’s what we demonstrated at the Embedded Vision Summit, showing how you can bring embedded ReRAM and neural networks together in a one-chip solution to make very low-energy computing devices.
Today, what people are doing is trying to store the AI inference model, the weights and features of the neural network, in the internal SRAM buffers on the chip. Because SRAM is not a dense memory, it won’t be big enough, and the models will be partially stored in external DRAM banks that are very expensive and also very power hungry. Both SRAM and DRAM are volatile memory, meaning that they lose their content when powered down. This requires an additional layer of flash memory to store the model when power is off.
But now with embedded ReRAM you have the capability to store the entire trained model of the neural network directly on chip. ReRAM retains its content for 10 years even when not powered. This eliminates the need for an external flash memory backup and enables new use models where the end device can be frequently powered down and up to extend battery life.
What we have designed is a specific memory array, a very wide one, with some amount of in-memory computing: logic blocks for pattern recognition and distance computation. At the Embedded Vision Summit, we implemented a facial recognition demo showing the classification of a new face against a huge database of other faces in only one iteration.
EECatalog: How did the demonstration turn out?
Dubois, Crossbar: It was very well received, as this classification task, the comparison of one input across a huge database of objects, usually takes a lot of time and power. The value proposition here is that the comparison of one input across a huge database is extremely deterministic: it always takes the same amount of time whatever the size of the database, from very few pictures to 100,000 pictures. The computation is done in only one iteration, in a few clock cycles.
EECatalog: How does that use case look if it’s not being accomplished with ReRAM?
Dubois, Crossbar: Today, if you want to do the same use case with embedded SRAM, external or stacked DRAMs, and GPUs, it will be done in a serial manner: the larger the database, the longer it takes, because you have to compare against all the pictures of objects stored in memory.
Because ReRAM is on-chip and nonvolatile, we provide a very energy-efficient way to classify objects and patterns with fast and deterministic latencies, consuming less energy than SRAM/DRAM memories.
And it’s also very secure. Privacy matters when the database includes not only your face but also your biometrics and vocal commands. You don’t want the whole conversation in your living room to be processed in the cloud and potentially exposed to malware. Biometric identification, speech recognition, and classification of objects from surveillance cameras are typical use cases for energy-efficient computing and memory on the same chip.
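The contrast Dubois draws between a serial search and a one-iteration search can be pictured with a small software sketch. This is a toy illustration only, not Crossbar's design: the Hamming distance, the integer-encoded feature vectors, and the function names are all assumptions made for the example, and the "steps" are a count of sequential comparisons rather than a measurement of hardware timing.

```python
# Toy illustration: counts sequential comparison steps only.
# Hamming distance and integer-encoded features are assumptions,
# not details taken from the interview.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded vectors."""
    return bin(a ^ b).count("1")


def serial_search(query: int, database: list[int]) -> tuple[int, int]:
    """Compare the query with each entry in turn.

    Returns (index of closest entry, sequential steps taken).
    The step count grows with the size of the database.
    """
    if not database:
        raise ValueError("database must not be empty")
    best_index, best_distance, steps = 0, hamming(query, database[0]), 1
    for index in range(1, len(database)):
        steps += 1
        distance = hamming(query, database[index])
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, steps


def parallel_array_search(query: int, database: list[int]) -> tuple[int, int]:
    """Model an array in which every row has its own distance logic.

    Software still loops here, but conceptually all rows are evaluated
    at once, so the modeled step count is 1 whatever the database size.
    """
    if not database:
        raise ValueError("database must not be empty")
    distances = [hamming(query, entry) for entry in database]
    best_index = min(range(len(distances)), key=distances.__getitem__)
    return best_index, 1


if __name__ == "__main__":
    faces = [0b11110000, 0b10101010, 0b00001111]
    probe = 0b10101000
    print(serial_search(probe, faces))          # step count equals len(faces)
    print(parallel_array_search(probe, faces))  # step count stays at 1
```

Both functions return the same closest entry; only the modeled number of sequential steps differs, which is the point behind the deterministic-latency claim.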
EECatalog: One of the big-picture issues here is the ability to anticipate that next advanced process node and scale to it.
Dubois, Crossbar: Yes, it is important to pick a memory technology that scales because most of these AI chips, advanced SoCs, or microcontrollers are currently designed at 22 nm, 14 nm, 12 nm, or even below 7 nm.
Crossbar ReRAM cells are programmed with a very low voltage across two electrodes, causing the metal ions of the top electrode to move and thereby creating an extremely short, narrow filament (3 or 4 nm). Growth of this metallic wire forms a conductive path, enabling a very low-resistance state. The ON current going through the filament determines the logic 1 state. When we want a logic 0 state, we just reverse the electric field so that the metal ions are pulled back to the top electrode, creating a high-resistance state, almost an open circuit.
Because we physically grow and remove a metal filament, the difference between ON and OFF current is extremely high, more than 1000X, providing great read margins and reliability to the ReRAM technology at the most advanced process nodes. As the filament is just 3 nm, going below 10 nm is definitely possible with Crossbar ReRAM technology.
The ReRAM cell is so small that it can fit in between the metal routing layers of standard CMOS wafers. This is the reason why we can have a breakthrough architecture with millions of connection points between the logic and the memory, compared with thousands of connections at most with stacked DRAMs. It is a truly monolithic integration of embedded ReRAM and logic in the same silicon.
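The set and reset behavior described above, and the read margin that follows from a large ON/OFF current ratio, can be pictured with a minimal toy model. Everything numeric here is an assumption chosen for illustration: the resistance values (picked only so that the ratio is 1000X, the lower bound mentioned in the interview), the read voltage, the polarity convention, and the class and method names. It is a sketch of the idea, not a model of Crossbar's actual cell.

```python
import math
from dataclasses import dataclass

# All values below are illustrative assumptions, not Crossbar data.
R_ON_OHMS = 1.0e4     # assumed resistance with the filament formed
R_OFF_OHMS = 1.0e7    # assumed resistance with the filament removed (1000x)
READ_VOLTAGE = 0.2    # assumed read voltage, in volts


@dataclass
class ToyReRAMCell:
    """Two-state cell: filament formed (logic 1) or removed (logic 0)."""

    filament_formed: bool = False

    def program(self, voltage: float) -> None:
        """Assumed polarity: positive grows the filament, negative removes it."""
        if voltage > 0:
            self.filament_formed = True
        elif voltage < 0:
            self.filament_formed = False

    def read_current(self) -> float:
        """Current through the cell at the read voltage (Ohm's law)."""
        resistance = R_ON_OHMS if self.filament_formed else R_OFF_OHMS
        return READ_VOLTAGE / resistance

    def read_bit(self) -> int:
        """Compare the read current with a threshold between ON and OFF."""
        on_current = READ_VOLTAGE / R_ON_OHMS
        off_current = READ_VOLTAGE / R_OFF_OHMS
        # Geometric mean sits midway between the two levels on a log scale.
        threshold = math.sqrt(on_current * off_current)
        return 1 if self.read_current() > threshold else 0


def on_off_ratio() -> float:
    """Ratio of ON current to OFF current under the assumed values."""
    return (READ_VOLTAGE / R_ON_OHMS) / (READ_VOLTAGE / R_OFF_OHMS)


if __name__ == "__main__":
    cell = ToyReRAMCell()
    cell.program(+1.0)   # form the filament
    print(cell.read_bit(), cell.read_current())
    cell.program(-1.0)   # reverse the field, remove the filament
    print(cell.read_bit(), cell.read_current())
```

With currents three orders of magnitude apart, a threshold placed between them leaves a wide margin on both sides, which is the intuition behind the read-margin remark.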
EECatalog: Anything to add before we wrap up?
Dubois, Crossbar: The boundary between data and compute is really blurring now. Algorithms are trained with lots of data, and devices are now self-sufficient to perform object identification and pattern recognition with a minimum power budget. Crossbar ReRAM is enabling a new range of energy-efficient computing architectures compared to legacy SRAM or DRAM-based architectures. Crossbar is working with multiple partners to create innovative architectures where data and processing are integrated on the same silicon chip.
For edge computing in hand-held mobile devices or home appliances, or cloud computing in data centers, people are starting to realize that they can cut their energy bill quite drastically by putting the data and the computing on the same chip. Most of the system companies are now expanding their strategies toward vertical integration of their business, all the way to chip manufacturing, as it makes a lot of sense for differentiation.