Memory Class Storage for Embedded Applications
Why a DRAM replacement with non-volatility bodes well for the memory roadmap’s future
All indications point to a dramatic shift in the availability of new memory architectures, making the next couple of years perhaps the most eventful in many decades. Articles seem to pop up daily on the emerging memory options, from PCM to ReRAM to MRAM to 3DXpoint. While each of these has its unique characteristics, they have one thing in common that distinguishes them from DRAM: non-volatility. The industry has introduced the term “Storage Class Memory,” or SCM, to encapsulate these various technologies; however, lumping all emerging technologies into a single category does the industry a bit of a disservice. None of these SCMs provides a feature set even close to 100% compatibility with a standard DRAM interface, and in many ways they compete more with Flash than with DRAM. There is a need for an additional standard term for DRAM replacement technologies that offer 100% compatibility with DRAM while still providing persistent memory. We call this category “Memory Class Storage.”
The Deterministic DRAM Interface
The DRAM interface is quite demanding. Its characteristics include a reasonably high frequency, currently around 1600 MHz for a 3200 Mbps data rate, coupled with a short access time of roughly 15 ns and an overall cycle time of around 45 ns. While there is some flexibility in these numbers, the range of acceptable solutions is tight, within a few nanoseconds of these values. Because the interface was designed around the electrical characteristics of the DRAM's transistor-plus-capacitor memory cell, its key requirement is determinism: reads and writes must deliver good data within the access time and complete background bookkeeping within the cycle time.
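To put those figures in terms of clock cycles, the short sketch below converts them using only the numbers quoted above. The function and constant names are illustrative assumptions, and the rounded 15 ns and 45 ns values are the article's approximations rather than datasheet parameters.

```python
# Figures quoted in the paragraph above; the names are illustrative only.
CLOCK_MHZ = 1600   # interface clock frequency
ACCESS_NS = 15     # approximate access time
CYCLE_NS = 45      # approximate overall cycle time


def data_rate_mbps(clock_mhz: float) -> float:
    """Double data rate: one transfer per pin on each clock edge."""
    return 2 * clock_mhz


def clock_period_ns(clock_mhz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def ns_to_clocks(duration_ns: float, clock_mhz: float) -> float:
    """Express a time budget as a number of interface clock cycles."""
    return duration_ns / clock_period_ns(clock_mhz)


if __name__ == "__main__":
    print("data rate (Mbps per pin):", data_rate_mbps(CLOCK_MHZ))
    print("clock period (ns):", clock_period_ns(CLOCK_MHZ))
    print("access time (clocks):", ns_to_clocks(ACCESS_NS, CLOCK_MHZ))
    print("cycle time (clocks):", ns_to_clocks(CYCLE_NS, CLOCK_MHZ))
```

At these values a clock lasts 0.625 ns, so the access time is a budget of about 24 clocks and the cycle time about 72 clocks. A tolerance of a few nanoseconds therefore amounts to only a handful of clocks.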
This determinism is the primary difference between a Storage Class Memory (SCM) and Memory Class Storage (MCS). For a device to directly replace a DRAM, it needs to meet all the deterministic requirements of a DRAM. While the universe of Storage Class Memories offers a variety of features, none has all the required features of DRAM. FeRAM comes close with its high write endurance, but it suffers from access time limitations, limited scalability, and very high cost. ReRAM comes close in terms of cost, but its slow access time and especially its limited write endurance prevent deterministic operation. MRAM shows promise in terms of access time, but it will have challenges meeting the power envelope of a DRAM and, of course, converging on DRAM-like pricing.
Enabling Non-Deterministic Memories
There are some efforts under way to change the protocol to allow non-deterministic memories to share the DRAM bus. 3DXpoint requires some proprietary hooks to coexist with DRAM. These hooks offset some of the performance penalties associated with long write times and low write endurance, which require the 3DXpoint to go offline periodically for maintenance such as wear leveling. The NVDIMM-P protocol is another effort to allow non-determinism on the DRAM channel; however, NVDIMM-P requires an expensive, complex centralized controller to filter all data traffic and break it into read and write credit packets that may be serviced in or out of order.
The main takeaway from this line of analysis is that none of these approaches is a DRAM replacement. The tuned performance of the native DRAM interface is fundamentally faster than the non-deterministic alternatives. All SCMs require these compromises in order to populate the DRAM channel, but the compromises also affect system architecture, and especially software, in some very unpleasant ways.
The big problem with incorporating the SCM protocols into a system designed for DRAM is that the performance of the subsystem becomes very asymmetric. For all its other problems, DRAM at least has a very predictable nature, but all that goes away with SCM protocols. Individual processor threads get stalled whenever an SCM goes into housekeeping mode, and overall system performance is reduced. Software must be rewritten to comprehend such asymmetry, and few companies are willing to change billions of lines of tested code to tune for an asymmetric memory subsystem.
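The asymmetry can be illustrated with a toy latency model. In the sketch below, one memory answers every access in a fixed time, while the other has the same base latency but periodically pays a housekeeping stall. The 10 µs stall and the one-in-a-thousand frequency are arbitrary assumptions chosen for illustration, not measurements of any device or protocol.

```python
def latencies(n_accesses, base_ns, stall_ns=0.0, stall_every=0):
    """Per-access latencies; every `stall_every`-th access also pays `stall_ns`.

    With stall_every=0 the memory is fully deterministic.
    """
    samples = []
    for i in range(1, n_accesses + 1):
        stalled = stall_every > 0 and i % stall_every == 0
        samples.append(base_ns + (stall_ns if stalled else 0.0))
    return samples


def summarize(samples):
    """Mean, median, and worst-case latency of a list of samples."""
    ordered = sorted(samples)
    return {
        "mean_ns": sum(ordered) / len(ordered),
        "median_ns": ordered[len(ordered) // 2],
        "worst_ns": ordered[-1],
    }


# Hypothetical numbers: 45 ns base cycle, 10 us stall on every 1000th access.
deterministic = latencies(10_000, base_ns=45.0)
asymmetric = latencies(10_000, base_ns=45.0, stall_ns=10_000.0, stall_every=1_000)

if __name__ == "__main__":
    print("deterministic:", summarize(deterministic))
    print("asymmetric:   ", summarize(asymmetric))
```

Under these assumptions the median latency is identical for both memories and the mean rises only from 45 ns to 55 ns, but the worst case jumps from 45 ns to just over 10 µs. A thread that happens to hit the stall waits more than two hundred times longer than usual, which is the behavior that software tuned for DRAM does not expect.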
While some systems will take advantage of direct access modes, such as DAX, a majority of systems mount SCM not as main memory but as disk storage. Legacy software is already prepared for potentially long delays going through the operating system disk drivers, and an access to a non-deterministic SCM is much faster than an access to a disk drive, so these systems see some performance enhancement.
Memory Class Storage
Introducing Memory Class Storage changes the math again. Simply put, MCS is a DRAM replacement with non-volatility. To be considered MCS, such a technology must meet all DRAM timings in a fully deterministic way. MCS must have essentially unlimited write endurance to meet this requirement so that it never goes offline for maintenance.
Nantero NRAM™ is the world’s only Memory Class Storage device. NRAM meets all DDR4 timings and provides the required write endurance for truly deterministic performance. The power profile for NRAM is lower than that of DRAM, and it uses the same supply voltages for a true drop-in replacement memory.
The system may simply execute all functions with NRAM as though DRAM were installed. This allows for extremely simple integration into systems through true plug and play. However, it also leaves significant performance on the table. Since NRAM is inherently persistent, the refresh operations that a DRAM requires are no longer needed. This alone can improve memory performance by 15% even at the same clock frequency. Similarly, NRAM has an inherently non-destructive read, so the DRAM precharge operation is not required, making additional command slots available to the controller for other uses.
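A first-order way to see why eliminating refresh matters is to compare the time a DRAM rank spends refreshing with the interval between refresh commands. The sketch below does that arithmetic. The parameter values are assumptions in the range commonly seen on DDR4 datasheets (a refresh cycle time of a few hundred nanoseconds, issued every 7.8 µs or, at elevated temperature, every 3.9 µs); they are not taken from this article, and the model does not attempt to reproduce the 15% figure above, which may reflect effects it ignores.

```python
def refresh_overhead(trfc_ns: float, trefi_ns: float) -> float:
    """Fraction of time a rank is unavailable because it is refreshing.

    One refresh lasting trfc_ns is issued every trefi_ns.
    """
    return trfc_ns / trefi_ns


def gain_without_refresh(trfc_ns: float, trefi_ns: float) -> float:
    """Relative increase in available time if refresh is eliminated."""
    busy = refresh_overhead(trfc_ns, trefi_ns)
    return busy / (1.0 - busy)


# Assumed, illustrative parameter sets: (label, tRFC in ns, tREFI in ns).
CASES = [
    ("smaller device, normal temperature", 350.0, 7800.0),
    ("larger device, elevated temperature", 550.0, 3900.0),
]

if __name__ == "__main__":
    for label, trfc, trefi in CASES:
        print(f"{label}: overhead {refresh_overhead(trfc, trefi):.1%}, "
              f"gain without refresh {gain_without_refresh(trfc, trefi):.1%}")
```

The point of the model is the trend rather than the exact percentages: refresh time grows with device density while the refresh interval does not, so the share of time lost to refresh rises as DRAM devices get larger.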
MCS is More Than Just Another Memory
Memory Class Storage enables a whole new category of computing as well. Memory modules with MCS can directly replace NVDIMM-N memories, eliminating both the bulky, expensive, and unreliable battery or supercapacitor backup hardware and the time-consuming backup and restore procedures that follow a power failure. With an MCS memory module, power can be removed at any time without affecting the validity of the memory content. When power is restored, system operation can continue from where it left off.
From the system software perspective, MCS as main memory relieves all concerns about performance asymmetry. Data persistence is inherent in all operations, eliminating the need to route performance-critical operations through the disk drive interface. Direct memory access becomes the normal mode of operation, increasing performance by a factor of hundreds to thousands.
MCS in the DDR5 Age
Looking ahead to DDR5, there is a dark cloud on the horizon for DRAM: it is expected to max out at 32 Gb per device about halfway through the DDR5 life cycle. At the same time, advances in embedded systems for artificial intelligence, deep learning, and similar applications are demanding more memory. NRAM can use a crosspoint architecture or a 1T-1R structure, both of which are far more space efficient than the transistor and capacitor combination of a DRAM cell. As a result, DDR5 NRAM will deliver at least 8 to 16 times the per-device capacity of DDR5 DRAM, or 256 Gb to 512 Gb, and at higher performance than an SDRAM.
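The capacity figures follow directly from the 32 Gb ceiling and the stated multipliers, as the short calculation below shows. The conversion to gigabytes is added for convenience, and the names are illustrative.

```python
DRAM_MAX_GBIT = 32  # per-device DRAM ceiling quoted above, in gigabits


def scaled_capacity_gbit(base_gbit: int, multiplier: int) -> int:
    """Per-device capacity after applying a density multiplier."""
    return base_gbit * multiplier


def gbit_to_gbyte(gbit: int) -> float:
    """Convert gigabits to gigabytes (8 bits per byte)."""
    return gbit / 8


if __name__ == "__main__":
    for multiplier in (8, 16):
        gbit = scaled_capacity_gbit(DRAM_MAX_GBIT, multiplier)
        print(f"{multiplier}x: {gbit} Gb per device = {gbit_to_gbyte(gbit)} GB")
```

In byte terms, 256 Gb and 512 Gb correspond to 32 GB and 64 GB per device, compared with 4 GB for a 32 Gb DRAM.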
Memory Class Storage may not displace Storage Class Memory; instead, the two may coexist. The non-deterministic protocols of 3DXpoint and NVDIMM-P will allow slower media to offer very high per-module storage capacity, similar to an SSD. Memory Class Storage, however, will offer the highest performance for main memory along with non-volatility that eliminates the need to back up main memory in case of power failure.
DRAM fades from memory (if you’ll forgive the pun) after 32 Gb. Fortunately, Memory Class Storage is coming to extend the main memory roadmap into the future.
Mr. Gervasi is Principal Systems Architect at Nantero, Inc. He has been working with memory devices and subsystems since 1 Kb DRAM and EPROM were the leading edge of technology. He has been a JEDEC chairman since 1996 and has been responsible for key introductions including DDR SDRAM, the integrated Registering Clock Driver and RDIMM architecture, and the formation of the JEDEC committee on SSDs. He is also actively involved in the definition of NVDIMM protocols.