Compute-Storage Proximity: When Storage at the Edge Makes More Sense
Why it’s time to take a new look at high-performance storage at the edge.
When is localized storage more viable than a cloud solution? Many organizations are debating the question, and answering it requires a thoughtful, case-by-case examination of how stored data is used, secured and managed. Cloud migration may or may not be ideal, depending on considerations such as data accessibility, control and ownership. Data-intensive applications close to the source strongly influence how and why storage choices are made; considerations include mission-critical data that is sensitive to latency and bandwidth, as well as compliance and privacy issues.
A healthy dose of skepticism about what can be shipped to a public cloud, or even kept in a hybrid environment, must be balanced with a better understanding of how on-premises options have evolved. High-performance storage close to the compute source is not only relevant, but also much less complex and costly than it once was, thanks in large part to the emergence of software-defined storage. For system engineers, all these factors are driving the need for a smarter, more strategic look at storage options.
Storage at the Edge Competes on Cost and Complexity
High-capacity storage is often pushed to the cloud because of a cost advantage. Today, however, that advantage may be more perception than reality: the landscape is changing dramatically with the advent of software-defined storage or hyperconvergence. The software-defined architecture hides the complexities of managing conventional storage and therefore helps reduce costs. Users can deploy and manage such systems themselves, reducing the need for a highly qualified IT specialist dedicated to full-time management of terabytes or petabytes of data. This removes a significant barrier to deployment, given that historically as much as 70 percent of TCO would have been related to the staff and physical labor required to maintain conventional storage on-site.
Even more compelling is the way software-defined storage shifts the focus away from the underlying storage hardware itself. System engineers need only define the parameters that optimize their application, such as storage requirements and response time. The software then determines application needs, allocates existing resources and decides where best to store data. This may include any combination of media, such as spinning drives, solid-state or flash, or even tape. Adding nodes or replacing SSDs has no impact on the upper-level application; with redundancy built in, the system will automatically switch to other resources without disrupting performance.
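To make the idea concrete, here is a minimal sketch of policy-driven placement: the engineer states capacity and response-time requirements, and the software picks a tier. The tier names, latencies, capacities and costs are hypothetical placeholders rather than measurements, and a real software-defined storage product will use its own, far richer policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float   # illustrative typical access latency
    free_gb: int        # capacity still available in this tier
    cost_per_gb: float  # illustrative relative cost


@dataclass(frozen=True)
class Requirement:
    capacity_gb: int
    max_latency_ms: float


def place(req, tiers):
    """Return the cheapest tier that meets the requirement, or None."""
    candidates = [
        t for t in tiers
        if t.latency_ms <= req.max_latency_ms and t.free_gb >= req.capacity_gb
    ]
    return min(candidates, key=lambda t: t.cost_per_gb, default=None)


# Hypothetical pool; every figure below is an assumption for illustration.
POOL = [
    Tier("flash", 0.1, 2_000, 0.50),
    Tier("disk", 10.0, 50_000, 0.05),
    Tier("tape", 60_000.0, 500_000, 0.01),
]

# The application states what it needs, not which hardware to use.
hot_data = place(Requirement(capacity_gb=500, max_latency_ms=1.0), POOL)
archive = place(Requirement(capacity_gb=10_000, max_latency_ms=50.0), POOL)
```

Under these assumed figures, the latency-sensitive request lands on flash and the larger, more tolerant request lands on disk, without the application naming either one.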
This is a critical advancement. In the past, system engineers would have had to ensure that system or application code was configured to rely on a particular type of storage. Without a strong understanding of storage technologies, systems could become unbalanced because of under- or over-provisioned hardware assets. Now, software-defined advancements allow optimized, scalable storage solutions; this enables engineers to consider how they want to manage storage, rather than which individual pieces of hardware they need.
Defining Performance is Step One
To determine storage requirements and how they can be ideally managed, system engineers must of course understand the performance needs of their application. This sounds obvious, but it is not necessarily easy to accomplish. As storage system engineers try to create an optimized storage solution, a daunting task in itself, the primary performance factors need to be evaluated one by one. For example, storage capacity represents a benchmark, but it is really only one piece of the puzzle. How the solution performs in terms of latency, throughput, data integrity and reliability is just as critical as raw capacity.
Consider an application such as training and simulation, which by definition is intended to mirror a real-life scenario through high-resolution video and graphics, or an even more critical application such as real-time situational awareness, where decisions are made quickly based on accurate and timely information. Protecting against latency in these settings is a central concern, yet it can be jeopardized if the need for compute-storage proximity is not assessed. Data that resides farther from its compute platform simply takes longer to arrive and feed the system; given the unpredictable nature of remote network connections, this model may not be reliable enough to ensure users get a fully responsive experience. When such applications can’t tolerate the potential for delay, localized storage becomes the option of choice.
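A back-of-the-envelope calculation shows why proximity matters for this kind of workload. The sketch below models a fetch as one network round trip plus transfer time and compares it with the time available per frame. The object size, link speeds and round-trip times are assumptions chosen for illustration, not measurements of any particular system.

```python
def fetch_time_ms(size_mb, bandwidth_mbps, round_trip_ms):
    """Approximate time to fetch one object: one round trip plus transfer."""
    transfer_ms = size_mb * 8 / bandwidth_mbps * 1000
    return round_trip_ms + transfer_ms


def frame_budget_ms(fps):
    """Time available to prepare each frame at a given frame rate."""
    return 1000 / fps


# Assumed scenario: a 1 MB asset needed for every frame of a 60 fps simulation.
budget = frame_budget_ms(60)  # roughly 16.7 ms per frame

# Assumed local path: 10 Gbps link, 0.1 ms round trip.
local_ms = fetch_time_ms(size_mb=1, bandwidth_mbps=10_000, round_trip_ms=0.1)

# Assumed remote path: 100 Mbps link, 40 ms round trip.
remote_ms = fetch_time_ms(size_mb=1, bandwidth_mbps=100, round_trip_ms=40)

print(f"frame budget {budget:.1f} ms, local {local_ms:.1f} ms, remote {remote_ms:.1f} ms")
```

With these assumed numbers the local fetch fits comfortably inside the frame budget, while the remote fetch exceeds it several times over, before any allowance for jitter on the remote link.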
Considering Mission-Critical Data
The same type of assessment must be applied to a slate of other considerations, such as the need for control and ready access to stored data, or enhanced data privacy and security. Cloud hosting inherently reduces control, and it also relies on network availability and bandwidth, which could be deemed a single point of failure. These may be unacceptable risks if data is mission critical or impacts revenue. In other scenarios, for example in fields such as healthcare and financial services, data environments are mandated to meet specific regulatory compliance requirements. Here a localized storage solution will more readily demonstrate the required data security and access control, helping engineers gain compliance ratings and customer confidence.
Because cloud computing is built on a virtualized platform, the actual place where data is stored or is in motion is also at times difficult to identify and trace. Despite cloud security advancements, security threats or data spills can be better managed with local data behind the firewall. Some data types are even prevented from crossing geographic borders by regulations meant to address data protection and ownership.
Insights at the Edge
With the growth of the IoT, operational data is being generated at an exponential rate, offering a new kind of business value enabled through real-time analytics. Crucial business insights, such as customer usage models or equipment maintenance requirements, largely depend on the speed with which data can be collected, analyzed and acted upon. Yet even with compression, big data is everywhere, with applications such as genome sequencers or even high-definition cameras producing a few megabits of data every second. In these industrial and other mission-critical applications, it may not make sense to move every piece of data to the cloud. Transporting such large data sets requires bandwidth with high QoS, which quickly racks up unnecessary costs.
Instead, a localized storage strategy efficiently supports computationally intensive operations performed at the edge. Only the analytics results are shared with the cloud, rather than moving all the data under costly bandwidth requirements.
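As a rough illustration of that pattern, the sketch below keeps raw readings at the edge, reduces them to a small summary, and compares the size of what would be uploaded in each case. The sensor values, the summary fields and the JSON encoding are assumptions made for the example; real analytics would be application specific.

```python
import json


def summarize(readings):
    """Reduce raw readings to the few figures the cloud actually needs."""
    return {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "min": min(readings),
        "max": max(readings),
    }


def payload_bytes(obj):
    """Size of an object when serialized as UTF-8 JSON for upload."""
    return len(json.dumps(obj).encode("utf-8"))


# Hypothetical batch of 10,000 sensor readings held in local storage.
readings = [20.0 + (i % 10) * 0.1 for i in range(10_000)]

summary = summarize(readings)
raw_size = payload_bytes(readings)     # cost of shipping everything
result_size = payload_bytes(summary)   # cost of shipping only the result

print(f"raw upload: {raw_size} bytes, summary upload: {result_size} bytes")
```

The raw data remains available locally for deeper analysis, while the uplink carries only a payload that is a small fraction of the original.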
A Changing Perspective
With the emergence of software-defined storage, the on-site versus cloud perspective has shifted. System engineers can now focus on how they want to serve clients and meet service level agreement (SLA) requirements, rather than on the storage and underlying hardware itself. Offload application data to the cloud and trust that it is going to work? Or keep it close at hand, hide the complexity and retain greater control? The answer lies in careful evaluation of your specific application’s needs for accessibility, control, data ownership, security and more. Compelling, cost-effective advantages can be gained with storage close to the source, such as reducing maintenance staff, improving data reliability and security, and addressing the physical challenges of latency and bandwidth in transmitting data. For meaningful, time-sensitive analytics, it is vital that the application and its high-performance storage capacity are in close proximity.
Organizations are finding it difficult to store and manage growing volumes of data. It’s a rising challenge with great impact on embedded design, given that embedded systems are generally where the data is being generated. Yet moving everything to the cloud may not be the only answer. It will be many years before we come close to knowing if that is even possible. The more likely reality is that the need for high-performance, localized storage will increase in step.
In conventional storage, a person or an application needs to be aware of all the specific hardware components. In the simplest terms, software-defined storage is a layer of abstraction that hides the complexity of the underlying compute, storage, and in some cases networking technologies.
In the software-defined model, storage systems are virtualized, pooled, aggregated and delivered as a software service to users. An organization then has a storage pool created from readily available COTS storage components, which offer a longer life cycle and lower OpEx and TCO over time. Software-defined systems even enable the creation and sharing of storage pools from the storage that is directly attached to servers. The storage management software may add further value by hiding the underlying complexity of managing high-performance, scalable storage solutions. Some also provide open APIs, enabling integration with third-party management tools and custom-built applications.
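The pooling idea can be sketched in a few lines. The toy model below presents several server-attached devices as one logical capacity and satisfies a request without the caller choosing a device. The class, the device names and the "largest free space first" rule are all hypothetical; they stand in for whatever placement logic a real product implements.

```python
class StoragePool:
    """Presents many devices as one logical capacity (hypothetical model)."""

    def __init__(self, devices):
        # devices: mapping of device name -> free capacity in GB
        self.free = dict(devices)

    @property
    def free_gb(self):
        return sum(self.free.values())

    def allocate(self, size_gb):
        """Reserve size_gb across devices; return {device: GB taken}."""
        if size_gb > self.free_gb:
            raise ValueError("pool exhausted")
        placement = {}
        remaining = size_gb
        # Draw from the device with the most free space first.
        for name in sorted(self.free, key=self.free.get, reverse=True):
            if remaining == 0:
                break
            take = min(self.free[name], remaining)
            if take:
                self.free[name] -= take
                placement[name] = take
                remaining -= take
        return placement


# Hypothetical direct-attached devices contributed by three servers.
pool = StoragePool({
    "server-a/ssd0": 400,
    "server-b/hdd0": 1000,
    "server-c/ssd0": 200,
})

total_before = pool.free_gb         # the caller sees one capacity figure
placement = pool.allocate(1200)     # the pool decides which devices to use
```

The caller asks for 1,200 GB and never names a device; the request is larger than any single device, so the pool spreads it across two of them.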
Evolution continues, and alongside the software-defined storage market there’s a general movement toward “software-defined everything.” For example, there is software-defined networking or software-defined virtual function. Hyperconvergence is a follow-on trend, which essentially converges compute, storage, virtualization, networking and bandwidth onto a single platform and defines it in a software context. The software handles all underlying complexity, leaving only simple tasks for administrators to manage and allowing clients to be served in a transparent and highly efficient manner.
Bilal Khan [firstname.lastname@example.org] is Chief Technology Officer, Dedicated Computing. Khan spearheads technology innovation, product development and engineering, and deployment of the company’s systems and services strategy. Dedicated Computing supplies embedded hardware and software solutions optimized for high-performance applications, with expertise in data storage, embedded design, security tools and services, software stack optimization and consulting, and cloud business infrastructure design and management.