In 1990, it cost roughly $9,000 to store 1 gigabyte (GB) of data; today, it costs less than 3 cents. Over the last decade, it has become normal to assume that the cost of storage is so negligible it’s “virtually free.” But even though storage is almost free to users, data center operators are still spending billions on it every year – and that cost is only going to increase, spurred not only by the explosion in the amount of data created each year but also by more and more stringent durability and availability requirements.
According to Cisco’s Global Cloud Index, worldwide data center storage capacity will grow to 2.6 zettabytes (ZB) by 2021, up from 663 exabytes (EB) in 2016. That’s nearly a fourfold increase. More than half of that capacity will reside on hard drives and about a quarter on SSDs, per IDC’s Global Datasphere report.
“Virtually free” storage is in fact an expensive line item in a data center budget.
The nature of data is changing
In the not-so-distant past, data centers were filled with storage to support applications that ran on servers. Data was written to disk and then rarely accessed again.
But with today’s modern applications, the world is a very different place:
- Microservices deployed in a scale-out fashion are replacing monolithic applications.
- Data volume is huge and data movement between nodes is increasing.
- Services require high throughput and low latency storage at scale.
- The overall data temperature is rising – i.e. the volume of real-time hot data is increasing.
Organizations are under pressure to cope with these needs and, at the same time, drive down costs.
Data Reduction: Innovations in compression algorithms
This is why we’ve seen the emergence of next-generation compression solutions. For text and binary data, algorithms such as Facebook’s Zstandard, Google’s Brotli, and Microsoft’s Project Zipline all offer higher compression ratios than standard deflate-based algorithms. However, more than 50% of data in cloud storage today consists of pictures and videos, and these general-purpose algorithms achieve little or no further reduction on already-compressed JPEG or MPEG files. One approach cloud vendors have taken is to introduce lossy compression algorithms for images, such as Google’s Guetzli, that can save 20% to 30% on storage. Another approach, taken by Dropbox, is to deploy Lepton, a lossless compression algorithm for JPEG that saves up to 22% but achieves a compression throughput of only 40 Mbps.
Even a small improvement in compression ratio results in huge cost savings in storage and network bandwidth. These savings easily outweigh the additional cost in CPU cycles and power/cooling required to run the compression algorithms. Unfortunately, each of these schemes also comes with a performance trade-off: the greater the compression ratio, the lower the throughput.
Because of these throughput constraints, the algorithms are typically applied to data at rest, not data in motion. To maximize cost reductions by compressing data in motion as well, compression must sustain throughput at line rate.
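The ratio-versus-throughput trade-off described above can be observed with almost any general-purpose compressor. The following is a minimal sketch using Python's built-in `zlib` (a deflate implementation) at three effort levels. The sample payload and the helper names `make_sample` and `measure` are hypothetical and chosen only for illustration; absolute numbers will vary by machine and data, and this says nothing about how Zstandard, Brotli, or Zipline behave.

```python
import time
import zlib
from typing import Tuple


def make_sample(repeats: int = 20000) -> bytes:
    """Build a repetitive, log-like payload (made-up data, for illustration only)."""
    lines = [
        f"ts={i} level=INFO service=storage-node-{i % 7} msg=write_ok bytes={i * 37 % 4096}\n"
        for i in range(repeats)
    ]
    return "".join(lines).encode("utf-8")


def measure(data: bytes, level: int) -> Tuple[float, float]:
    """Return (compression ratio, throughput in MB/s) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(compressed)
    throughput_mb_s = len(data) / elapsed / 1e6
    return ratio, throughput_mb_s


if __name__ == "__main__":
    sample = make_sample()
    # Level 1 favors speed, level 9 favors ratio, level 6 is zlib's default.
    for level in (1, 6, 9):
        ratio, mb_s = measure(sample, level)
        print(f"level={level} ratio={ratio:.2f} throughput={mb_s:.1f} MB/s")
```

On typical hardware the higher levels tend to report a better ratio and a lower throughput, which is the pattern the text describes.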
Data Durability and Availability: Replication vs. Erasure Coding
Today’s data centers demand many 9s of durability and availability. Data replication (or mirroring) is one of the most basic ways to offer durability and availability. This scheme makes identical copies of data and stores them in different failure domains. The compute requirement to replicate data is relatively small and this scheme offers the fastest recovery time. However, replication results in higher storage costs, as it is not uncommon for data to be replicated two times or more.
Parity encoding is another well-known scheme that provides durability and availability at a much lower storage overhead. An example of a parity encoding scheme is erasure coding, where multiple data and parity fragments are distributed across different failure domains. The number of parity fragments determines the durability factor. Erasure coding schemes require low storage capacity overhead but have higher compute and networking requirements, especially when data must be reconstructed from fragments in different locations after a failure. Thus, high compute throughput and low network latency are key requirements for successfully implementing erasure coding.
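To make the idea of data and parity fragments concrete, here is a minimal sketch of the simplest possible parity scheme: `k` data fragments plus a single XOR parity fragment, which can rebuild any one lost fragment. This is an assumption-laden illustration, not the scheme any particular product uses; production erasure codes (Reed-Solomon, for example) support multiple parity fragments. The function names `encode` and `reconstruct` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad the final fragment
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def reconstruct(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) by XOR-ing the survivors."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("a single parity fragment tolerates only one loss")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


if __name__ == "__main__":
    stripe = encode(b"hello erasure coding", k=4)
    stripe[2] = None  # simulate losing one failure domain
    restored = reconstruct(stripe)
    # A real system would store the original length instead of stripping padding.
    print(b"".join(restored[:4]).rstrip(b"\0"))
```

Note that rebuilding the lost fragment requires reading every surviving fragment, which is why reconstruction puts pressure on compute and the network.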
The most common practice today is to replicate data in real time (i.e., in the context of read and write commands), but lazily erasure code data at rest. This is because current solutions cannot support erasure coding in the read and write path at latencies that are acceptable to applications.
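The storage cost gap that motivates moving replicated data to erasure-coded form can be put in rough numbers. The sketch below compares the extra raw capacity each scheme needs per byte of user data. The specific parameters (three copies, a 10+4 layout) are assumptions picked for illustration, and the fault-tolerance figures assume a code such as Reed-Solomon in which any `m` fragments may be lost.

```python
def replication_overhead(copies: int) -> float:
    """Extra raw capacity per byte of user data when keeping `copies` full copies."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return float(copies - 1)


def erasure_overhead(k: int, m: int) -> float:
    """Extra raw capacity per byte of user data with k data and m parity fragments."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return m / k


if __name__ == "__main__":
    # Hypothetical configurations, chosen only to illustrate the gap.
    print(f"3 copies:   {replication_overhead(3):.0%} overhead, survives 2 losses")
    print(f"10+4 coded: {erasure_overhead(10, 4):.0%} overhead, survives 4 losses")
```

Under these assumptions, triple replication costs 200% extra capacity while the 10+4 layout costs 40%, at the price of the heavier reconstruction path described above.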
Resource pooling at massive scale
Another way to lower storage cost is to improve capacity utilization. This can be done by pooling storage resources into dynamically allocated virtual pools, which can be accessed by many clients. In his PhD thesis, Peter J. Denning showed that combining N separate pools of a resource of 1 unit each into a single pool provides the same service level with just √N units of the resource instead of N units. In other words, the larger the shared pool, the more significant the storage savings.
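As a back-of-the-envelope illustration, the sketch below takes the square-root relationship stated above at face value and computes the implied savings for a few pool counts. The function names are hypothetical, and the model is deliberately simplistic: it ignores workload correlation, headroom policies, and anything else a real capacity plan would include.

```python
import math


def pooled_units(n_pools: int) -> float:
    """Units needed after merging n_pools one-unit pools, per the sqrt(N) claim."""
    if n_pools < 1:
        raise ValueError("need at least one pool")
    return math.sqrt(n_pools)


def savings_fraction(n_pools: int) -> float:
    """Fraction of capacity saved relative to n_pools separate one-unit pools."""
    return 1 - pooled_units(n_pools) / n_pools


if __name__ == "__main__":
    for n in (4, 100, 10000):
        print(f"N={n}: {pooled_units(n):.0f} units pooled, "
              f"{savings_fraction(n):.0%} saved")
```

For example, under this model 100 separate pools would shrink to 10 units when combined, a 90% reduction, and the saved fraction keeps growing with N.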
Today, while resource pooling can be done in hyperconverged infrastructure (HCI), access to direct-attached SSDs is still constrained by CPU bottlenecks. High, unpredictable latencies through the CPUs result in complex software, ultimately limiting performance and scale. Resource pooling can be much better realized through a disaggregated infrastructure, where compute and storage elements are physically located in different servers. By decoupling storage from compute, CPU bottlenecks are reduced and latencies become more uniform, which simplifies data placement.
At Fungible, we believe that a disaggregated storage architecture is a natural fit to implement (i) parity schemes such as erasure coding, enabling distribution of data and parity codes across different failure domains, and (ii) large-scale shared storage pools.
However, until now, disaggregated storage has not achieved its full potential because of CPU inefficiencies, limited fabric performance, and legacy software limitations, among other factors.
Fungible’s Data Processing Unit (DPU)
To break free of these limitations, Fungible has defined and designed a new class of programmable microprocessor known as the Data Processing Unit. The DPU is purpose-built from the ground up not only to keep storage costs in check, but also to provide the performance and scalability that are lacking in today’s compute-centric architectures.
The DPU was designed with the following principles in mind:
- Compression ratio and throughput should not have to be traded off against each other. Compression algorithms must be lossless for text/binary as well as for images.
- Data durability using erasure coding schemes must be supported at the throughput and latencies required by modern applications in the context of reads and writes.
- Resource pooling must be supported at the throughput and latencies required by modern applications and must be achievable at massive scale across the network.
Storage may never be free, but it can be so much cheaper with Fungible’s DPU solution.