At Fungible we’ve noted the confluence of several important technology trends that have been in play for some time. To us it’s clear that large-scale data centers are reaching a critical inflection point, due to three key factors:
- Moore’s Law limits are clearly visible. The historical 2x improvement every 18 to 24 months in the performance of compute, storage, and networking has slowed dramatically.
- Data is exploding. The amount of data we generate doubles every year; by 2020 we will have produced 44 trillion gigabytes – roughly one data bit for every star in the universe.
- AI is becoming ubiquitous. According to Gartner, enterprise adoption of machine learning and other data-hungry applications has tripled in the last year.
The slowing of Moore’s Law, the explosion of data, and the rapid growth of AI have created the need for new specialized silicon engines dedicated to modern data-hungry applications.
In response, the industry has begun to move from data centers built on scale-out homogeneous servers using only x86 CPUs, to data centers built on scale-out heterogeneous servers that combine x86 CPUs, graphics processing units (GPUs), and field-programmable gate arrays (FPGAs). These additional silicon engines are used to accelerate specific application workloads.
Yet while modern applications are increasingly data-centric, modern data center architecture remains stubbornly compute-centric – x86 CPUs sit at the center of every server, mediating all I/O to and from the network and the specialized silicon engines mentioned above.
The problem? Relying on the CPU to play traffic cop in your data center is inefficient and expensive. Between one third and one half of a general-purpose CPU’s processing power is consumed by shuttling data between storage, specialized compute elements, and the network. This ‘data center tax’ results in infrastructure that is both over-provisioned and under-performing. According to surveys by IDC, on average less than half of enterprise data center infrastructure is fully utilized. Network link utilization is often lower still.
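To see why this tax forces over-provisioning, consider a rough back-of-the-envelope model. The 30–50% figure comes from the text above; the target of 100 units of useful work and the one-unit-per-server capacity are purely illustrative assumptions.

```python
# Hypothetical model of the "data center tax": if a fraction `tax` of each
# CPU's cycles goes to shuttling data rather than running applications,
# how many servers are needed to deliver a given amount of useful work?
# (Each server is assumed to supply 1.0 units of raw compute.)

def servers_needed(useful_compute: float, tax: float) -> float:
    """Servers required to deliver `useful_compute` units of application
    work when fraction `tax` of each CPU is lost to data movement."""
    return useful_compute / (1.0 - tax)

# To deliver 100 units of useful application work:
for tax in (0.0, 1 / 3, 0.5):
    print(f"tax={tax:.0%}: {servers_needed(100, tax):.0f} servers")
# → tax=0%: 100 servers
#   tax=33%: 150 servers
#   tax=50%: 200 servers
```

In other words, at the high end of the tax range, an operator buys twice the hardware its applications actually use – exactly the over-provisioning the DPU is meant to eliminate.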
What the industry needs is a new category of processor that is fully programmable: the Data Processing Unit (DPU).
A DPU would sit between the network fabric and compute/storage elements, and handle data-centric workloads such as data transfer, data reduction, data security, data durability, data filtering, and analytics — all functions that general-purpose CPUs are not very good at. A DPU should execute these workloads an order of magnitude more efficiently than general-purpose silicon, and do so at much lower cost. A DPU should also enable a standards-based IP-over-Ethernet (IPoE), low-latency, non-drop fabric that scales from a single rack to thousands of racks. To minimize disruption while inserting the DPU, it should impose no changes to application software, no changes to server packaging, and no changes to the network.
A data center built using data engines such as DPUs will ultimately prove simpler and less costly to manage. There will be fewer server variants to maintain; storage and compute resources could be pooled and assigned dynamically as workload needs change. Congestion and packet loss will be minimized, allowing data center operators to more fully utilize their available link bandwidth without over-provisioning. Operators will no longer have to trade off reliability, agility, cost, and performance against one another.
I strongly believe that the DPU provides the industry’s most effective solution to these problems. In fact, we predict that within the next five years, a large percentage of servers will be data-centric, using a DPU as the transport hub between a server’s processors, storage, and network fabric. Complex, poorly utilized, multi-tier networks will be transformed into simple flat networks that are faster, more secure, and more reliable.