A data processing unit (DPU) is a specialized processor designed to offload and accelerate data-centric tasks, freeing central processing units (CPUs) to run application-specific workloads. Built to handle high-speed networking, storage requests, and security processing, DPUs are well suited to modern high-density data centers and high-performance computing (HPC) demands.
DPUs and their counterparts, infrastructure processing units (IPUs), answer the need to offload common, throughput-intensive tasks from CPUs. Offloading encryption, storage I/O operations, and high-bandwidth packet processing frees CPUs to focus on the dense application workloads that containerized applications, cloud and hypervisor partitioning, and compute-intensive artificial intelligence (AI) tasks demand.
Key DPU functions include:
Optimizing CPU performance for application-specific tasks in hyperconverged infrastructure (HCI) and HPC environments is increasingly important as compute density and power usage become the new metrics for infrastructure cost-benefit analysis. Advances in networking speed, latency reduction, and storage performance, along with the need to serve compute resources to more users, further tax CPUs with non-application tasks. The currently accepted measures of success, adopted from the HPC industry, are defined by CPU density and performance.
Common processing-power ratios include, but are not limited to:
Long used in HPC to measure supercomputer performance at launch and over time, these measurements are increasingly applied to traditional data centers as the technology of the two industries continues to converge.
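As an illustrative sketch (not from the source), one widely used ratio of this kind is performance per watt, the basis of the Green500 supercomputer ranking. The figures below are hypothetical; real values come from standardized benchmarks such as LINPACK.

```python
def flops_per_watt(peak_flops: float, power_watts: float) -> float:
    """Performance-per-watt ratio, the metric behind the Green500 list."""
    return peak_flops / power_watts

# Hypothetical node: 40 TFLOPS of peak throughput at 800 W of draw.
node_tflops = 40e12
node_watts = 800.0

ratio = flops_per_watt(node_tflops, node_watts)
print(f"{ratio / 1e9:.1f} GFLOPS/W")  # 40e12 / 800 W = 50.0 GFLOPS/W
```

The same arithmetic generalizes to other density ratios mentioned in HPC practice, such as performance per rack unit or per core.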
DPUs increase CPU availability for application- and compute-intensive pipelines, which can bottleneck when the CPU must also handle lower-level, non-compute tasks. This burden compounds as density and application load grow. By adding DPUs to data center infrastructure, CPUs are freed to deliver better per-core performance. Alternatively, compute resources can be partitioned and tenanted so more users can access system resources.
Building on its success with SmartNIC, ASIC, and FPGA technologies, F5 leverages the DPU's processing power and inline position in the traffic path to increase the workload capacity, performance, and security of HCI/HPC infrastructures.
Taking advantage of NVIDIA BlueField-3 DPUs, F5 offers multiple benefits for service providers and large enterprises looking to build out large-scale compute capacity while maximizing resource utilization. These include, but are not limited to:
1Standard measurements for scientific HPC workloads traditionally used single- or double-precision floating point (FP32 and FP64). Current AI trends measure performance at half precision or lower (e.g., FP16). Lower-precision data types (floating point and integer) allow for faster training and a smaller memory footprint for language models.
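The footprint savings the footnote describes follow directly from the per-element size of each data type. A minimal sketch using NumPy (an illustrative example, not from the source; the one-billion-parameter count is hypothetical):

```python
import numpy as np

# Memory footprint of a hypothetical one-billion-parameter model
# stored at different floating point precisions.
n_params = 1_000_000_000

for dtype in (np.float64, np.float32, np.float16):
    # itemsize is the number of bytes per element: 8, 4, and 2 respectively.
    bytes_total = n_params * np.dtype(dtype).itemsize
    print(f"{np.dtype(dtype).name}: {bytes_total / 1e9:.1f} GB")
# float64: 8.0 GB
# float32: 4.0 GB
# float16: 2.0 GB
```

Halving the precision halves the memory required, which is why FP16 (and narrower) formats have become common for AI training and inference.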