F5 Helps Service Providers and Enterprises Unlock Full Potential of AI Deployments with NVIDIA BlueField-3 DPUs

Published October 24, 2024

Over the past decades, the business world has faced many inflection points spurred by revolutions in technology, and F5 has been there to help our customers through these critical junctures.

When organizations began to embark on their digital transformations, applications became the heart of the business, and F5 made sure they could be delivered and secured at scale. More recently, when 5G promised to revolutionize the business world with unprecedented speeds, services and reliability, F5 was there to help mobile companies deploy cloud-native 5G core at scale.

Now, once again, we’re at an inflection point, probably the biggest our industry has faced, as organizations look for ways to embrace the power of AI. As customers implement this transformative technology, F5 is helping them unlock the full potential of their large-scale AI deployments.

The difficulty of achieving optimal performance

The increasing adoption of AI clusters is driving the transformation toward accelerated computing. Trying to use established practices in general-purpose computing, networking, security, and monitoring often results in inefficiencies, delays, and rising costs.

The immense data processing requirements of AI place considerable strain on traditional network infrastructure, making it difficult to maintain optimal performance. The NVIDIA BlueField data processing unit (DPU) has emerged as a key solution. By offloading and accelerating high-bandwidth network and security tasks—such as packet processing, encryption, and compression—BlueField-3 DPUs deliver optimal cloud network connectivity. This optimization enhances overall performance and accelerates graphics processing unit (GPU) access to data.

Service providers and large enterprises are building out large-scale AI infrastructures or AI factories, using NVIDIA’s full-stack accelerated computing platform to perform generative AI model training and inferencing at scale. Businesses need to maximize their investments in AI factories, which can be significant. Yet without the right foundation, AI infrastructure can be underutilized. 

Efficiently managing the vast traffic directed to AI servers

F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs is designed to address these issues. The solution focuses on offloading and accelerating F5 BIG-IP Next Service Proxy for Kubernetes (SPK) on NVIDIA’s BlueField-3 DPUs. It builds on F5's leadership in addressing critical application delivery and security challenges during key market inflections, while leveraging NVIDIA’s innovations in accelerated computing and high-performance networking. 

F5 BIG-IP Next SPK was developed to solve the problems service providers faced with Kubernetes as they transitioned to 5G. 5G infrastructure is built on a cloud-native containerized architecture, with container workloads managed using Kubernetes. Yet, Kubernetes wasn’t originally intended for the complex use cases required of a 5G environment. BIG-IP Next SPK helped telcos tailor Kubernetes networking for a 5G infrastructure, giving them the visibility, control, and security they needed to dynamically scale their 5G networks. Over the past several years, service providers have used BIG-IP to bring 5G technology to life for millions of subscribers.  

Just as BIG-IP Next SPK played a pivotal role in enabling 5G Core for the last market inflection, it’s evolving now to address the challenges of the AI market inflection and AI workload delivery, which share similarities with 5G workloads, but involve exponentially greater traffic volumes. To meet the demands of this new market inflection, F5 is releasing BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs to effectively manage the vast traffic directed to AI servers.

This solution transforms modern application delivery to meet the demands of generative AI. It's a Kubernetes-native implementation of F5's BIG-IP platform that handles networking, security, and load balancing workloads, sitting at the demarcation point between the AI cluster and the rest of the data center. BIG-IP Next for Kubernetes maps AI cluster namespaces to data center network tenancy, delivering proper security and simplified management. By taking advantage of the BlueField-3 DPU’s hardware accelerators, BIG-IP Next for Kubernetes accelerates a variety of networking and data services, improving energy efficiency by moving that work off the CPU and freeing its compute resources.

For example, at its Networking @Scale 2024 event earlier this year, Meta mentioned that the training of its open-source large language model (LLM) Llama 3 was hindered by network latency, which was addressed by tuning hardware-software interactions. This approach increased overall performance by 10%. While 10% may seem like a small gain, for a model that takes months to train, this improvement translates to weeks of saved time.
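To make that arithmetic concrete, here is a minimal sketch in Python. It assumes the 10% figure means a 10% increase in training throughput, so the same job finishes in 1/1.10 of the original time. The baseline durations and the function name are illustrative assumptions, not figures reported by Meta.

```python
def training_time_saved(baseline_days: float, throughput_gain: float) -> float:
    """Return days saved when throughput rises by `throughput_gain` (0.10 = 10%).

    Assumption: total work is fixed, so run time scales as 1 / (1 + gain).
    """
    if baseline_days < 0 or throughput_gain <= -1:
        raise ValueError("baseline_days must be >= 0 and throughput_gain > -1")
    new_days = baseline_days / (1.0 + throughput_gain)
    return baseline_days - new_days


# Hypothetical baseline run lengths, using 30-day months for simplicity.
for months in (3, 6, 12):
    baseline_days = months * 30
    saved_days = training_time_saved(baseline_days, 0.10)
    print(f"{months:>2} months: saves {saved_days:.1f} days "
          f"({saved_days / 7:.1f} weeks)")
```

Under these assumptions, a 90-day run saves roughly 8 days and a 180-day run roughly 16 days, so the savings reach multiple weeks once training runs for several months.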

Reducing the complexity of AI deployments

F5 BIG-IP Next for Kubernetes deployed on BlueField-3 DPUs offers multiple benefits for service providers and large enterprises looking to build out large-scale AI infrastructures. These include:

  • Simplified integration: Until now, organizations faced the complexity of piecing together software components from different vendors to deliver and secure their AI applications. BIG-IP Next for Kubernetes combines networking, security, traffic management, and load balancing into a single solution, reducing the complexity of AI deployments. It also offers an integrated view of these functions across AI infrastructure, along with the rich observability and granular control needed to optimize AI workloads.
  • Enhanced security: BIG-IP Next for Kubernetes supports critical security features and zero trust architecture, including edge firewall, distributed denial-of-service (DDoS) mitigation, API protection, intrusion prevention, encryption, and certificate management—offloading these functions to the DPU and freeing up valuable CPU resources.
  • Improved performance: BIG-IP Next for Kubernetes accelerates networking and security, which is critical to meeting the demands placed on AI infrastructure when delivering applications at cloud scale.
  • Multi-tenancy support: BIG-IP Next for Kubernetes enables a multi-tenant architecture so service providers can securely host multiple users on the same AI infrastructure, while keeping their AI workloads and data separate.

Successfully delivering AI-optimized data centers

By carefully considering the challenges and available solutions, organizations can successfully deliver data centers optimized for AI without disrupting existing operations or compromising security. F5 BIG-IP Next for Kubernetes deployed on BlueField-3 DPUs emerges as a compelling option, providing seamless integration, enhanced security, and improved performance for AI workloads, including large-scale LLMs like Llama 3.

To learn more, read our press release and NVIDIA’s blog post.