Today, F5 is announcing general availability of F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs, enhancing resource management in AI factory cloud data centers while achieving optimal AI application performance. The integrated solution boosts infrastructure efficiency and delivers high-performance networking, security, and traffic management to support innovative use cases, including GPU as a Service (GPUaaS) and inferencing capabilities.
Integrating BIG-IP Next for Kubernetes with NVIDIA BlueField-3 DPUs addresses pressing challenges that organizations face in implementing cloud-scale AI infrastructures. Large-scale AI workloads involve massive data processing that requires high-performance computing resources to analyze, interpret, and extract insights in real time. This places considerable strain on traditional network infrastructure, inhibiting performance and risking processing inefficiency and inference delays.
Performance in industry-defining environments
F5, NVIDIA, and SoftBank recently collaborated on a session at NVIDIA GTC 2025 to showcase the value of a combined solution. During the session, SoftBank shared game-changing insights on how organizations can turbocharge cloud-native AI workloads with DPU-accelerated service proxy for Kubernetes. The session featured SoftBank's calculations and performance metrics from their recent proof-of-concept for F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs. SoftBank achieved an 18% increase in HTTP throughput (77 Gbps), an 11x improvement in time-to-first-byte (TTFB), and a staggering 190x boost in network energy efficiency. These results highlight the transformative potential of DPU acceleration for modern cloud-native environments, driving improved throughput of tokens and enhanced user experiences during AI inferencing.
Less complexity, optimized performance, and enhanced security
NVIDIA BlueField-3 DPUs are designed for the most demanding infrastructure workloads, from accelerated AI and 5G wireless networks to hybrid cloud and high-performance computing. The combined solution leverages the F5 Application Delivery and Security Platform to accelerate, secure, and streamline data traffic as it flows in and out of AI infrastructures, greatly improving the efficient processing of large-scale AI workloads. By delivering optimized traffic management, the solution enables greater data ingestion performance and server utilization during AI inferencing, leading to better experiences for users of AI apps.
BIG-IP Next for Kubernetes significantly eases the complexity of integrating multiple elements of enterprise AI infrastructure by unifying networking, security, traffic management, and load balancing functions to provide comprehensive visibility across multicloud environments, with heightened observability for AI workloads. The solution supports critical security features for zero trust architectures, API protection, intrusion prevention, encryption, and certificate management. With general availability, hardware-accelerated distributed denial-of-service (DDoS) mitigation has been added, along with edge firewall capabilities, promoting faster and more efficient cyber protection. The solution also automates the discovery and securing of AI model training and inferencing endpoints, empowering organizations to isolate AI applications from targeted threats while bolstering data integrity and sovereignty.
In addition, the integration of BIG-IP Next for Kubernetes and NVIDIA BlueField-3 DPUs enables a multi-tenant architecture that can securely host multiple users on the same AI clusters, while keeping their AI workloads, data, and traffic separate.
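As a conceptual illustration only (not F5's implementation), the isolation property described above can be sketched as routing that refuses to cross tenant boundaries. All class and field names below are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Tenant:
    """Hypothetical tenant record: each tenant owns an isolated namespace."""
    name: str
    namespace: str
    workloads: list = field(default_factory=list)

class MultiTenantRouter:
    """Toy sketch of multi-tenant routing on a shared cluster: traffic for
    one tenant can only reach that tenant's own workloads."""

    def __init__(self):
        self._tenants = {}

    def register(self, tenant: Tenant):
        self._tenants[tenant.name] = tenant

    def route(self, tenant_name: str, workload: str) -> str:
        tenant = self._tenants[tenant_name]
        if workload not in tenant.workloads:
            # Cross-tenant access is rejected, keeping workloads separate.
            raise PermissionError(f"{workload!r} not owned by {tenant_name!r}")
        return f"{tenant.namespace}/{workload}"

router = MultiTenantRouter()
router.register(Tenant("acme", "ns-acme", ["train-llm"]))
router.register(Tenant("beta", "ns-beta", ["inference"]))
endpoint = router.route("acme", "train-llm")  # resolves within acme's namespace
```

In a real deployment this separation is enforced in the data path by the DPU-accelerated proxy rather than in application code; the sketch only shows the isolation contract.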
Compelling new use cases to help customers embrace AI
Together, F5 and NVIDIA not only improve infrastructure management and efficiency but also enable faster, more responsive AI inferencing to deliver emerging use cases, such as:
- GPU as a Service (GPUaaS) provides cloud-based, on-demand access to GPUs for a variety of computing tasks, including AI model training, scientific simulations, and rendering. The service allows organizations to rent GPU computing resources from cloud providers on a pay-as-you-go or subscription basis, paying for GPUs only when needed and maximizing the amount of GPU computing they can get per dollar spent. The integration of BIG-IP Next for Kubernetes with NVIDIA BlueField-3 DPUs enables secure multi-tenancy with granular tenant isolation, which is critical for GPUaaS scenarios because it allows multiple users or organizations to securely and efficiently share GPU resources while running concurrent workloads. By splitting the GPU service into multiple secure instances, granular multi-tenancy isolates different tenants and workloads to prevent data leakage and security risks. It also allows dynamic resource allocation, which ensures that each workload receives the necessary GPU and network resources without over-provisioning.
- Inferencing services, in which specialized cloud-based AI platforms provide optimized environments for efficiently running inference on trained AI models. Distinct from GPUaaS, which provides raw GPU power, inferencing services are fine-tuned for streamlined model deployment. Examples of these services include operating chatbots, implementing fraud detection, performing research, and carrying out similar AI-powered tasks. Inferencing services are also used to optimize image recognition and autonomous driving scenarios, along with natural language processing for voice assistants or sentiment analysis. BIG-IP Next for Kubernetes and NVIDIA BlueField-3 DPUs maximize inferencing performance and reduce end-to-end latency by running multiple models concurrently. Inferencing services based on the combined F5 and NVIDIA solution can also dynamically scale resources to handle fluctuating workloads and demand.
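To make the "multiple models concurrently" idea in the bullets above concrete, here is a minimal sketch (my illustration, not the product's internals) using Python's asyncio; the model names and latencies are invented for the example:

```python
import asyncio

# Hypothetical model registry: name -> simulated per-request latency (seconds).
MODELS = {"chatbot": 0.01, "fraud-detect": 0.02, "sentiment": 0.015}

async def infer(model: str, payload: str) -> str:
    """Simulate inference on one model; a real service would invoke the
    deployed model here instead of sleeping."""
    await asyncio.sleep(MODELS[model])
    return f"{model}:{payload}:ok"

async def serve_batch(requests):
    """Dispatch requests to different models concurrently rather than
    serially, reducing end-to-end latency for mixed workloads."""
    return await asyncio.gather(*(infer(m, p) for m, p in requests))

results = asyncio.run(serve_batch([("chatbot", "hi"), ("fraud-detect", "txn42")]))
```

Because `asyncio.gather` overlaps the waits, the batch completes in roughly the time of the slowest model rather than the sum of all of them, which is the latency benefit concurrent serving targets.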
For both GPUaaS and inferencing services, granular observability is a critical requirement. BIG-IP Next for Kubernetes provides a centralized and fully integrated view that offers rich visibility across the AI ecosystem to monitor performance and resilience, with the ability to instantly apply security features to enforce data privacy, prevent unauthorized access, and isolate anomalies.
For more information, explore the product page or contact your F5 account team to discuss BIG-IP Next for Kubernetes for your organization’s AI infrastructure. F5’s focus on AI doesn’t stop here—explore how F5 secures and delivers AI apps everywhere.