F5 BIG-IP Next for Kubernetes joins the NVIDIA Enterprise AI Factory validated design

F5 Ecosystem | January 05, 2026

At NVIDIA GTC and again at CES, NVIDIA has been clear about where enterprise AI is headed: AI factories—purpose-built, validated infrastructure and software stacks designed to deliver predictable performance, lower costs, and operational confidence for AI inference and training.

NVIDIA Enterprise AI Factory validated design brings this vision to life, combining NVIDIA accelerated computing, networking, software, and orchestration into a full-stack validated design that enterprises can deploy on-premises with confidence. As part of this architecture, NVIDIA continues to expand its ecosystem of validated partners that solve real, production-scale challenges across AI infrastructure.

Today, we’re excited to share that F5 BIG-IP Next for Kubernetes has been validated to run on NVIDIA RTX PRO Servers featuring NVIDIA BlueField-3 DPUs and is now included in the NVIDIA Enterprise AI Factory validated design. The more than 20,000 organizations deploying F5 BIG-IP today can seamlessly extend the BIG-IP capabilities they already trust as they deploy NVIDIA Enterprise AI factories.

F5 BIG-IP Next for Kubernetes on NVIDIA BlueField combines high-performance traffic management, security, and AI-aware controls into a validated, enterprise-ready solution.

NVIDIA Enterprise AI Factory validated design is built around repeatable building blocks that ensure predictable performance, security, and scalability as enterprises move AI into production. Rather than assembling point solutions, the architecture defines how GPUs, DPUs, networking, and software work together as a cohesive system. F5’s inclusion in this validated design reinforces that approach.

Why AI factories need a full-stack approach

AI inference performance isn’t just about faster GPUs. It also depends on the full stack around them. In production environments, networking, security, and traffic management increasingly determine how quickly tokens are generated, how consistently models respond, and how efficiently infrastructure is used.

As AI services scale across tenants, models, and users, enterprises must manage a growing set of operational concerns. This is particularly critical for organizations deploying on-premises or sovereign AI environments, where performance, governance, and control cannot be delegated to public cloud services.

These needs include reclaiming host CPU cycles that today are consumed by networking and security tasks, delivering consistent latency for optimal time-to-first-token (TTFT), gaining visibility and control over token usage, and enforcing governance, fairness, and compliance.
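
For readers who want to see what TTFT means in practice, here is a minimal measurement sketch against a hypothetical OpenAI-compatible streaming endpoint; the endpoint URL and model name are placeholders, not part of BIG-IP Next for Kubernetes or the validated design.

```python
import time
import requests  # pip install requests

# Hypothetical OpenAI-compatible streaming endpoint; replace with your own.
ENDPOINT = "http://inference.example.internal/v1/chat/completions"

def measure_ttft(prompt: str) -> float:
    """Return seconds from sending the request to receiving the first streamed chunk."""
    payload = {
        "model": "example-model",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,            # ask the server to stream tokens as they are generated
    }
    start = time.perf_counter()
    with requests.post(ENDPOINT, json=payload, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if line:  # first non-empty chunk marks the first token(s)
                return time.perf_counter() - start
    raise RuntimeError("stream ended before any tokens arrived")

if __name__ == "__main__":
    print(f"TTFT: {measure_ttft('Hello!') * 1000:.1f} ms")
```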

This is where DPUs—and the software that runs on them—become foundational to the AI factory.

Offloading the right work to the right silicon

F5 BIG-IP Next for Kubernetes accelerated on NVIDIA BlueField-3 DPUs offloads critical networking and security functions from the host CPU to the DPU’s programmable ARM cores and hardware acceleration engines.

These offloaded services include load balancing and traffic steering, TLS termination and encryption, firewall and security policy enforcement, and API protection and intrusion detection.

By moving these functions onto the NVIDIA BlueField DPUs, host CPUs are freed for general-purpose workloads, while GPUs remain fully focused on AI inference.

The results are measurable and material—more than a 30% increase in token generation throughput and up to a 60% reduction in TTFT.

These gains translate directly into faster responses, higher model utilization, and improved infrastructure efficiency—exactly what enterprise AI factories demand.

Beyond offload: token governance and intelligent LLM routing

Offloading is only the starting point.

The programmable data plane within BIG-IP Next for Kubernetes, running on NVIDIA BlueField DPUs, enables advanced AI-aware services that go beyond traditional networking.

Token governance built in

BIG-IP Next for Kubernetes introduces native token governance capabilities that allow enterprises to count and track tokens per tenant, per user, or per model; enforce token rate limits and usage policies; and support compliance, chargeback, and fairness requirements.
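
As a conceptual illustration only, not BIG-IP Next for Kubernetes configuration or APIs, the sketch below shows the kind of per-tenant token accounting and rate limiting this implies, using a simple token bucket keyed by tenant; the rates, capacities, and helper names are assumptions.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class TenantBucket:
    """Per-tenant token budget, refilled continuously at `rate` tokens per second."""
    rate: float
    capacity: float
    tokens: float = -1.0
    last_refill: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        if self.tokens < 0:
            self.tokens = self.capacity  # start each tenant with a full bucket

    def allow(self, requested: int) -> bool:
        """Refill based on elapsed time, then admit only if enough budget remains."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= requested:
            self.tokens -= requested
            return True
        return False

# Assumed per-tenant limits plus a running usage counter for chargeback and fairness reports.
buckets = defaultdict(lambda: TenantBucket(rate=1000, capacity=8000))
usage = defaultdict(int)

def admit(tenant: str, token_count: int) -> bool:
    """Admit or reject a request that would consume `token_count` tokens for `tenant`."""
    if buckets[tenant].allow(token_count):
        usage[tenant] += token_count  # track consumption per tenant
        return True
    return False
```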

As token-based pricing becomes the dominant economic model for AI, governance at the infrastructure layer is essential.

Intelligent LLM routing with NVIDIA NIM

Through integration with NVIDIA NIM microservices, BIG-IP Next for Kubernetes can dynamically route inference requests to the most appropriate model based on query complexity, performance requirements, or policy constraints.

This enables faster responses for simple queries, optimal model utilization across diverse workloads, and policy-driven routing aligned to cost, performance, and compliance goals.
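
To make the routing idea concrete, here is a sketch of complexity-based model selection under assumed model names and a deliberately naive heuristic; it is not the NVIDIA NIM or BIG-IP routing API, just an illustration of the policy logic.

```python
from dataclasses import dataclass

@dataclass
class Route:
    model: str       # placeholder model identifiers, not actual NIM endpoints
    max_cost: float  # relative cost ceiling this tier is allowed to incur

# Assumed two-tier policy: a small, fast model for simple queries, a larger one otherwise.
ROUTES = {
    "simple": Route(model="small-llm", max_cost=1.0),
    "complex": Route(model="large-llm", max_cost=10.0),
}

def classify(prompt: str) -> str:
    """Naive complexity heuristic: long or multi-question prompts count as 'complex'."""
    if len(prompt.split()) > 200 or prompt.count("?") > 1:
        return "complex"
    return "simple"

def route(prompt: str, compliance_tier: str = "standard") -> Route:
    """Choose a route from query complexity, with a policy override for regulated tenants."""
    if compliance_tier == "regulated":
        return ROUTES["complex"]  # policy-driven: regulated traffic always uses the governed model
    return ROUTES[classify(prompt)]

if __name__ == "__main__":
    print(route("What is the capital of France?").model)  # -> small-llm
```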

Together, token governance and intelligent routing transform BIG-IP Next for Kubernetes from a networking component into a control plane for AI inference traffic.

A validated building block of the NVIDIA Enterprise AI Factory

NVIDIA RTX PRO Servers, complemented by NVIDIA BlueField-3 DPUs, represent the key accelerated computing platform components of this full-stack validated design for enterprise AI factories.

With BIG-IP Next for Kubernetes now validated on NVIDIA BlueField and included in the Enterprise AI Factory design, customers gain a production-ready networking and security layer that delivers deterministic performance for AI workloads. They also get improved GPU and CPU efficiency and built-in controls for token economics and governance.

This validation reinforces a shared vision between F5 and NVIDIA: AI infrastructure must be designed, not assembled.

Looking ahead

As enterprises move from AI experimentation to production AI factories, the infrastructure stack must evolve to support performance, efficiency, and governance at scale. As NVIDIA continues to advance the Enterprise AI Factory, F5 will expand its role across traffic management, security, and AI-aware controls to support next-generation inference platforms.

F5 BIG-IP Next for Kubernetes on NVIDIA BlueField DPUs delivers exactly that—combining high-performance traffic management, security, and AI-aware controls into a validated, enterprise-ready solution.

We’re proud to collaborate with NVIDIA as part of the Enterprise AI Factory ecosystem and look forward to expanding joint innovation and go-to-market efforts in the months ahead.

Learn more about the NVIDIA Enterprise AI Factory validated design and how F5 fits in.

About the Author

Ahmed Guetari
Vice President, Product Management – Service Provider
