NGINX Gateway Fabric, part of F5 NGINX One, delivers app, API, and model-aware routing with built-in governance and security.
AI inference workloads are unpredictable: prompts vary in size and compute intensity, which makes traditional routing inefficient. Static load balancing can overload some backends while leaving others underutilized. NGINX Gateway Fabric enables model-aware routing, directing traffic based on model type, cost/performance profiles, and runtime signals for efficient, scalable AI delivery.
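The idea behind model-aware routing can be illustrated with a minimal Python sketch. It is not how NGINX Gateway Fabric implements the feature. The `Backend` type, the `pick_backend` function, the backend names, and the use of queue depth as the runtime signal are all hypothetical, chosen only to show the difference from static load balancing.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str         # hypothetical backend identifier
    model: str        # model this backend serves
    queue_depth: int  # pending requests; a stand-in for a runtime signal


def pick_backend(requested_model: str, backends: list[Backend]) -> Backend:
    """Pick the least-loaded backend that serves the requested model."""
    candidates = [b for b in backends if b.model == requested_model]
    if not candidates:
        raise LookupError(f"no backend serves model {requested_model!r}")
    # A static balancer would ignore load; here the runtime signal decides.
    return min(candidates, key=lambda b: b.queue_depth)


# Illustrative data only; the names and numbers are made up.
BACKENDS = [
    Backend("gpu-a", "llama-large", queue_depth=7),
    Backend("gpu-b", "llama-large", queue_depth=2),
    Backend("cpu-a", "small-summarizer", queue_depth=0),
]
```

With this data, a request for `llama-large` goes to `gpu-b`, the less busy of the two backends serving that model, even though `cpu-a` is idle.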

Traditional ingress relies heavily on annotations, leading to fragmented and hard-to-manage configurations. NGINX Gateway Fabric introduces expressive, Kubernetes-native CRDs that enable consistent, reusable, and declarative traffic policies across teams and environments.
As teams scale, managing shared infrastructure becomes complex and error-prone. NGINX Gateway Fabric introduces role-oriented resources that separate platform and application concerns, enabling safe delegation while maintaining centralized governance and control.
NGINX Gateway Fabric provides a forward-looking model for traffic management, enabling teams to modernize Kubernetes connectivity while building on existing ingress patterns and operational familiarity.
NGINX Gateway Fabric delivers applications in Kubernetes using the Gateway API to manage traffic at scale. It enables platform teams to control access while application teams define routing, ensuring secure and consistent delivery to in-cluster services.
NGINX Gateway Fabric product documentation
NGINX Gateway Fabric implements the Gateway API with advanced routing, policy control, and model-aware traffic management, enhanced by NGINX Plus features for observability, security, and reliability.
Route traffic by host, path, headers, and methods.
Enable canary releases with weighted routing.
Route AI traffic by model type, version, or cost/performance profile.
Apply reusable traffic and security policies.
Separate platform and app-level responsibilities.
Monitor traffic with metrics and live insights.
Update routing instantly without reloads.
Enforce TLS, mTLS, and authentication policies.
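The weighted routing behind canary releases can be sketched in a few lines of Python. In the Gateway API, the split is declared as weights on a route's backend references. The sketch below only simulates the resulting traffic split and is not NGINX Gateway Fabric code. The function name `pick_weighted` and the service names `checkout-stable` and `checkout-canary` are hypothetical.

```python
import random


def pick_weighted(weights: dict[str, int], rng: random.Random) -> str:
    """Choose a backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical 90/10 split between a stable release and a canary.
CANARY_SPLIT = {"checkout-stable": 90, "checkout-canary": 10}


def simulate(requests: int, seed: int = 0) -> dict[str, int]:
    """Count how many simulated requests each backend receives."""
    rng = random.Random(seed)  # seeded so repeated runs match
    counts = {name: 0 for name in CANARY_SPLIT}
    for _ in range(requests):
        counts[pick_weighted(CANARY_SPLIT, rng)] += 1
    return counts
```

Calling `simulate(10_000)` sends roughly one request in ten to the canary. Raising the canary weight step by step shifts more traffic to the new version without changing client behavior.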

NGINX Gateway Fabric supports the Gateway API Inference Extension to enable smart, inference-aware routing for Kubernetes.