Scale, Secure, and Monitor AI/ML Workloads in Kubernetes with Ingress Controllers

Ilya Krutov Thumbnail
Ilya Krutov
Published February 22, 2024

AI and machine learning (AI/ML) workloads are revolutionizing how businesses operate and innovate. Kubernetes, the de facto standard for container orchestration and management, is the platform of choice for powering scalable large language model (LLM) workloads and inference models across hybrid, multi-cloud environments.

In Kubernetes, Ingress controllers play a vital role in delivering and securing containerized applications. Deployed at the edge of a Kubernetes cluster, they serve as the central point of handling communications between users and applications.

In this blog, we explore how Ingress controllers and F5 NGINX Connectivity Stack for Kubernetes can help simplify and streamline model serving, experimentation, monitoring, and security for AI/ML workloads.

Deploying AI/ML Models in Production at Scale

When deploying AI/ML models at scale, out-of-the-box Kubernetes features and capabilities can help you:

  • Accelerate and simplify the AI/ML application release life cycle.
  • Enable AI/ML workload portability across different environments.
  • Improve compute resource utilization efficiency and economics.
  • Deliver scalability and achieve production readiness.
  • Optimize the environment to meet business SLAs.

At the same time, organizations might face challenges with serving, experimenting, monitoring, and securing AI/ML models in production at scale:

  • Increasing complexity and tool sprawl makes it difficult for organizations to configure, operate, manage, automate, and troubleshoot Kubernetes environments on-premises, in the cloud, and at the edge.
  • Poor user experiences because of connection timeouts and errors due to dynamic events, such as pod failures and restarts, auto-scaling, and extremely high request rates.
  • Performance degradation, downtime, and slower and harder troubleshooting in complex Kubernetes environments due to aggregated reporting and lack of granular, real-time, and historical metrics.
  • Significant risk of exposure to cybersecurity threats in hybrid, multi-cloud Kubernetes environments because traditional security models are not designed to protect loosely coupled distributed applications.

Enterprise-class Ingress controllers like F5 NGINX Ingress Controller can help address these challenges. By leveraging one tool that combines Ingress controller, load balancer, and API gateway capabilities, you can achieve better uptime, protection, and visibility at scale – no matter where you run Kubernetes. In addition, it reduces complexity and operational cost.

Diagram of NGINX Ingress Controller ecosystem

NGINX Ingress Controller can also be tightly integrated with an industry-leading Layer 7 app protection technology from F5 that helps mitigate OWASP Top 10 cyberthreats for LLM Applications and defends AI/ML workloads from DoS attacks.

Benefits of Ingress Controllers for AI/ML Workloads

Ingress controllers can simplify and streamline deploying and running AI/ML workloads in production through the following capabilities:

  • Model serving – Deliver apps non-disruptively with Kubernetes-native load balancing, auto-scaling, rate limiting, and dynamic reconfiguration features.
  • Model experimentation – Implement blue-green and canary deployments, and A/B testing to roll out new versions and upgrades without downtime.
  • Model monitoring – Collect, represent, and analyze model metrics to gain better insight into app health and performance.
  • Model security – Configure user identity, authentication, authorization, role-based access control, and encryption capabilities to protect apps from cybersecurity threats.

NGINX Connectivity Stack for Kubernetes includes NGINX Ingress Controller and F5 NGINX App Protect to provide fast, reliable, and secure communications between Kubernetes clusters running AI/ML applications and their users – on-premises and in the cloud. It helps simplify and streamline model serving, experimentation, monitoring, and security across any Kubernetes environment, enhancing capabilities of cloud provider and pre-packaged Kubernetes offerings with higher degree of protection, availability, and observability at scale.

Get Started with NGINX Connectivity Stack for Kubernetes

NGINX offers a comprehensive set of tools and building blocks to meet your needs and enhance security, scalability, and visibility of your Kubernetes platform.

You can get started today by requesting a free 30-day trial of Connectivity Stack for Kubernetes.

"This blog post may reference products that are no longer available and/or no longer supported. For the most current information about available F5 NGINX products and solutions, explore our NGINX product family. NGINX is now part of F5. All previous links will redirect to similar NGINX content on"