Scale, Secure, and Monitor AI/ML Workloads in Kubernetes with Ingress Controllers

NGINX | February 22, 2024

Ilya KrutovSenior Product Marketing Manager

AI and machine learning (AI/ML) workloads are revolutionizing how businesses operate and innovate. Kubernetes, the de facto standard for container orchestration and management, is the platform of choice for powering scalable large language model (LLM) workloads and inference models across hybrid, multi-cloud environments.

In Kubernetes, Ingress controllers play a vital role in delivering and securing containerized applications. Deployed at the edge of a Kubernetes cluster, they serve as the central point of handling communications between users and applications.

In this blog, we explore how Ingress controllers and F5 NGINX Connectivity Stack for Kubernetes can help simplify and streamline model serving, experimentation, monitoring, and security for AI/ML workloads.

Deploying AI/ML Models in Production at Scale

When deploying AI/ML models at scale, out-of-the-box Kubernetes features and capabilities can help you:

Accelerate and simplify the AI/ML application release life cycle.
Enable AI/ML workload portability across different environments.
Improve compute resource utilization efficiency and economics.
Deliver scalability and achieve production readiness.
Optimize the environment to meet business SLAs.

At the same time, organizations might face challenges with serving, experimenting, monitoring, and securing AI/ML models in production at scale:

Increasing complexity and tool sprawl makes it difficult for organizations to configure, operate, manage, automate, and troubleshoot Kubernetes environments on-premises, in the cloud, and at the edge.
Poor user experiences because of connection timeouts and errors due to dynamic events, such as pod failures and restarts, auto-scaling, and extremely high request rates.
Performance degradation, downtime, and slower and harder troubleshooting in complex Kubernetes environments due to aggregated reporting and lack of granular, real-time, and historical metrics.
Significant risk of exposure to cybersecurity threats in hybrid, multi-cloud Kubernetes environments because traditional security models are not designed to protect loosely coupled distributed applications.

Enterprise-class Ingress controllers like F5 NGINX Ingress Controller can help address these challenges. By leveraging one tool that combines Ingress controller, load balancer, and API gateway capabilities, you can achieve better uptime, protection, and visibility at scale – no matter where you run Kubernetes. In addition, it reduces complexity and operational cost.

NGINX Ingress Controller can also be tightly integrated with an industry-leading Layer 7 app protection technology from F5 that helps mitigate OWASP Top 10 cyberthreats for LLM Applications and defends AI/ML workloads from DoS attacks.

Benefits of Ingress Controllers for AI/ML Workloads

Ingress controllers can simplify and streamline deploying and running AI/ML workloads in production through the following capabilities:

Model serving – Deliver apps non-disruptively with Kubernetes-native load balancing, auto-scaling, rate limiting, and dynamic reconfiguration features.
Model experimentation – Implement blue-green and canary deployments, and A/B testing to roll out new versions and upgrades without downtime.
Model monitoring – Collect, represent, and analyze model metrics to gain better insight into app health and performance.
Model security – Configure user identity, authentication, authorization, role-based access control, and encryption capabilities to protect apps from cybersecurity threats.

NGINX Connectivity Stack for Kubernetes includes NGINX Ingress Controller and F5 NGINX App Protect to provide fast, reliable, and secure communications between Kubernetes clusters running AI/ML applications and their users – on-premises and in the cloud. It helps simplify and streamline model serving, experimentation, monitoring, and security across any Kubernetes environment, enhancing capabilities of cloud provider and pre-packaged Kubernetes offerings with higher degree of protection, availability, and observability at scale.

Get Started with NGINX Connectivity Stack for Kubernetes

NGINX offers a comprehensive set of tools and building blocks to meet your needs and enhance security, scalability, and visibility of your Kubernetes platform.

You can get started today by requesting a free 30-day trial of Connectivity Stack for Kubernetes.

Featured Blog Posts

Current trends in cloud-native technologies, platform engineering, and AI

Full lifecycle management of app and API delivery with F5 NGINX One

Simplifying Modern Application Delivery: Nine Months of F5 NGINX One

About the Author

Ilya KrutovSenior Product Marketing Manager

More blogs by Ilya Krutov

Featured Blog Posts

Current trends in cloud-native technologies, platform engineering, and AI

Full lifecycle management of app and API delivery with F5 NGINX One

Simplifying Modern Application Delivery: Nine Months of F5 NGINX One

Related Blog Posts

NGINX | 10/05/2022

Automating Certificate Management in a Kubernetes Environment

Simplify cert management by providing unique, automatically renewed and updated certificates to your endpoints.

DNS,

NGINX | 05/26/2022

Secure Your API Gateway with NGINX App Protect WAF

As monoliths move to microservices, applications are developed faster than ever. Speed is necessary to stay competitive and APIs sit at the front of these rapid modernization efforts. But the popularity of APIs for application modernization has significant implications for app security.

F5 NGINX,

Tech,

2022

NGINX | 12/09/2021

How Do I Choose? API Gateway vs. Ingress Controller vs. Service Mesh

When you need an API gateway in Kubernetes, how do you choose among API gateway vs. Ingress controller vs. service mesh? We guide you through the decision, with sample scenarios for north-south and east-west API traffic, plus use cases where an API gateway is the right tool.

F5 NGINX,

Tech,

2021

NGINX | 01/20/2021

Deploying NGINX as an API Gateway, Part 2: Protecting Backend Services

In the second post in our API gateway series, Liam shows you how to batten down the hatches on your API services. You can use rate limiting, access restrictions, request size limits, and request body validation to frustrate illegitimate or overly burdensome requests.

NGINX | 12/15/2015