The Art of Scaling Containers: Retries

F5 Ecosystem | January 11, 2018

Lori Mac VittieDistinguished Engineer and Chief Evangelist

Scaling containers is more than just slapping a proxy in front of a service and walking away. There’s more to scale than just distribution, and in the fast-paced world of containers there are five distinct capabilities required to ensure scale: retries, circuit breakers, discovery, distribution, and monitoring.

In this post on the art of scaling containers, we’ll dig into retries.

Retries

When you’re a kid playing a game, the concept of a retry is common to a lot of games. “Do over!” is commonly called out after a failure, in the hopes that the other players will let you try again. Sometimes they do. Sometimes they don’t. That rarely stops a kid from trying, I’ve noticed.

When scaling apps (or services if you prefer), the concept of a retry is much the same. A proxy, upon choosing a service and attempting to fulfill a request, receives an error. In basic load balancing scenarios this is typically determined by examining the HTTP response code. Anything other than a “200 OK” is an error. That includes network and TCP layer timeouts. The load balancer can either blindly return the failed response to the

requestor or, if it’s smart enough, it can retry the request in the hopes that a subsequent request will result in a successful response.

This sounds pretty basic, but in the beginning of scale there was no such thing as a retry. If it failed, it failed and we dealt with it. Usually by manually trying to reload from the browser. Eventually, proxies became smart enough to perform retries on their own, saving many a keyboard from wearing out the “CTRL” and “R” buttons.

On the surface, this is an existential example of the definition of insanity. After all, if the request failed the first time, why should we expect it to be successful the second (or the third, even?).

The answer lies in the reason for the failure. When scaling apps, it is important to understand the impact connection capacity has on failures. The load on a given resource at any given time is not fixed. Connections are constantly being opened and closed. The underlying web app platform – whether Apache or IIS, a node.js engine or a some other stack – has defined constraints in terms of capacity. It can only maintain X number of concurrent connections. When that limit is reached, attempts to open new connections will hang or fail.

If this is the cause of a failure, then in the microseconds it took for the proxy to receive a failed response a different connection may have closed, thereby opening up the opportunity for a retry to be successful.

In some cases, a failure is internal to the platform. The dreaded “500 Internal Server Error”. This status is often seen when server is not overloaded, but has made a call to another (external) service that failed to respond in time. Sometimes this is the result of a database connection pool reaching its limits. The reliance on external services can result in a cascading chain of errors that, like a connection overload, is often resolved by the time you receive the initial error.

You might also see the oh-so-helpful “503 Service Unavailable” error. This might be in response to an overload but as is the case with all HTTP error codes, they are not always good predictors of what actually is going wrong. You might see this in response to a DNS failure or in the case of IIS and ASP, a full queue. The possibilities are really quite varied. And again, they might be resolved by the time you receive the error, so a retry should definitively be your first response.

Of course you can’t just retry until the cows come home. Like the unintended consequences of TCP retransmission – which can overload networks and overwhelm receivers – retries eventually become futile. There is no hard and fast rule regarding how many times you should retry before giving up, but between 3 and 5 is common.

At that point, it’s time to send regrets to the requestor and initiate a circuit breaker. We’ll dig into that capability in our next post.

Featured Blog Posts

F5 accelerates and secures AI inference at scale with NVIDIA Cloud Partner reference architecture

Securing AI models and agents without compromise: How F5’s acquisition of CalypsoAI will deliver end-to-end AI runtime protection

Quantum ready: A practical guide to enabling PQC with F5

Tags: 2018

About the Author

Lori Mac VittieDistinguished Engineer and Chief Evangelist

More blogs by Lori Mac Vittie

Featured Blog Posts

F5 accelerates and secures AI inference at scale with NVIDIA Cloud Partner reference architecture

Securing AI models and agents without compromise: How F5’s acquisition of CalypsoAI will deliver end-to-end AI runtime protection

Quantum ready: A practical guide to enabling PQC with F5

Related Blog Posts

F5 Ecosystem | 11/17/2025

Accelerate Kubernetes and AI workloads with F5 BIG-IP and AWS EKS

The F5 BIG-IP Next for Kubernetes software will soon be available in AWS Marketplace to accelerate managed Kubernetes performance on AWS EKS.

BIG-IP,

F5 on AWS

F5 Ecosystem | 11/12/2025

The everywhere attack surface: EDR in the network is no longer optional

All endpoints can become an attacker’s entry point. That’s why your network needs true endpoint detection and response (EDR), delivered by F5 and CrowdStrike.

F5 Application Delivery and Security Platform (ADSP),

BIG-IP,

CrowdStrike,

Strategic alliance

F5 Ecosystem | 11/11/2025

F5 NGINX Gateway Fabric is a certified solution for Red Hat OpenShift

F5 collaborates with Red Hat to deliver a solution that combines the high-performance app delivery of F5 NGINX with Red Hat OpenShift’s enterprise Kubernetes capabilities.

F5 NGINX,

2025

F5 Ecosystem | 10/28/2025

F5 accelerates and secures AI inference at scale with NVIDIA Cloud Partner reference architecture

F5’s inclusion within the NVIDIA Cloud Partner (NCP) reference architecture enables secure, high-performance AI infrastructure that scales efficiently to support advanced AI workloads.

F5 Ecosystem | 08/26/2021

F5 Silverline Mitigates Record-Breaking DDoS Attacks

Malicious attacks are increasing in scale and complexity, threatening to overwhelm and breach the internal resources of businesses globally. Often, these attacks combine high-volume traffic with stealthy, low-and-slow, application-targeted attack techniques, powered by either automated botnets or human-driven tools.

Silverline Managed Services,

F5 Silverline DDoS Protection

F5 Ecosystem | 12/08/2020

Phishing Attacks Soar 220% During COVID-19 Peak as Cybercriminal Opportunism Intensifies

David Warburton, author of the F5 Labs 2020 Phishing and Fraud Report, describes how fraudsters are adapting to the pandemic and maps out the trends ahead in this video, with summary comments.

Fraud,

Phishing

The Art of Scaling Containers: Retries

Retries

About the Author

Related Blog Posts

Accelerate Kubernetes and AI workloads with F5 BIG-IP and AWS EKS

The everywhere attack surface: EDR in the network is no longer optional

F5 NGINX Gateway Fabric is a certified solution for Red Hat OpenShift

F5 accelerates and secures AI inference at scale with NVIDIA Cloud Partner reference architecture

F5 Silverline Mitigates Record-Breaking DDoS Attacks

Phishing Attacks Soar 220% During COVID-19 Peak as Cybercriminal Opportunism Intensifies

WHAT WE OFFER

RESOURCES

SUPPORT

PARTNERS

COMPANY