The term 'cloud-scale' is often tossed around blithely. It's used in marketing a lot to imply REALLY BIG scale as opposed to, I suppose, traditional not-as-big-but-still-significant scale.
But just as there are kernels of truth hidden in myths and legends, there is some truth to the way in which the term cloud-scale is thrown around.
The truth is that cloud—and now containers—have changed the underlying foundation of scalability. This change is rooted in the fundamental differences between vertical and horizontal scaling strategies. For the first decade of the century we operated almost exclusively on the notion that vertical scale was the best way to achieve the speeds and feeds we needed. That meant more bandwidth, more compute, and more memory. More ports. Greater density. Faster processing.
But with the advent of the cloud era the focus flipped to horizontal scale. We still need more bandwidth and compute and processing power, but we've learned how to distribute that need. We still need more hardware, we just assemble it from multiple sources rather than a single, monolithic entity.
It's the way resources are assembled that changes the game. And make no mistake, the game has changed thanks to containers.
Scale today depends on the control plane. The speed of the API used to launch and decommission resources is perhaps more important than the speed of the load balancing service itself. The speed of service discovery in an environment where resources are launched and retired in a matter of minutes becomes paramount to delivering requests to an available instance.
More than half (52%) of containers have a lifespan of less than five minutes according to the Sysdig Container Usage Report 2019:
Nearly half (42%) operate between 201 and 500 container instances. To maintain accuracy, the control plane is updating components frequently. Far more frequently than even cloud, and certainly far more frequently than every seen with monolithic applications.
The speed with which the ingress controller (the load balancing mechanism) is updated to reflect the current set of available resources, then, becomes a critical capability. Because if it’s wrong, a consumer request could be directed to a resource that no longer exists—or is hosting a completely different service. Either way, the result is a longer response as the request is redirected to a resource that is available. The consumer must wait longer for a response and may choose to walk away instead.
All this points to the speed of the control plane as a blocking factor in the scalability of applications deployed in a containerized environment.
Ultimately this means the scale of the control plane is an issue. The design of the API is an issue. The mechanisms by which requests and updates to the API are authenticated and authorized is an issue.
The control plane is on center stage when it comes to scalability today. A robust, scalable control plane is not a nice to have. It's an RFC MUST today.