Is DevOps ‘Build to Fail’ Philosophy a Security Risk?

F5 Ecosystem | February 08, 2018

Lori Mac Vittie, Distinguished Engineer and Chief Evangelist

Breaking Betteridge’s Law of Headlines, the short answer is yes. But as with all things today involving technology, the long answer is a bit more involved than that.

DevOps has become, I think, fairly pervasive across all industries. While not every organization adopts every aspect of the approach, or applies the same zealous adherence to its principles as, say, Netflix, it’s definitely a ‘thing’ that’s happening.

While not directly a proof point, when we asked the more than 3000 respondents how digital transformation was impacting their application decisions for our State of Application Delivery 2018, two of the top three answers were ‘employing automation and orchestration to IT systems and processes’ and ‘changing how we develop applications (for example, moving to agile)’. To me, both are inferred reactions to adopting at least portions of a DevOps approach to developing and delivering applications in modern architectures.

So, if orgs are adopting some of the tools and techniques related to DevOps, one might assume they’re also adopting others. One of those might even be (cue dramatic music): building to fail.

Now, that phrase is somewhat imprecise, as no one sits around designing systems to fail. What they do do, however, is design systems that are resilient to failure. That means, for example, if an instance (server) crashes, the system should be able to automatically handle the situation by removing the dead instance and starting a new one to take its place.
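As a rough illustration of that remove-and-replace behavior, here is a minimal in-memory sketch. The `Instance` and `Pool` names and the `reconcile` method are hypothetical, invented for this example; a real platform would do this through its own orchestration or autoscaling tooling.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Instance:
    """Stand-in for a server or container (hypothetical, for illustration)."""
    instance_id: int
    healthy: bool = True


@dataclass
class Pool:
    """Keeps `desired` healthy instances running."""
    desired: int
    instances: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def reconcile(self):
        """Drop dead instances and start replacements.

        Returns the IDs of the instances that were removed, so the
        caller can still follow up on why they died.
        """
        dead = [i.instance_id for i in self.instances if not i.healthy]
        self.instances = [i for i in self.instances if i.healthy]
        while len(self.instances) < self.desired:
            self.instances.append(Instance(next(self._ids)))
        return dead


pool = Pool(desired=3)
pool.reconcile()                     # initial start-up: instances 1, 2, 3
pool.instances[1].healthy = False    # simulate instance 2 crashing
removed = pool.reconcile()           # instance 2 is dropped, instance 4 starts
```

Note that `reconcile` hands back the IDs it removed rather than discarding them silently, which matters for the argument that follows.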

Voila! Built to fail.

And while this is certainly a desirable reaction – particularly when a system is under heavy load and demand – there is a risk in the approach that needs to be considered and, one hopes, subsequently addressed.

Consider the Cloudflare vulnerability of early 2017. Cloudflare – which has been admirably transparent in its own reporting of the issue – notes that, basically, the problem was a memory leak (resulting in potential data leakage) caused by a defect in an extension of an HTML parser. Long story short, a bug caused a memory leak, which caused instances to crash. Those instances were killed and restarted because, well, built to fail.

For the record, this isn’t a ‘bash Cloudflare for a bug’ post. As a developer, I am highly sympathetic to having one’s defects exposed so publicly. I am less sympathetic in situations where there’s little regard for discovering why something is crashing or leaking memory or just outright failing.

Which is the point of today’s post. Because sometimes the DevOps philosophy leaves its adherents with a laissez-faire approach to post-failure investigation.

It is perfectly reasonable to react to a service/app failure by killing and restarting the service to ensure availability – as long as you then investigate the crash to determine what caused it. Apps don’t crash for no reason. If it fell over, something pushed it. Nine times out of ten, it’s likely a non-exploitable error. Nothing to write a blog post about. But the one time it’s a serious vulnerability waiting to be exploited makes it worth the ostensibly wasted effort on the other nine. Because that is something to write a blog post about.
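One way to picture "restart, but investigate" is a supervisor that records why a service died before bringing it back. The sketch below is an assumption-laden toy: `supervise`, `flaky_service`, and the `investigated` flag are hypothetical names, and a plain Python callable stands in for a real service process.

```python
import traceback
from datetime import datetime, timezone


def supervise(service, max_restarts, crash_log):
    """Run `service`, restarting it on a crash, but record every crash first."""
    restarts = 0
    while True:
        try:
            return service()
        except Exception as exc:
            # Capture the evidence before it disappears with the restart.
            crash_log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "error": repr(exc),
                "traceback": traceback.format_exc(),
                "investigated": False,  # someone still has to look at this
            })
            if restarts >= max_restarts:
                raise  # give up restarting and surface the failure
            restarts += 1


attempts = {"n": 0}


def flaky_service():
    """Hypothetical service that crashes twice before running cleanly."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise MemoryError("simulated leak")
    return "ok"


crash_log = []
result = supervise(flaky_service, max_restarts=5, crash_log=crash_log)

# Availability was preserved, but two crashes remain open for investigation.
open_items = [c for c in crash_log if not c["investigated"]]
```

The service ends up available either way; the difference is that the crash log leaves a queue of failures for someone to work through instead of letting them vanish.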

It is not reasonable to ignore it.

Monitoring and alerting on failures and other issues is also a key component of a well-rounded DevOps program. That’s the “S” in the CAMS that make up a holistic DevOps approach: Culture, Automation, Measurement and Sharing. Damon and John (who coined the acronym back in 2010) were not just talking about pizza and beer (though that’s a good way to encourage the “C” for Culture of DevOps). It’s also about data and state of systems. It’s about ensuring that those who can benefit from knowing, know. And that includes a failure in the system.

A failure – particularly a crash – should not go unchecked. If a system in the pipeline crashes, someone should know about it and someone ought to check it out. To ignore it is a security risk. Worse, it’s an avoidable risk because it’s your environment, your systems, and your code. You have complete control, and thus no excuse to ignore it.

So yes, in a nutshell, ‘build to fail’ can expose your apps – and business – to security risks. The good news is those risks are completely manageable, if you ensure that the philosophy isn’t ‘built to fail’ on paper but ‘built to ignore a fail’ in practice.

Pay attention to things that crash – even if you restart them to keep availability high. You may save yourself (and your business) from trending on Twitter for all the wrong reasons.

Tags: DevOps, 2018
