Once upon a time I was a security consultant. I was assigned to review the firewall configuration for a sizeable Seattle startup of about 800 employees. They were in the business of hosting websites for thousands of small businesses across the world and therefore had a somewhat complex Internet connectivity setup. I sat down and reviewed their firewall configuration, which contained several hundred rule statements. The time/date stamp on the file told me this config had been in place for a couple of months without change. I skimmed through the file to get a general idea of what was going on. Then my jaw dropped when I got to the last entry: allow all. I asked the network administrators about it and no one could remember doing this, but the employee churn in this startup had been pretty high. No telling how long this rule had been in place.
The way this particular firewall parsed its rules, the hundreds and hundreds of deny rules ahead of that final entry were ignored in its favor. So basically, all traffic was allowed, in any direction. In other words, for months, this company had no firewall in place. The rule was probably put in for diagnostic reasons during a system crisis. Some developer was probably screaming, “It’s the firewall that’s breaking things. Just open it up and see if it works!” I’ve been in similar situations myself, so I can picture how the rule got written. And then someone forgot to switch it back. Or quit soon afterward.
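To make the failure mode concrete, here’s a minimal sketch of how such a rule set behaves when the last matching rule wins, which is apparently how this device treated its rules. The rule format and evaluation logic are illustrative assumptions, not any specific vendor’s implementation:

```python
# Toy rule evaluator illustrating last-match-wins semantics. The rule format
# and evaluation order are assumptions for illustration only, not any
# particular vendor's behavior.
from dataclasses import dataclass

@dataclass
class Rule:
    action: str  # "allow" or "deny"
    src: str     # source zone; "any" matches everything
    dst: str     # destination zone; "any" matches everything

def matches(rule: Rule, src: str, dst: str) -> bool:
    return rule.src in ("any", src) and rule.dst in ("any", dst)

def evaluate(rules: list[Rule], src: str, dst: str, default: str = "deny") -> str:
    # Last matching rule wins, so every carefully written deny above is
    # overridden if a later rule also matches the same traffic.
    decision = default
    for rule in rules:
        if matches(rule, src, dst):
            decision = rule.action
    return decision

rules = [
    Rule("deny", "internet", "db-server"),
    Rule("deny", "internet", "intranet"),
    # ...hundreds more deny rules...
    Rule("allow", "any", "any"),  # the forgotten diagnostic rule
]

print(evaluate(rules, "internet", "db-server"))  # -> "allow"
```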
What effect did this have? Well, as I continued with my firewall replacement project, I did a week-long survey of all Internet network traffic. It turned out that nearly a quarter of the company’s outbound traffic was spam from compromised internal boxes. I guess someone found a good use for their sizeable Internet pipes. But more importantly, how does this kind of thing happen? And how common is it?
Let’s answer the second question first. As part of a forthcoming research report on protecting applications, we looked in depth at data breach announcements made to state attorneys general over the past 18 months. Of the breaches with known causes, 3% were attributed to access control misconfigurations. It’s a small number but not an insignificant one, especially for something that’s easily preventable. For comparison, malicious insiders were responsible for only 2% of breaches. In other words, an access control failure is more likely to cause a data breach than a malicious insider. That’s worth looking into.
What do these access control failures look like? Here’s a sample:
One of the benefits of working for a large security company like F5 is that I can ask around about what leads to most of the reported customer security incidents. Overwhelmingly, the answer I got was “partially configured firewalls.” These are devices that were put in place with the basics set up, but never completed. Some of this is the “we’ll get back to it” syndrome, where no one ever has time to get back to it. But we’re seeing attacks succeed only because defense tools that could have stopped them were never enabled. We’re seeing default configurations that haven’t been updated. We’re seeing a lack of testing of key features to ensure they function as intended in production. And, like everything else in the industry, we’re seeing some folks not updating or patching in a timely manner.
There are a number of reasons why this is happening. One is the continuing talent shortage in cybersecurity: there just aren’t enough skilled people to go around. At the same time, our networks and deployed controls are getting more widespread and more complex. So we have fewer people trying to work on more things. Worse, a lot of IT staff are doing their work manually and ad hoc, without defined processes and without automation, which leads to more errors. And as organizations expand into the cloud, they rely more heavily on cloud provider controls, often with the mistaken assumption that the provider is taking care of all aspects of security.
As I’ve said before, it’s always the simple things, but the simple things are hard. There are proven ways to eliminate IT errors, starting with change control. Firewalls, and any critical security devices, should be subject to change control. That means every change is planned, analyzed, tested, approved, documented, and verified. Then someone should audit that process to make sure every change actually follows it. This would have eliminated the firewall hole I found at that startup.
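Verification is the step that catches a forgotten rule like that one. As a rough sketch of what an automated post-change audit could look like (the file names and the one-rule-per-line format are assumptions, not a particular product’s export format), simply diffing the running configuration against the approved baseline after every change would have flagged that allow-all entry the day it appeared:

```python
# Hypothetical post-change audit: compare the running firewall rules against
# the last approved, documented baseline and flag anything unexpected.
# File names and the one-rule-per-line format are illustrative assumptions.

def load_rules(path: str) -> set[str]:
    """Read one rule per line, ignoring blanks and comments."""
    with open(path) as f:
        return {
            line.strip()
            for line in f
            if line.strip() and not line.strip().startswith("#")
        }

approved = load_rules("approved_baseline.rules")  # output of the change process
running = load_rules("running_config.rules")      # exported from the device

unapproved = running - approved  # rules no one signed off on
missing = approved - running     # approved rules that never got deployed

for rule in sorted(unapproved):
    print(f"UNAPPROVED RULE IN PRODUCTION: {rule}")
for rule in sorted(missing):
    print(f"APPROVED RULE NOT DEPLOYED: {rule}")
```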
On top of that, there should be a security standard for how access control is done. A security standard sits below a policy (a general statement of purpose) but above a procedure (a step-by-step guide to performing a task). An access control standard would spell out how devices in general should be configured and include points like:
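To give a hypothetical flavor of such points (these specifics are illustrative assumptions, not an authoritative list): end every rule set with an explicit, logged deny-all; forbid overly broad “any/any” allow rules; and require logging on every rule. Points like these also lend themselves to automated checking, as in this minimal Python sketch that assumes a simple dictionary-per-rule format:

```python
# Hypothetical checks for a few points an access control standard might require.
# The specific requirements and the rule format are illustrative assumptions,
# not a prescribed standard.
def check_standard(rules: list[dict]) -> list[str]:
    findings = []
    deny_all = {"action": "deny", "src": "any", "dst": "any", "log": True}
    if not rules or rules[-1] != deny_all:
        findings.append("Rule set does not end with an explicit, logged deny-all.")
    for i, rule in enumerate(rules):
        if rule["action"] == "allow" and rule["src"] == "any" and rule["dst"] == "any":
            findings.append(f"Rule {i}: overly broad allow any/any.")
        if not rule.get("log", False):
            findings.append(f"Rule {i}: logging not enabled.")
    return findings

# The forgotten allow-all rule from the story would trip multiple findings here.
rules = [
    {"action": "deny", "src": "internet", "dst": "db-server", "log": True},
    {"action": "allow", "src": "any", "dst": "any", "log": False},
]
for finding in check_standard(rules):
    print(finding)
```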
Another way to reduce errors is to only implement the controls you need to stop the most likely, most damaging threats. This means picking defensive technology that is flexible and can reduce the highest risks. I’ve seen environments where the controls in use are a dog’s breakfast of whatever the last few auditors recommended on top of a stack of legacy controls. Controls should be laser-focused on the current significant risks to the organization. You can also reduce errors with controls that support automation and integration. Ideally, you’d have a single pane of glass to monitor and manage all your controls.
Lastly, if this is all too much for your team to manage, you can look at outsourcing security controls. Since managing controls is the entire job of a SaaS security vendor, you can bet they have robust and well-tested processes in place to reduce operational errors.