Once upon a time I was a security consultant. I was assigned to review the firewall configuration for a sizeable Seattle startup of about 800 employees. They were in the business of hosting websites for thousands of small businesses across the world and therefore had a somewhat complex Internet connectivity setup. I sat down and reviewed their firewall configuration, which contained several hundred rule statements. The time/date stamp on the file told me this config had been in place for a couple of months without change. I skimmed through the file to get a general idea of what was going on. Then my jaw dropped when I got to the last entry: allow all. I asked the network administrators about it and no one could remember doing this, but the employee churn in this startup had been pretty high. No telling how long this rule had been in place.
The way this particular firewall parsed rules meant that the hundreds and hundreds of deny rules preceding it were ignored in favor of this final rule. So basically, all traffic was allowed in any direction. In other words, for months, this company had no firewall at all. The rule was probably put in for diagnostic reasons during a system crisis, with some developer screaming, “It’s the firewall that’s breaking things. Just open it up and see if it works!” I’ve been in similar situations myself, so I can imagine how the rule got written. Then someone forgot to switch it back. Or quit soon afterward.
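To see why one trailing rule can nullify an entire ruleset, here is a minimal sketch of last-match-wins evaluation, where the final matching rule decides the verdict. The rule set and helper names are hypothetical illustrations, not any vendor’s syntax:

```python
# Toy last-match-wins firewall evaluator (hypothetical, for illustration).
# The action of the LAST rule that matches the packet wins.

def evaluate(rules, packet):
    """Return the action of the last matching rule, default deny."""
    verdict = "deny"  # implicit default if nothing matches
    for action, predicate in rules:
        if predicate(packet):
            verdict = action  # a later match overrides earlier ones
    return verdict

rules = [
    ("deny", lambda p: p["port"] == 23),             # block telnet
    ("deny", lambda p: p["src"].startswith("10.")),  # block an internal range
    ("allow", lambda p: True),                       # the forgotten allow-all
]

# The allow-all matches every packet last, so every deny above it is moot.
print(evaluate(rules, {"src": "10.1.2.3", "port": 23}))  # allow
```

Drop the final rule and the same packet is denied, which is exactly why a diagnostic “just open it up” rule left in place erases months of careful rule-writing.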
What effect did this have? Well, as I continued with my firewall replacement project, I did a week-long survey of all Internet network traffic. It turned out that nearly a quarter of the company’s outbound traffic was spam from compromised internal boxes. I guess someone found a good use for their sizeable Internet pipes. But more importantly, how does this kind of thing happen? And how common is it?
Misconfigurations Cause Breaches
Let’s answer the second question first. As part of a forthcoming research report on protecting applications, we looked in depth at data breach announcements made to state Attorneys General over the past 18 months. Out of breaches with known causes, 3% were attributed to access control misconfigurations. It’s a small number but not insignificant, especially for something that’s easily preventable. By comparison, insiders were responsible for only 2% of breaches. An access control failure is more likely to cause a data breach than a malicious insider. This is worth looking into.
What do these access control failures look like? Here’s a sample of some of them:
- Global determined on February 23, 2018, that a database containing information related to current and former Global students was misconfigured and accessible to the Internet…
Global University — Announced May 2018
- In late 2017, a third-party vendor configured a server and left Company customer information exposed on the internet for some period of time.
Rx Valet — Announced May 2018
- One of [MedWatch’s] vendors unintentionally misconfigured MedWatch’s online portal during a routine update, which allowed some internet search engines to potentially make certain information accessible on the internet…
MedWatch — Announced April 2018
- Meepos determined that an unauthorized actor or actors gained access to certain parts of Meepos’s network due to a misconfiguration of our two-factor password authentication…
Meepos — Announced July 2017
- The shared file server maintained by the Graduate School of Business (GSB) was accessible to GSB faculty, staff and students. The permissions on these folders were incorrectly changed around September 2016.
Stanford University — Announced October 2017
More Firewall Fails
One of the benefits of working for a large security company like F5 is that I can ask around about what leads to most of the reported customer security incidents. Overwhelmingly, the answer was “partially configured firewalls.” These are devices that were put in place with the basics set up but never completed. This could be the “we’ll get back to it” syndrome, where no one ever has time to get back to it. But we’re seeing attacks succeed only because defense tools that could have stopped them haven’t been enabled. We’re seeing default configurations that haven’t been updated. We’re seeing a lack of testing on key features to ensure they function as intended in production environments. And, like everything else in the industry, we’re seeing some folks not updating or patching in a timely manner.
What Is Going On?
There are a number of reasons why this is happening. One is the continuing talent shortage in cybersecurity. There just aren’t enough talented folks to go around. Our networks and deployed controls are also getting more widespread and more complex. So, we have fewer people trying to work on more things. Worse, a lot of IT staff are doing their work manually in an ad hoc manner, without defined process and without automation, which leads to more errors. Also, as organizations expand into the cloud, they rely more on cloud provider controls, often with the mistaken assumption that the cloud provider is taking care of all aspects of security.
Fix the Process and Close the Gap
As I’ve said before, it’s always the simple things, but the simple things are hard. There are proven ways to eliminate IT errors, starting with change control. Firewalls and any critical security devices should be subject to change control. This means all changes are planned, analyzed, tested, approved, documented, and verified. Then someone should audit that process to ensure that every change follows it. This would have caught the firewall hole that I found at the startup company.
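The verification step can even be automated. Here is a hedged sketch, with made-up file contents and helper names, of checking that a deployed firewall config still matches the approved, change-controlled baseline byte for byte:

```python
# Sketch: detect drift between a deployed config and its approved baseline.
# The config strings and function names here are hypothetical illustrations.
import hashlib

def fingerprint(text: str) -> str:
    """SHA-256 digest of a config file's contents."""
    return hashlib.sha256(text.encode()).hexdigest()

def verify_config(deployed: str, approved: str) -> bool:
    """True only if the running config exactly matches the approved one."""
    return fingerprint(deployed) == fingerprint(approved)

approved = "deny all\nallow tcp 203.0.113.10 443\n"
drifted = approved + "allow all\n"  # the change no one remembers making

print(verify_config(approved, approved))  # True
print(verify_config(drifted, approved))   # False
```

A scheduled job running a check like this would have flagged the startup’s allow-all rule within a day instead of letting it sit for months.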
On top of that, there should be a security standard for how access control should be done. A security standard is below a policy (a general statement of purpose) but above a procedure (a step-by-step guide to performing a task). An access control standard would spell out how devices in general should be configured and include points like:
- Default deny, least privilege should be used for access control. This means, wherever possible, inbound firewall rules should be restricted to known, trustworthy IP addresses.
- All default settings should be reviewed and changed; hardening procedures will be used.
- All security features should be reviewed for effectiveness against expected threats.
- Security signatures should be set up in a whitelist manner. Don’t block known bad; allow only known good. In some cases, this can mean all intrusion signatures are enabled except the known, documented, and approved false positives.
- All security administrators will have proper training in the usage of the device.
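A standard like the one above is also checkable by machine. The following is a minimal sketch of auditing a rule list against two of its points, default deny and no overly broad allows; the `(action, source, port)` rule format is a made-up illustration, not any product’s syntax:

```python
# Sketch: lint a hypothetical rule list against an access control standard.
# Flags allow-from-anywhere rules and a missing explicit final default deny.

def audit(rules):
    """Return a list of human-readable findings for a rule list."""
    findings = []
    for i, (action, src, port) in enumerate(rules, 1):
        if action == "allow" and src == "0.0.0.0/0":
            findings.append(f"rule {i}: allow from any source violates default deny")
    if not rules or rules[-1][0] != "deny":
        findings.append("last rule is not an explicit default deny")
    return findings

rules = [
    ("allow", "203.0.113.0/24", 443),
    ("allow", "0.0.0.0/0", 25),      # too broad: open to the whole internet
    ("deny",  "0.0.0.0/0", "any"),   # explicit default deny
]

for finding in audit(rules):
    print(finding)
```

Run against the ruleset above, the audit flags only rule 2; strip the final deny and it flags that too. Even a crude check like this turns the standard from a document into something enforced.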
Another way to reduce errors is to only implement the controls you need to stop the most likely, most damaging threats. This means picking defensive technology that is flexible and can reduce the highest risks. I’ve seen environments where the controls in use are a dog’s breakfast of whatever the last few auditors recommended on top of a stack of legacy controls. Controls should be laser-focused on the current significant risks to the organization. You can also reduce errors with controls that support automation and integration. Ideally, you’d have a single pane of glass to monitor and manage all your controls.
Lastly, if this is all too much for your team to manage, you can look at outsourcing security controls. Since managing controls is the entire job of a SaaS security vendor, you can bet they have robust and well-tested processes in place to reduce operational errors.
More to come in our Application Protection Report July 25th
The perspective for this article actually came from some breach analysis done by Whatcom Community College for an upcoming security flagship report. Accidents cause a significant number of breaches, so we decided to explore that here, but more in-depth analysis and findings are coming in our Application Protection Report, here on F5 Labs, on July 25, 2018!