In this blog series, I explore the challenges facing the information security practitioner, discussing how technical evolution simultaneously creates new problems and offers new techniques for solving them. I begin with an academic question:
Can humans create a system so large that the problems surrounding it are not solvable at human scale?
There is a group of paradoxes surrounding the concept of omnipotence, generally of the form, “Can an omnipotent being create a thing so large that the being cannot manipulate the thing?” This is fun for philosophers and theologians, and a nice segue into what we are exploring here: the fact that, as non-omnipotent humans, we have most certainly created things that we cannot possibly manage at human scale, namely computer networks.
From a cyber security viewpoint, the problem is simple: you need to know what’s happening on your systems and your networks in order to prevent, detect, and eradicate actions and operations you don’t want. For most systems, most of the time, the activities are 99.999% benign and you really don’t need to capture and know the details. On our modern 10G network, generating (let’s say) 300 TB of data per day, maybe 0.00001% of the traffic, or 30 million bytes, is nefarious badness. So, you don’t need those 300 TB of packets, you say? Just run that 300 TB of data past an intrusion detection system (IDS), have it tell you what’s wrong, and you’re good. I (and most of you who have operated large networks for any period of time) would say that’s about 300 TB of nope. The problem is that even a well-tuned IDS at that scale will generate thousands of alerts per day, each with a fairly cryptic-looking signature description and (here’s the really big problem) not enough detail for a human analyst to figure out what actually happened. In a series of increasingly technical postings, I’m going to discuss the scope of this problem, why we have to solve it (thus preventing Skynet), and some tools, techniques, and architectures we can use to deal with it all.
When I started working with computers in the early 1980s, we transmitted information at the relatively sedate pace of 10 million bits per second (about 100 times slower than typical modern non-core networks) on carefully shielded coax cables. An entire day’s worth of traffic on such a network (which did not include high capacity switch backplanes) would not even make a dent in a modern thumb drive.
Modern networks operate at much higher speeds, generally 1G to desktops (occasionally higher) with 10-40G backbones and switches that are capable of handling multiples of their port speeds; so, several 1G or 10G conversations can happen across a switch at once. A single 10G stream of data generates over 100 TB of data per day. If you assume a segmented or even switched network that allows for multiple 10G streams happening simultaneously, the numbers go from really big to really, really big very fast.
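To make the arithmetic behind these numbers concrete, here is a minimal sketch that converts a link speed into a daily data volume. It assumes decimal units (1 TB = 10^12 bytes) and a link running flat out all day unless a lower utilization is passed in; the function name `daily_volume_tb` and the `utilization` parameter are mine, purely for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(link_gbps: float, utilization: float = 1.0) -> float:
    """Terabytes (10**12 bytes) carried in one day by a single link.

    link_gbps   -- link speed in gigabits per second (10**9 bits/s)
    utilization -- assumed average load, 1.0 meaning full line rate all day
    """
    bits_per_day = link_gbps * 1e9 * utilization * SECONDS_PER_DAY
    return bits_per_day / 8 / 1e12  # bits -> bytes -> terabytes


if __name__ == "__main__":
    # 0.01 Gb/s is the 10 Mb/s coax network of the early 1980s.
    for gbps in (0.01, 1, 10, 40):
        print(f"{gbps:>5} Gb/s -> {daily_volume_tb(gbps):8.2f} TB/day")
```

At full line rate a single 10G stream works out to 108 TB per day, which is where the "over 100 TB" figure comes from. Real links rarely run saturated around the clock, so treat this as an upper bound per stream and multiply by however many streams your switches can carry at once.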
The bad guys depend on hiding in the noise. Much like a chameleon that actively tries to look like its innocuous surroundings, threat actors hide in the enormous volume of events. The challenge in detecting threats becomes not just seeing the lizard on a rock; it’s seeing the rock-colored lizard on a mountain of rocks on planet rock. So, figuratively speaking, we use image processing techniques to look for lizard eyeballs in the vast swath of rock we care about, and those detections will all be lizards… except for the lichen that looks like eyes, the two bugs that look like eyes, the billboard with a lizard on it, etc. To get to the real truth, we need context. And to interpret that context, in many cases, we need humans.
Getting away from the lizards for a minute, context in network activities is all about what actually traverses the net, using both the raw data and the metadata sources available to us. Due to the enormous volumes of raw data, most organizations rely almost entirely on metadata or summarized information from sources like NetFlow, sFlow, or IPFIX. While these are excellent, informative sources at the high level for what actually occurred on a network, they often won’t let you determine with any degree of certainty whether it was a chameleon or a billboard picture of a chameleon. For final analysis, only the complete raw data stream is forensically sound and reliable. To illustrate this, I’ll use an example from Critical Informatics’ managed detection and response experience.
In this example, Critical Informatics analysts were alerted by various means that web activity at a client’s desktop was suspicious to the point of being potential criminal activity. In fact, the IDS signature correctly categorized web activity as associated with a criminal website with content that would implicate the desktop user in a pretty awful way. The metadata (website, IP address, offending desktop, etc.) was accurate. The site was verified to be what the IDS thought it was. And, the connection actually happened. Upon further investigation by human analysts, however, using the raw data (packet capture), it became clear that the web connection was an IFrame, inserted as ad content into a perfectly okay web search. (The most likely cause for the IFrame insertion and damning content was to actively extort the end user based on the “You downloaded X and I can prove it; I’ll take it to law enforcement unless you pay me…” scheme.) This was an innocent user, implicated by metadata with some pretty damning evidence. The only reason this innocent person was not escorted out of the door, potentially in handcuffs, was that human analysts with access to the raw packet capture were able to fully understand what happened. Every other bit of evidence pointed to that person being a real problem.
It’s actually quite easy to imagine other scenarios where a similar story plays out. Something looks bad but isn’t, or looks fine but is actually bad.
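The gap between metadata and raw data can be sketched in a few lines. The example below is hypothetical from top to bottom: the `FlowSummary` class is a stand-in for the kind of fields a flow record carries, not the actual NetFlow or IPFIX format, the addresses and hostnames are reserved documentation values, and the traffic is assumed to be unencrypted HTTP. It is not a reconstruction of what the analysts in the story did; it only shows why two very different events can look the same once the payload is thrown away.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowSummary:
    """Hypothetical flow-style metadata: who talked to whom, and how much."""
    src_ip: str
    dst_ip: str
    dst_port: int
    byte_count: int


def summarize(src_ip: str, dst_ip: str, dst_port: int, payload: bytes) -> FlowSummary:
    """Reduce a captured request to metadata, discarding the payload."""
    return FlowSummary(src_ip, dst_ip, dst_port, len(payload))


def request_headers(payload: bytes) -> dict:
    """Pull the header fields out of a raw HTTP/1.x request."""
    head = payload.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


# Two made-up requests to the same site from the same desktop.
TYPED_BY_USER = (b"GET / HTTP/1.1\r\n"
                 b"Host: bad-site.example\r\n"
                 b"Sec-Fetch-Dest: document\r\n\r\n")
EMBEDDED_BY_AD = (b"GET / HTTP/1.1\r\n"
                  b"Host: bad-site.example\r\n"
                  b"Referer: https://search.example/results\r\n"
                  b"Sec-Fetch-Dest: iframe\r\n\r\n")

if __name__ == "__main__":
    for label, payload in (("typed", TYPED_BY_USER), ("embedded", EMBEDDED_BY_AD)):
        flow = summarize("192.0.2.10", "198.51.100.7", 80, payload)
        print(label, flow, request_headers(payload))
```

Both requests produce a summary with the same source, destination, and port; only the byte count differs, and that tells you nothing about intent. The distinguishing clues (here, a `Referer` pointing at a search page and a `Sec-Fetch-Dest` of `iframe`) live only in the raw bytes, and even those are hints for a human analyst to weigh, not proof.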
So, apparently, it’s hopeless… We often need packet-capture level detail for investigations that enable good incident response decisions, but there’s not enough disk in the world for all of us to capture that amount of detail. Nor is there enough time for us humans, who are slow in comparison to computers, to get a look at the apparent lizard eyeballs and tell the incident response team that it’s lichen. And I haven’t even started to discuss the fact that more and more of the content is hidden in SSL/TLS streams. More on that later.
Do not despair! It turns out that with some really great tools, infrastructure, computer science stuff, and specific processes, it is possible to capture just enough of what we need to make great decisions about what’s actually happening on our networks—without creating a giant black hole of stored packet capture on disk. Hopefully, this post has reasonably explained the scope of the problem and the real need to address this non-human scale data, partly by using humans more effectively. As we move further into this blog series, we will discuss how to use existing technologies and techniques to consistently and accurately identify the lizard and not go chasing after lizard-colored lichen.