APIs and the New-Old Problem of Visibility
So far in our 2019 Application Protection Research Series,1 we have explored reconnaissance campaigns directed against PHP, looked at causes for known breaches, and mapped specific attack techniques to different industries, business models, and architecture. We’ve looked into the details of the two most successful attack types, injection and access attacks, unpacked why they work, and explained how you can avoid being a victim of these types of attacks.
In this episode we investigate a component of web architecture whose importance is growing dramatically, and explore the implications that the use and configuration of this component has on protecting (or failing to protect) apps. Application Programming Interfaces (APIs) are processes and structures through which applications (or sub-application modules or functions) can connect to one another and communicate autonomously. While APIs have been around for a long time, the growth of serverless architectures, containers, and DevOps have made them absolutely critical for contemporary web operations. You might say that web applications have shifted from being blankets to being quilts, and APIs are the stitching that holds the quilts together.
APIs, then, are hard to categorize. They serve a combined access control and data translation role, coordinating disparate and distributed functions behind the scenes to present the user with a unified application service. From a security standpoint, this is what really matters: the user experience is that of one app. In other words, APIs raise the same fundamental issue of visibility that we discussed in the third episode.
Last year we released a follow-up to our 2018 Application Protection Report that focused on APIs. We highlighted two patterns of API use that were implicated in API breaches: 1) large platforms with numerous third-party integrations, and 2) mobile apps. Most of these attacks tended to follow the same form: access or injection attacks against the API authentication surface followed by exploitation of excessive permissions. In other words, APIs tend to be compromised in ways similar to breaches of other web applications, but because they are both increasingly important and hidden from view, they arguably represent a bigger risk to the business than other assets.2
As a result of this increased risk, we are devoting an entire episode to protecting APIs (and the applications they stitch together) to help the security industry keep pace with the change in the rest of the tech industry.
API Usage
APIs do not serve in an exclusively structural role to make applications work; they can transform an organization’s business model by directly generating revenue. In 2015, the Harvard Business Review reported that 90% of Expedia’s revenue came through APIs. Ebay and Salesforce also generated much of their revenue this way, reporting 60% and 50% API-driven revenue, respectively.3 As a result of the growth of APIs, organizations are recognizing new opportunities to generate traffic and revenue, often using existing components of their environments with minimal modification.
F5 Networks’ 2019 State of Application Services (SOAS) report found that of 2000 technology professionals who responded to the survey, 42% had already deployed API gateways, with 27% planning to deploy gateways in 2019.4 The API information portal and community forum programmableweb.com listed 12,000 APIs in their directory in 2015, and in early 2019 listed nearly 22,000, illustrating the rapid growth in their use.5
What Is an API?
In the simplest possible terms, an API is a user interface for other apps instead of users. (Technically, it is a standard to create a user interface for other apps). An API gateway is a piece of lightweight software, running on an application server, that creates a connection point through which other app services or mobile apps can push or pull data.6 Ideally, API gateways should have some form of access control, and they typically support a range of authentication, authorization, and federation services such as OAuth, JWT, OpenID Connect, and SAML. The API gateway authenticates the requesting device and validates that its request or payload is structured correctly for the software that sits behind the API. This allows other applications to use the output of the original service in a different way, in a different app, or in a different environment, without having to recreate the original service from scratch. This is what makes APIs such a great way to scale and embed functionality into other apps.
Google Earth provides an illustrative example of how APIs work. Most of us have run Google Earth, either through a desktop client or through a browser, to find satellite imagery of our own home, think briefly about how that’s kind of creepy, then go fwooosh around the globe. That’s the normal graphical user interface (GUI). But many of us have also used the mobile app, which pulls data from Google Earth servers through an API. In fact, many more apps use the Google Earth Engine API to achieve more specific goals, such as visualizing and measuring change on the surface of the Earth, than Google Earth itself does.7 One of my favorite Google Earth API tools is the Streetview Zombie Apocalypse simulator, which allows you to see how your neighborhood would fare if the dead walked.8 Google has even released an Earth Engine API dedicated exclusively to the user interface, enabling other app developers to create new UIs without having to reinvent the wheel on the backend.9
Types of APIs
APIs differ in their architectural styles, which in turn affects the format of the machine-machine interactions and how those APIs are deployed. API architecture is a complex topic that exceeds the scope of this article, but it is important to know a few terms and concepts:
REST APIs
REST stands for REpresentational State Transfer. REST is an architectural style (as opposed to a strict standard or protocol) that defines how computers interact with one another on the Internet. REST APIs are transactional in nature, receiving requests and providing responses on demand. They use standard HTTP methods (that is, GET, POST, PUT, PATCH, DELETE), require a specific location to which they can connect (that is, a URI).10 They also need to define data model standards to ensure that data are structured correctly. REST APIs make up the bulk of currently deployed APIs, having replaced most of the earlier generation of Remote Procedure Call (RPC) architectures, such as SOAP and XML-RPC.11
The fact that REST APIs use the standard HTTP methods is a big part of why they are so common. This standard allows them to be lightweight and straightforward to create and deploy, which means that they can be used across a wide variety of devices, even those with minimal processing power. For instance, REST APIs can allow an Internet of Things (IoT) device like a networked video security camera, which does not have a lot of computer power onboard, to work in a remote setting. A user connects to the camera through an app on the phone to see what happened overnight (or to check on cats/kids). On the backend, the mobile app for the security camera would be sending an API pull request to the camera, which would return a still photo across the web (we’ll talk about video data across APIs below).
RESTful APIs can be described in a format called OpenAPI 2.0 (formerly Swagger) that defines authentication modes, servers used, methods accepted, and other expected behavior. This is especially handy when implementing API security services.12
Real-time APIs
In contrast to the transactional model of REST APIs, real-time APIs offer a continuous feed of data. This can be advantageous in specific scenarios that are time-sensitive, such as for financial data, web chatting, and, in the example of our IoT webcam, sending out a stream of video instead of still photographs. Video information is time-sensitive because it needs to be reassembled from individual packets in the order in which they were taken before it can be viewed at the destination, and so streaming video is a great use case for real-time APIs.
Others
There are other kinds of APIs, such as native APIs embedded in systems for specific data output, or others that are only accessible through software development kits (SDKs). These are less common, and the differences are not relevant from a security standpoint, so we’ll leave it at that for now.
What Makes APIs a Good Target?
There is a combination of factors that make APIs such rich targets. One of the biggest issues is overly broad permissions. Because they are not intended for human use, APIs are often set up to access any data within the application environment. Permissions are usually set up for the user making the original request, and these permissions are, in turn, passed to the API. That is all well and good, until an attack bypasses the user authentication process, going directly to the downstream app. Because the API has unrestricted access, attacks through the API provide attackers with visibility into everything.
Like basic web requests, API calls incorporate URIs, methods, headers, and other parameters. All of these can be abused in an attack. In fact, most typical web attacks, such as injection, credential brute force, parameter tampering, and session snooping work surprisingly well. To attackers, APIs are an easy target for all their old, dirty tricks.
Another issue is visibility. As an industry, we haven’t often maintained a lot of situational awareness of APIs. They are supposed to run behind the scenes; this is great until they are compromised behind the scenes, and all of our valuables get stolen behind the scenes. As we noted in our API follow-up to last year’s report, APIs often connect to ports other than 80/443, they are frequently buried in deep paths somewhere on web servers, and the details of their architecture are often known only by development teams. All of this adds up to the reality that security teams may be unaware that connections with that potential impact are even possible in their environment.