Application Tracing with NGINX and NGINX Plus

NGINX | September 27, 2016

Photo: WordPress.org

Using Variables for Application Performance Management

Variables are an important and sometimes overlooked aspect of NGINX configuration. With approximately 150 variables available, there are variables to enhance every part of your configuration. In this blog post we discuss how to use NGINX variables for application tracing and application performance management (APM), with a focus on uncovering performance bottlenecks in your application. This post applies to both NGINX Open Source and NGINX Plus. For brevity we’ll refer to NGINX Plus throughout except when there is a difference between the two versions.

The Application Delivery Environment

In our sample application delivery environment, NGINX Plus is working as a reverse proxy for our application. The application itself is comprised of a web frontend behind which sit a number of microservices.

In a common deployment scenario, NGINX or NGINX Plus proxies requests from clients to an application or application server that consists of a web frontend and supporting microservices
Sample application delivery environment

Tracing Requests End‑to‑End

NGINX Plus R10 (and NGINX 1.11.0) introduces the $request_id variable, which is a randomly generated string of 32 hexadecimal characters automatically assigned to each HTTP request as it arrives (for example, 444535f9378a3dfa1b8604bc9e05a303). This deceptively simple mechanism unlocks a powerful tool for tracing and troubleshooting. By configuring NGINX Plus and all backend services to pass the $request_id value, you can trace every request end‑to‑end. This sample config is for our frontend NGINX Plus server.

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

To configure NGINX Plus for request tracing, we first define the network location of the application server in the upstream block. For simplicity we only show a single application server here, but we would normally use several, for high availability and load balancing purposes.

The application server block defines how NGINX Plus handles incoming HTTP requests. The listen directive tells NGINX Plus to listen on port 80 – the default for HTTP traffic, but a production configuration normally uses use SSL/TLS to protect data in transit.

The add_header directive sends the $request_id value back to the client as a custom header in the response. This is useful for testing and also for client applications that generate their own logs, such as mobile apps, so that a client‑side error can be matched precisely to the server logs.

Finally, the location block applies to the entire application space (/) and the proxy_pass directive simply proxies all requests to the application server. The proxy_set_header directive modifies the proxied request by adding an HTTP header that is passed to the application. In this case we create a new header called X-Request-ID and assign to it the value of the $request_id variable. So our application receives the request ID that was generated by NGINX Plus.

Logging $request_id End‑to‑End

Our goal with application tracing is to identify performance bottlenecks in the request‑processing lifecycle as part of application performance management. We do this by logging significant events during processing so we can analyze them later for unexpected or unreasonable delays.

Configuring NGINX Plus

We start by configuring the frontend NGINX Plus server to include $request_id in a custom logging format, trace, which is used for the access_trace.log file.

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

Configuring the Backend Application

Passing the Request ID to your application is all well and good, but does not actually help with application tracing unless the application does something with it. In this example we have a Python application managed by uWSGI. Let’s modify the application entry point to grab the Request ID as a logging variable.

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

Then we can modify the uWSGI configuration to include the Request ID in the standard log file.

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

With this configuration in place we are now producing log files which can be linked to a single request across numerous systems.

Log entry from NGINX:

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

Log entry from application:

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

By matching transactions against the Request ID fields, tools like Splunk and Kibana allow us to identify performance bottlenecks in your application server. For example, we can search for requests that took more than two seconds to complete. However, the default time resolution of one second in regular timestamps is insufficient for most real‑world analysis.

High‑Precision Timing

In order to accurately measure requests end‑to‑end we need timestamps with millisecond‑level precision. By including the $msec variable in log entries, we get millisecond resolution on the timestamp for each entry. Adding millisecond timestamps to our application log allows us to look for requests that took more than 200 milliseconds to complete rather than 2 seconds.

But even then we are not getting the full picture, because NGINX Plus writes the $msec timestamp only at the end of processing each request. Fortunately, there are several other NGINX Plus timing variables with millisecond precision that give us more insight into the processing itself:

  • $request_time – Full request time, starting when NGINX Plus reads the first byte from the client and ending when NGINX Plus sends the last byte of the response body
  • $upstream_connect_time – Time spent establishing a connection with the upstream server
  • $upstream_header_time – Time between establishing a connection to the upstream server and receiving the first byte of the response header
  • $upstream_response_time – Time between establishing a connection to the upstream server and receiving the last byte of the response body

For detailed information about these timing variables, see Using NGINX Logging for Application Performance Monitoring.

Let’s extend our log_format directive to include all of these high‑precision timing variables in our trace log format.

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

Using our preferred log analysis tool, we can extract variable values and perform the following calculation to see how long NGINX Plus took to process the request before connecting to the application server:

[@portabletext/react] Unknown block type "codeBlock", specify a component for it in the `components.types` prop

We can also search for the highest values of $upstream_response_time to see if they are associated with particular URIs or upstream servers. And these can then be further checked against application log entries that have the same Request ID.

Conclusion

Utilizing the new $request_id variable and some or all of the millisecond‑precision variables can provide great insight into the performance bottlenecks in your application, improving application performance management without having to resort to heavyweight agents and plug‑ins.

Try out application tracing with NGINX Plus for yourself: start your free 30‑day trial today or contact us to discuss your use cases.


Share

About the Author

Liam Crilly
Liam CrillySr Director, Product Management

More blogs by Liam Crilly

Related Blog Posts

Automating Certificate Management in a Kubernetes Environment
NGINX | 10/05/2022

Automating Certificate Management in a Kubernetes Environment

Simplify cert management by providing unique, automatically renewed and updated certificates to your endpoints.

Secure Your API Gateway with NGINX App Protect WAF
NGINX | 05/26/2022

Secure Your API Gateway with NGINX App Protect WAF

As monoliths move to microservices, applications are developed faster than ever. Speed is necessary to stay competitive and APIs sit at the front of these rapid modernization efforts. But the popularity of APIs for application modernization has significant implications for app security.

How Do I Choose? API Gateway vs. Ingress Controller vs. Service Mesh
NGINX | 12/09/2021

How Do I Choose? API Gateway vs. Ingress Controller vs. Service Mesh

When you need an API gateway in Kubernetes, how do you choose among API gateway vs. Ingress controller vs. service mesh? We guide you through the decision, with sample scenarios for north-south and east-west API traffic, plus use cases where an API gateway is the right tool.

Deploying NGINX as an API Gateway, Part 2: Protecting Backend Services
NGINX | 01/20/2021

Deploying NGINX as an API Gateway, Part 2: Protecting Backend Services

In the second post in our API gateway series, Liam shows you how to batten down the hatches on your API services. You can use rate limiting, access restrictions, request size limits, and request body validation to frustrate illegitimate or overly burdensome requests.

New Joomla Exploit CVE-2015-8562
NGINX | 12/15/2015

New Joomla Exploit CVE-2015-8562

Read about the new zero day exploit in Joomla and see the NGINX configuration for how to apply a fix in NGINX or NGINX Plus.

Why Do I See “Welcome to nginx!” on My Favorite Website?
NGINX | 01/01/2014

Why Do I See “Welcome to nginx!” on My Favorite Website?

The ‘Welcome to NGINX!’ page is presented when NGINX web server software is installed on a computer but has not finished configuring

Deliver and Secure Every App
F5 application delivery and security solutions are built to ensure that every app and API deployed anywhere is fast, available, and secure. Learn how we can partner to deliver exceptional experiences every time.
Connect With Us