Apache Arrow and OpenTelemetry: How open source fuels observability

F5 Ecosystem | April 19, 2023

There are fifteen gazillion statistics about how pervasive open source software is across enterprises in every industry. Apps are composed of more than 80% open source components, and the Internet basically runs on the open source software, NGINX.

But there are just as many open standards: standards developed and polished using an open source, community approach that yields incredible ecosystems of supporting products, projects, and infrastructure.

OpenTelemetry is one of those efforts, and it has become the standard for generating, ingesting, and processing operational data, a.k.a. telemetry. Nearly one-third (32%) of respondents to the Observability Innovation Report 2023 indicate that “OpenTelemetry support is required and 50% say it is very important in vendor products. Slightly more than one-third (36%) of respondents use OpenTelemetry within their organization.”

Standardizing telemetry is critical because observability relies on data points from the entire IT stack. That means network metrics, server logs, and traces, all of which come from vastly different types of infrastructure and systems. There is no single source of truth because even a simple application has too many moving parts to guarantee you can gather all the data you need to observe the state of the app at a given point in time. Standardizing the way telemetry is generated is one way to normalize digital signals and ensure analysis can leverage all the appropriate data points to deliver accurate, actionable insights.

But even standardizing telemetry does not solve all the challenges associated with reaching the holy grail of full-stack observability.

One of the big hairy problems of dealing with operational data is its volume. The digital signals organizations rely on to keep them abreast of potential problems with performance or attempted attacks are generated faster and more furiously than any other kind of data. We know this firsthand because at F5, we have adopted OpenTelemetry as the standard across our portfolio. The nature and role of our products, like BIG-IP and NGINX, in delivering and securing applications and digital services means that significant volumes of data, such as metrics and logs, are generated for a variety of reasons. Transporting and processing that data accounts for a significant portion of the cost associated with telemetry pipelines.

To address that challenge, Distinguished Engineer Laurent Quérel got involved with Apache Arrow and began working with the OpenTelemetry project to increase its efficiency with high telemetry volumes.

Our benchmark results show Apache Arrow provides significant advantages for transporting and processing telemetry data, particularly when it can be grouped into batches of several hundred entities or more. The columnar organization of the data enhances compressibility, and this memory layout greatly improves processing speed by optimizing the use of various cache levels and SIMD instructions. Furthermore, the Arrow ecosystem serves as an excellent complement to OpenTelemetry, enhancing its integration with query engines, stream processing pipelines, and specialized analytics file formats.
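To make the "columnar" idea a little more concrete, here is a minimal sketch in plain Python. It is not the Apache Arrow library and not the OpenTelemetry protocol; it only illustrates the general shape of the technique: take row-shaped metric records, group them into batches of a few hundred, and store each field as its own typed, contiguous column, with repeated strings replaced by small integer codes (dictionary encoding). The record fields, metric names, and function names are all hypothetical and chosen for illustration.

```python
from array import array

# Hypothetical metric names, for illustration only.
NAMES = ["cpu.usage", "mem.usage", "http.requests"]

def to_columnar(records):
    """Pivot row-shaped records into one typed column per field."""
    name_index = {}            # metric name -> small integer code
    name_codes = array("H")    # uint16 codes (at most 65535 distinct names per batch)
    timestamps = array("q")    # signed 64-bit integers
    values = array("d")        # 64-bit floats
    for rec in records:
        # Dictionary encoding: store each distinct string once, refer to it by code.
        code = name_index.setdefault(rec["name"], len(name_index))
        name_codes.append(code)
        timestamps.append(rec["ts"])
        values.append(rec["value"])
    return {
        "name_dict": list(name_index),  # insertion order matches the codes
        "name_codes": name_codes,
        "ts": timestamps,
        "value": values,
    }

def to_rows(batch):
    """Rebuild the original row-shaped records from a columnar batch."""
    names = batch["name_dict"]
    return [
        {"name": names[code], "ts": ts, "value": value}
        for code, ts, value in zip(batch["name_codes"], batch["ts"], batch["value"])
    ]

def batched_columnar(records, batch_size):
    """Yield columnar batches of up to batch_size records each."""
    for start in range(0, len(records), batch_size):
        yield to_columnar(records[start:start + batch_size])

# Synthetic telemetry: 1000 data points cycling through three metric names.
records = [
    {"name": NAMES[i % 3], "ts": 1_700_000_000 + i, "value": float(i % 7)}
    for i in range(1000)
]

all_batches = list(batched_columnar(records, 500))

# A column-wise aggregate touches only the one column it needs.
first_batch_mean = sum(all_batches[0]["value"]) / len(all_batches[0]["value"])
```

In a layout like this, a query such as "average value in this batch" scans one contiguous array instead of walking every record, and each column holds similar values side by side, which is the property the paragraph above credits for better compression and cache behavior. How large those gains are depends on the data and the codec; this toy makes no attempt to reproduce the benchmark.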

You can read more about Apache Arrow and Laurent’s work in the first of two articles on our experiences with the technology on the Apache Arrow site.


Tags: Office of the CTO, 2023

About the Author

Lori Mac Vittie, Principal Evangelist


