Data Masking and How It Differs from Data Leak Prevention

F5 Ecosystem | November 20, 2023

Lori Mac Vittie, Distinguished Engineer and Chief Evangelist | F5

Many technologies that emerged in 2023 deserve a place on any technologist’s watch list. Data masking is one of them. Although data masking is implemented much like data leak prevention, the two serve very different use cases.

Data leak prevention has been a capability of every leading web app and API security solution for years. Data masking, by contrast, is only now becoming a recognized need, thanks to the rise of technologies like generative AI.

What is data masking?

Data masking is a technique used to protect sensitive information by replacing or obfuscating the original data with fictitious or scrambled data that maintains a similar structure and format. This method is commonly used in situations where data must be shared or used for testing, training, or analysis purposes, but the actual sensitive information should remain confidential. Data masking helps organizations comply with data privacy regulations, reduce the risk of data breaches, and protect the privacy of individuals whose information is contained within the datasets.
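To make "maintains a similar structure and format" concrete, here is a minimal Python sketch of one possible approach: every letter is swapped for a random letter of the same case, every digit for a random digit, and punctuation and spacing are left alone. The function name mask_preserving_format is hypothetical and the technique is only an illustration, not a description of any particular product. A real masking tool would likely add more (for example, guaranteeing the masked value never equals the original).

```python
import random
import string


def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Return a scrambled copy of `value` with the same shape.

    Letters become random letters of the same case, digits become random
    digits, and every other character (spaces, dashes, commas) is kept.
    Hypothetical helper for illustration only.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # keep separators so the format survives
    return "".join(out)


# Example usage: a seeded generator makes the output repeatable for tests.
rng = random.Random(2023)
print(mask_preserving_format("Jane Doe, 4111-1111-1111-1111", rng))
```

Because the masked value keeps its length, separators, and character classes, downstream test suites and parsers that expect "something shaped like a card number" should keep working even though the real value is gone.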

What is data leak prevention?

Data Leak Prevention (DLP) is a set of strategies, policies, and tools designed to protect sensitive information from unauthorized access, disclosure, or misuse. The primary goal of DLP is to prevent the accidental or intentional leakage of confidential data, such as personal information, intellectual property, or trade secrets, outside of an organization's network or systems.
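As a rough illustration of the "tools" part of that definition, the Python sketch below scans outbound text for anything that looks like a payment card number and hides all but the last four digits. The names CARD_PATTERN, luhn_valid, and redact_cards are hypothetical, and the pattern-plus-Luhn-checksum approach is an assumption chosen for the example. It is not a description of how any specific security product implements DLP.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Check the Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Mask card-like numbers in `text`, leaving the last four digits."""
    def _redact(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        if not luhn_valid(digits):
            return match.group()  # not a plausible card number; leave as is
        return "*" * (len(digits) - 4) + digits[-4:]

    return CARD_PATTERN.sub(_redact, text)


print(redact_cards("card 4111 1111 1111 1111 ok"))
# prints: card ************1111 ok
```

The checksum step is there to cut down on false positives: a 16-digit order number that fails the Luhn check passes through untouched.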

Apples to apples?

It may seem like the market is being pedantic, claiming that green apples are different from red apples. After all, both data masking and DLP tend to rely on the same technologies to “mask” or “obfuscate” sensitive data fields used by applications and APIs.

The difference is two-fold.

First, the primary users of data masking are developers, data scientists, and MLOps teams: employees or partners who need to test, train with, or analyze real customer data. That access puts at risk customers who would rather remain anonymous and who may have been assured by a corporate privacy policy that they will. The primary user of DLP, by contrast, is ultimately the business. Complying with regulations that require masking sensitive information such as account and credit card numbers is a corporate responsibility, and the business suffers when data is leaked. It can be argued that organizations employ DLP to protect consumers, and they do, but the primary driver is usually regulation.

Second, DLP identifies and masks only a specific subset of personal information. When I get a bill, my account number is masked, but my name and address aren’t. With data masking, names, addresses, and other identifying information are often obfuscated as well to ensure customers remain anonymous. This is particularly true when the use case is analysis, where patterns and relationships are sought across customers for marketing or forecasting purposes but there is a reason not to identify specific customers.
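The contrast can be sketched in a few lines of Python. In this hypothetical example, dlp_style hides only the account number, the way a bill does, while masking_style replaces every identifying field with a keyed pseudonym. The function names, the record layout, and the use of HMAC-SHA256 as the pseudonymization method are all assumptions made for illustration. The reason for a keyed hash here is that the same customer always maps to the same token, so patterns across records remain analyzable without revealing who the customer is.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would load this from a secret store.
SECRET_KEY = b"demo-key-not-for-production"


def dlp_style(record: dict) -> dict:
    """Mask only the account number, keeping the last four digits."""
    masked = dict(record)
    acct = record["account"]
    masked["account"] = "*" * (len(acct) - 4) + acct[-4:]
    return masked


def pseudonym(value: str, prefix: str) -> str:
    """Map a value to a stable token using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


def masking_style(record: dict) -> dict:
    """Obfuscate every identifying field; keep non-identifying ones."""
    return {
        "name": pseudonym(record["name"], "cust"),
        "address": pseudonym(record["address"], "addr"),
        "account": pseudonym(record["account"], "acct"),
        "amount": record["amount"],  # kept as is for analysis
    }


bill = {"name": "Jane Doe", "address": "1 Main St",
        "account": "123456781234", "amount": 42.50}
print(dlp_style(bill))      # name and address still readable
print(masking_style(bill))  # only the amount is still readable
```

Two purchases by the same customer produce the same "cust_..." token, which is what lets an analyst look for patterns and relationships across customers while the customers themselves stay anonymous.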

Data masking should be on your watch list

If you’re making a “watch list” of technologies for 2024, then data masking definitely deserves a place in the top ten.

That is because it applies broadly to many efforts, especially those that lean toward analysis and training ML models to glean insights about customer behavior or uncover patterns that inform business strategy.

As AI, both generative and traditional, has begun to seep into every product and service on the planet, consumers have become increasingly aware of the need for privacy. Being able to mask sensitive data will allow a business to both push forward with AI initiatives and satisfy its customers’ need for privacy.

Tags: Office of the CTO, 2023
