Many technologies rose to prominence in 2023 and deserve a place on any technologist’s watch list. Among them is data masking. Although data masking is implemented much like data leak prevention, the two serve very different use cases.
Data leak prevention has been a capability of every leading web app and API security solution for years. Data masking, by contrast, is only now becoming a recognized need, thanks to the rise of technologies like generative AI.
What is data masking?
Data masking is a technique used to protect sensitive information by replacing or obfuscating the original data with fictitious or scrambled data that maintains a similar structure and format. This method is commonly used in situations where data must be shared or used for testing, training, or analysis purposes, but the actual sensitive information should remain confidential. Data masking helps organizations comply with data privacy regulations, reduce the risk of data breaches, and protect the privacy of individuals whose information is contained within the datasets.
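To make "maintains a similar structure and format" concrete, here is a minimal Python sketch. It is an illustration under assumptions, not a production technique: the function name `mask_preserving_format` and the `secret` parameter are hypothetical, and the approach (a hash-derived character substitution) is a simple stand-in for real format-preserving encryption or a commercial masking tool.

```python
import hashlib
import string


def mask_preserving_format(value: str, secret: str = "demo-secret") -> str:
    """Replace letters and digits with pseudo-random ones of the same class.

    Punctuation and spacing are kept, so the masked value has the same
    structure as the original (a phone number still looks like a phone number).
    Hypothetical helper for illustration only.
    """
    # Derive a deterministic byte stream from the secret and the value, so the
    # same input always masks to the same output.
    stream = hashlib.sha256((secret + value).encode("utf-8")).digest()
    out = []
    for i, ch in enumerate(value):
        b = stream[i % len(stream)]
        if ch.isdigit():
            out.append(string.digits[b % 10])
        elif ch.isupper():
            out.append(string.ascii_uppercase[b % 26])
        elif ch.islower():
            out.append(string.ascii_lowercase[b % 26])
        else:
            out.append(ch)  # keep separators such as "-", " ", "@"
    return "".join(out)


if __name__ == "__main__":
    for sample in ["555-867-5309", "Jane Doe", "4111 1111 1111 1111"]:
        print(sample, "->", mask_preserving_format(sample))
```

Because the substitution is deterministic, the same input masks to the same output, which keeps joins across test tables consistent. A substituted character can occasionally match the original by chance, so a sketch like this is best read as showing the shape of the idea rather than a privacy guarantee.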
What is data leak prevention?
Data Leak Prevention (DLP) is a set of strategies, policies, and tools designed to protect sensitive information from unauthorized access, disclosure, or misuse. The primary goal of DLP is to prevent the accidental or intentional leakage of confidential data, such as personal information, intellectual property, or trade secrets, outside of an organization's network or systems.
Apples to apples?
It may seem like the market is embracing pedantry by claiming that green apples are different from red apples. After all, both data masking and DLP tend to rely on the same technologies to “mask” or “obfuscate” sensitive data fields used by applications and APIs.
The difference is twofold.
First, the primary users of data masking are developers, data scientists, and MLOps teams: employees or partners who need to test, train, or analyze with real customer data. That exposes customers who would rather remain anonymous, and who may have been promised anonymity by a corporate privacy policy. The user of DLP, by contrast, is ultimately the business. It is a corporate responsibility to comply with regulations that require masking sensitive information such as account and credit card numbers, and the business suffers when data is leaked. It can be argued that organizations employ DLP to protect consumers, and they do, but the primary driver is usually regulation.
Second, DLP identifies and masks only a specific subset of personal information. When I get a bill, my account number is masked, but my name and address aren’t. Data masking often goes further and obfuscates names, addresses, and other identifying information so that customers remain anonymous. This is particularly true for analysis use cases, where patterns and relationships are sought across customers for marketing or forecasting purposes but there is reason not to identify specific customers.
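The contrast in scope can be sketched in a few lines of Python. This is an illustrative sketch under assumptions: the record fields (`name`, `address`, `account`, `purchase_total`), the function names, and the `cust_` token prefix are all hypothetical, and a keyed hash stands in for whatever pseudonymization a real masking product would apply.

```python
import hashlib
import hmac


def dlp_redact_account(account_number: str, visible: int = 4) -> str:
    """DLP-style masking: hide all but the last few characters of one field."""
    hidden = max(len(account_number) - visible, 0)
    return "*" * hidden + account_number[hidden:]


def pseudonymize(value: str, key: bytes) -> str:
    """Replace an identifier with a stable keyed token (hypothetical scheme)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "cust_" + digest[:12]


def mask_for_billing(record: dict) -> dict:
    """DLP view: only the account number is masked; name and address stay."""
    masked = dict(record)
    masked["account"] = dlp_redact_account(record["account"])
    return masked


def mask_for_analysis(record: dict, key: bytes) -> dict:
    """Data masking view: identity is replaced by a token, behavior is kept."""
    identity = record["name"] + "|" + record["address"]
    return {
        "customer": pseudonymize(identity, key),
        "purchase_total": record["purchase_total"],
    }


if __name__ == "__main__":
    record = {
        "name": "Jane Doe",
        "address": "1 Main St",
        "account": "1234567890",
        "purchase_total": 42.50,
    }
    print(mask_for_billing(record))
    print(mask_for_analysis(record, key=b"demo-key"))
```

The billing view still identifies the customer and hides one regulated field. The analysis view keeps the same customer linkable across records through a stable token while dropping the name and address entirely, which is the property that pattern analysis across customers needs.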
Data masking should be on your watch list
If you’re making a “watch list” of technologies for 2024, then data masking definitely deserves a place in the top ten.
That is because it applies broadly across many efforts, especially those that lean on analysis and on training ML models to glean insights about customer behavior or uncover patterns that inform business strategy.
As generative and traditional AI seep into more and more products and services, consumers have become increasingly aware of the need for privacy. Being able to mask sensitive data allows a business to push forward with AI initiatives while satisfying its customers’ need for privacy.