Many technologies rose to prominence in 2023 and deserve a place on any technologist’s watch list. Among them is data masking. Although the two are implemented in similar ways, data masking and data leak prevention serve very different use cases.
Data leak prevention has been a capability of every leading web app and API security solution for years. The need for data masking, by contrast, is only now becoming apparent, thanks to the rise of technologies like generative AI.
Data masking is a technique used to protect sensitive information by replacing or obfuscating the original data with fictitious or scrambled data that maintains a similar structure and format. This method is commonly used in situations where data must be shared or used for testing, training, or analysis purposes, but the actual sensitive information should remain confidential. Data masking helps organizations comply with data privacy regulations, reduce the risk of data breaches, and protect the privacy of individuals whose information is contained within the datasets.
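To make "maintains a similar structure and format" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product does it: the function name `mask_preserving_format` is hypothetical, and the approach (swap each digit for a random digit and each letter for a random letter of the same case, leaving punctuation alone) is just one simple way to get fictitious data with the same shape as the original.

```python
import random
import string


def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Replace letters and digits with random ones, keeping the layout.

    Hypothetical helper for illustration only. Separators such as '-',
    '.', '@' and spaces are kept, so the masked value has the same
    length and structure as the original.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            # Preserve case so "Jane" still looks like a capitalized word.
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # punctuation and whitespace pass through unchanged
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random()  # unseeded: output differs on every run
    print(mask_preserving_format("4111-1111-1111-1111", rng))
    print(mask_preserving_format("Jane.Doe@example.com", rng))
```

The masked values still pass a format check (sixteen digits in four groups, something shaped like an email address), which is what makes them usable in test and training datasets. Note that this sketch masks the domain of the email address too, and that simple random substitution makes no promise about uniqueness or about keeping the same input mapped to the same output.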
Data Leak Prevention (DLP) is a set of strategies, policies, and tools designed to protect sensitive information from unauthorized access, disclosure, or misuse. The primary goal of DLP is to prevent the accidental or intentional leakage of confidential data, such as personal information, intellectual property, or trade secrets, outside of an organization's network or systems.
It may seem like the market is embracing pedantry and claiming that green apples are different from red apples. After all, both data masking and DLP tend to rely on the same technologies to “mask” or “obfuscate” sensitive data fields used by applications and APIs.
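The difference is two-fold.
First, the goals differ. As the definitions above suggest, DLP exists to keep confidential data from leaking outside an organization's network or systems, while data masking exists so that data can still be shared or used for testing, training, or analysis.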
Second, DLP identifies and masks only a specific subset of personal information. When I get a bill, my account number is masked, but my name and address aren’t. With data masking, names, addresses, and other identifying information are often obfuscated as well to ensure customers remain anonymous. This is particularly true when the use case is analysis, where patterns and relationships are being sought across customers for marketing or forecasting purposes but there is a reason not to identify specific customers.
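The contrast in scope can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function names, the record fields, and the demo key are hypothetical, and the keyed hash (HMAC-SHA256 from the standard library) is one common way to pseudonymize a value, not the way any given product works. The DLP-style function hides part of one field, as on a bill. The masking-style function replaces every identifying field with a stable token, so the same customer always maps to the same token and patterns across records survive.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would keep this secret and rotate it.
SECRET_KEY = b"demo-key-not-for-production"


def dlp_mask_account(account_number: str, visible: int = 4) -> str:
    """DLP-style: hide all but the last `visible` characters of one field."""
    hidden = max(len(account_number) - visible, 0)
    return "*" * hidden + account_number[hidden:]


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Masking-style: map a value to a stable token using a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:12]


def mask_for_analysis(record: dict) -> dict:
    """Tokenize identifying fields; leave analytic fields untouched."""
    masked = dict(record)
    for field in ("name", "address", "account_number"):
        masked[field] = pseudonymize(record[field])
    return masked


if __name__ == "__main__":
    bill = {"name": "Jane Doe", "address": "1 Main St",
            "account_number": "1234567890", "amount": 42.50}
    print(dlp_mask_account(bill["account_number"]))  # ******7890
    print(mask_for_analysis(bill))
```

Because the tokens are deterministic, two purchases by the same customer can still be linked for forecasting without the dataset naming that customer. Keyed tokens of this kind are pseudonyms rather than a guarantee of anonymity, so how far they satisfy a given privacy requirement is a separate question.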
If you’re making a “watch list” of technologies for 2024, then data masking definitely deserves a place in the top ten.
This is because it applies broadly to many efforts, especially those that lean toward analysis and training ML models to glean insights about customer behavior or uncover patterns that inform business strategy.
As AI, both generative and traditional, has begun to seep into every product and service on the planet, consumers have become increasingly aware of the need for privacy. Being able to mask sensitive data will allow a business to both push forward with AI initiatives and satisfy its customers’ need for privacy.