This is the fourth article in our series on fake account creation bots. The previous articles introduced these bots, described how they work, and discussed the motivations behind their use. We also covered the negative impact that fake account creation bots have on different kinds of businesses, why business and security leaders should care, and different approaches to identifying fake accounts at scale. This article focuses on evaluating the security controls used to identify and mitigate fake accounts and fake account creation bots, as well as highlighting the techniques that bots use to try to circumvent these controls.
Evaluation of Controls
Security teams use many different controls to address the problems posed by fake accounts and fake account creation bots. Some of these controls are designed to stop bots at the point of account creation, while others are designed to identify the activity of fake accounts and mitigate them post-creation. Each approach has pros and cons.
We’ve ranked the efficacy of each of these controls on a scale of 1 to 5, with 1 being not particularly effective and 5 being the most effective.
Email Domain Filtering
Email domain filtering involves blocking account creation using email addresses from low reputation domains and temporary email address providers. This increases the cost and effort required for attackers because it forces them to obtain higher quality email addresses to be able to create fake accounts.
The advantages of this control include its low cost, fast time to implementation, and its effectiveness against unsophisticated bots. However, it is ineffective against more sophisticated bots, and is relatively easy for attackers to re-tool to circumvent, leading to poor long-term efficacy. We rate this a 1 on our efficacy scale.
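As a minimal sketch of how this control can be implemented server-side (the blocklist entries and function name are illustrative; a production deployment would consume a maintained disposable-domain feed and a reputation service):

```python
# Illustrative low-reputation / disposable domain blocklist. In practice this
# would be a regularly updated feed, not a hard-coded set.
DISPOSABLE_DOMAINS = {"mailinator.com", "10minutemail.com", "guerrillamail.com"}

def is_allowed_email(address, blocklist=DISPOSABLE_DOMAINS):
    """Reject signup emails whose domain is on the low-reputation blocklist."""
    try:
        _, domain = address.rsplit("@", 1)
    except ValueError:
        return False  # malformed address: no "@" present
    return domain.lower() not in blocklist
```

The check is deliberately cheap, which is exactly why retooling around it (by acquiring better addresses) is also cheap for attackers.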
Multi-Step Account Creation Process
Multi-step account creation processes require users to go through a series of steps and pages to create an account. Unsophisticated bots are unable to follow redirects or complete multi-step account creation processes.
The advantages of this control apply primarily to low-sophistication bots. The downsides are higher friction for legitimate users during account creation, and the cost of implementing and maintaining a complex multi-step flow, which is inefficient in the long term, especially given that it is only effective against the least sophisticated bots and attackers. We rate this a 1 on our efficacy scale.
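Enforcing step order can be sketched as follows, assuming server-side session state; the step names are hypothetical:

```python
# Hypothetical ordered signup steps; a bot that posts directly to the final
# step without completing the earlier ones is rejected.
STEPS = ["email", "profile", "confirm"]

def advance(session, completed_step):
    """Accept a step only if it is the next expected one for this session."""
    idx = session.get("step_index", 0)
    if idx >= len(STEPS) or completed_step != STEPS[idx]:
        return False  # skipped ahead, replayed, or flow already finished
    session["step_index"] = idx + 1
    return True
```

A simple script that POSTs straight to the final endpoint fails the check; a bot that walks the flow in order passes it, which is why this control only raises the bar slightly.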
Reporting by Legitimate Users
This control involves soliciting and gathering reports and complaints from legitimate users to identify suspected fake accounts. An organization with a large user base may want to harness that user base as a reporting system. This approach is reasonably simple to set up, but it requires a team of humans to review reports and offers poor visibility into the problem, as only a small number of users will be motivated enough to submit reports of fake account activity. Finally, users are often not the best at identifying fake account activity, leading to a high rate of both false positives and false negatives. We rate this a 1 on our efficacy scale.
Email Verification
Email verification adds a two-step process to account creation to ensure that the email address provided is real and under the control of the user attempting to create the account. This increases attacker cost, as attackers must manage a large number of email accounts. While effective at mitigating the simplest automated account creation bots, it can be bypassed by using temporary or disposable email addresses. Nevertheless, it tends to impose cost on attackers at many levels of sophistication, so we rate this a 2 on our efficacy scale.
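One common implementation pattern is a signed, time-limited verification token emailed to the user; this is a sketch using only the Python standard library, and the helper names and 24-hour expiry are our own choices, not a prescribed design:

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # per-deployment signing key (illustrative)

def make_verify_token(email, now=None):
    """Issue a signed token that is emailed to the user during signup."""
    ts = str(int(now if now is not None else time.time()))
    sig = hmac.new(SECRET, f"{email}|{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{sig}"

def check_verify_token(email, token, max_age=86400, now=None):
    """Confirm the token matches this email and has not expired."""
    try:
        ts, sig = token.split(".", 1)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(SECRET, f"{email}|{ts}".encode(), hashlib.sha256).hexdigest()
    fresh = (now if now is not None else time.time()) - int(ts) <= max_age
    return hmac.compare_digest(sig, expected) and fresh
```

Note that this only proves control of the mailbox; as discussed later, bots that own or fuzz real mailboxes complete this step without difficulty.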
Requirement for Unique and Limited Availability Information
By requiring unique information that only a legitimate user should know, such as credit card numbers, bank account numbers, phone numbers, and the like, this technique attempts to increase the difficulty of creating fake accounts for attackers, who will have to expend additional effort to gather stolen information or conduct additional research to obtain it. Additionally, an organization can enforce policies to ensure that this information is used only by one unique account, limiting the creation of additional fake accounts using the same data. Further, the limited nature of this information means that even if attackers get enough information to create some accounts, there is a natural limit on how many they can create.
Unfortunately, this approach comes at some cost. User friction may be higher and there may be privacy concerns, both on the part of users, and from a regulatory standpoint. It does not stop fake account creation entirely, and if done without verification (see below) attackers can use brute force techniques to find valid information (such as phone numbers). If they do this, users who have not yet created accounts who own those pieces of information will find themselves unable to create accounts, leading to frustration and a negative experience with the site in question. We thus rate this a 2 on our efficacy scale.
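Enforcing the one-account-per-datum policy amounts to normalizing the value and deduplicating it, sketched here for phone numbers; the in-memory set stands in for what would be a database uniqueness constraint:

```python
USED_PHONES = set()  # in production: a unique index in the account database

def register_phone(phone):
    """Bind a phone number to at most one account (digits-only comparison)."""
    normalized = "".join(ch for ch in phone if ch.isdigit())
    if normalized in USED_PHONES:
        return False  # already claimed by another account
    USED_PHONES.add(normalized)
    return True
```

Normalization matters: without it, trivially reformatted numbers ("+1 (555)..." vs "1-555-...") would slip past the uniqueness check.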
CAPTCHA
This control comprises the familiar “prove you are a human” tests, which require completing a CAPTCHA challenge before the account is created, and periodically while it is being used. It is effective against simple bots and is cheap and easy to implement. However, sophisticated bots have been shown to be effective at solving CAPTCHAs, and attackers can hire CAPTCHA solving services at low cost. We rate this control a 2 on our efficacy scale.
Honeypot Fields
This technique involves creating hidden fields on the account creation page that humans will not see, and thus will not fill out, but that bots will. This too is effective against simple bots and is cheap and easy to implement. However, it is trivial for more sophisticated actors to engineer around and does not provide any long-term efficacy. We rate this control a 2 on our efficacy scale.
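A minimal honeypot sketch, assuming a CSS-hidden field named `website` (an arbitrary choice) that naive bots fill in:

```python
HONEYPOT_FIELD = "website"  # arbitrary field name for this sketch

# Example form markup: the field is hidden from humans via CSS, so only
# automated form-fillers populate it.
SIGNUP_FORM_SNIPPET = """
<form method="post" action="/signup">
  <input name="email" type="email">
  <input name="website" type="text" style="display:none"
         tabindex="-1" autocomplete="off">
  <button type="submit">Create account</button>
</form>
"""

def looks_like_bot(form_data):
    """A non-empty honeypot field indicates automated form filling."""
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())
```

Any bot that renders the page (or inspects the DOM for hidden fields) sidesteps this immediately, which is why the long-term efficacy is low.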
Web Application Firewall (WAF)
Web application firewalls can be used to inspect and block traffic based on a variety of layer 3 through layer 7 signals, such as IP address, IP-derived geolocation, User-Agent string, and other HTTP headers. However, bot operators who use distributed infrastructure are much harder to stop completely, and this solution isn’t granular enough to mitigate bots that use infrastructure also used by legitimate users, leading to false positives. It can also be expensive to operate, as analysts are needed to play “whack-a-mole” with attackers. We rate this control a 2 for bots, although we want to stress that a WAF is a very viable solution for many other web-based threats.
Basic Bot Defenses
Basic bot defense products typically rely on client-side signals, such as simple JavaScript challenges and browser or device fingerprinting, to distinguish automation from human traffic, and they work well against low and medium sophistication bots. However, this approach is not as effective against high-sophistication bots and can be bypassed in the medium and long term via attacker retooling, so it has a limited window of long-term efficacy. We rate this approach a 3 on our scale.
Verification of Unique and Limited Availability Information
By periodically requiring validation of previously collected information such as phone number, address, credit card numbers, and the like, as mentioned in a previous control definition, this control attempts to ensure the data is in the control of the user, which can create a barrier to automated account use. However, it’s not always practical to enforce that all information be tied to a single account. It can be costly and time-consuming to validate this information, and such verification also creates a significant amount of user friction. We rate this as a 4 on our efficacy scale.
Data Science Approaches
This control uses data science approaches to detect fake accounts, including machine learning (ML) and AI (some of which were mentioned in “Fake Account Creation Bots – Part 3”). It can provide high levels of efficacy against basic to intermediate sophistication bots and can learn and become more effective over time as more data is obtained. It can also be quite flexible in the face of attacker retooling. It does, however, require large amounts of high-quality data and skilled personnel that some companies may not have access to. Further, it is only a remedial process, taking place after the fact, and cannot prevent the creation of fake accounts in the first place. It has a limited level of efficacy against the most sophisticated fake account bots and against some evasion techniques. We rate this a 4 on our efficacy scale.
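To make the idea concrete, here is a deliberately toy scoring sketch; the features, weights, threshold, and field names are invented for illustration, and a real system would learn them from labeled data with a proper ML framework:

```python
def account_features(account):
    """Extract a few hypothetical fake-account signals from an account record."""
    username = account["username"]
    digit_ratio = sum(c.isdigit() for c in username) / max(len(username), 1)
    return [
        digit_ratio,                                          # digit-heavy usernames
        1.0 if not account.get("profile_photo") else 0.0,     # missing profile photo
        1.0 if account.get("accounts_from_same_ip", 0) > 5 else 0.0,  # IP reuse
    ]

WEIGHTS = [2.0, 1.5, 3.0]  # illustrative weights; a real model learns these

def fake_score(account):
    """Weighted sum of signals; higher means more likely fake."""
    return sum(w * f for w, f in zip(WEIGHTS, account_features(account)))
```

The value of the real technique lies in the learning loop (retraining as attackers retool), which this static sketch intentionally omits.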
Advanced Bot Defenses
Advanced bot defense solutions, in addition to the techniques discussed in Basic Bot Defenses above, use AI and ML to respond automatically and dynamically to changing attacker tactics. This is very effective against bot activity from the most basic to the most sophisticated and is able to adapt to changes in bot tactics. It provides good long-term efficacy, and can prevent both the use, and creation, of fake accounts. It can be expensive to deploy, and the implementation of it can be quite involved. We however give this control a 5 on our efficacy scale.
The following table summarizes our rankings for each control in terms of efficacy.
|Control|Efficacy Rating|
|---|---|
|Email Domain Filtering|1|
|Multi-Step Account Creation Process|1|
|Reporting by Legitimate Users|1|
|Email Verification|2|
|Requirement for Unique and Limited Availability Information|2|
|CAPTCHA|2|
|Honeypot Fields|2|
|Web Application Firewall (WAF)|2|
|Basic Bot Defenses|3|
|Verification of Unique and Limited Availability Information|4|
|Data Science Approaches|4|
|Advanced Bot Defenses|5|
Evasion Techniques Used by Bots
There are several approaches that fake account bots use to avoid detection and circumvent the security controls detailed above. A number of these approaches were alluded to in part three of this series, when we spoke about the different ways to detect fake accounts. This section covers how attackers attempt to ensure those detection approaches are not effective against their bots and accounts.
Acquire Large Numbers of High-quality Email Addresses
The first step in circumventing bot controls is to create large numbers of high-quality email addresses. Many fake account creation bot controls rely on the assumption that it is difficult or expensive for bots to acquire large numbers of high-quality email addresses. Such controls trust that the top email providers like Google, Microsoft, Yahoo, etc. have sufficient anti-bot controls to prevent fake account bots from acquiring large numbers of high-quality email addresses. In our experience this is not the case, as we have observed hundreds of thousands of fake accounts created using high-quality email addresses from some of the most reputable and common email providers.
These accounts can be created using either custom code or off-the-shelf automation tools. Tools are readily available that can assist in the creation of large numbers of accounts on all the major email domains. Attackers can also opt to outsource account creation and simply purchase accounts. These can either be brand new accounts created by a supplier or legitimate email accounts that have been stolen, taken over by criminals, and resold. Stolen email addresses do not have identifiable patterns, which makes it very difficult to identify them as fake bot accounts. In the event that a single account is discovered, it is also hard to identify other related accounts or email addresses, making username analysis to identify such fake accounts impossible.
Attackers can also use email address fuzzing. This allows a single email address to be used to create large numbers of fake accounts by adding extra characters to the email address. This process negates the need to acquire large numbers of high-quality emails, as a handful of high-quality email addresses can be used to create tens of thousands of accounts.
All these approaches also make double opt-in or email address verification ineffective as the bots have access to the underlying email addresses needed to complete the verification step and successfully create the account. This process also allows the bots to circumvent 2FA controls that may require codes to be retrieved from the email address.
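On the defensive side, one partial counter to address fuzzing is to canonicalize email addresses so that plus-addressed and dot-inserted variants of a single mailbox collapse to one key; this sketch assumes Gmail-style sub-addressing semantics, and the domain list is illustrative:

```python
# Domains known to ignore dots in the local part (assumption for this sketch).
DOT_INSENSITIVE = {"gmail.com", "googlemail.com"}

def canonical_email(address):
    """Collapse fuzzed variants (plus tags, inserted dots) to one canonical key."""
    local, _, domain = address.lower().partition("@")
    local = local.split("+", 1)[0]        # drop "+tag" sub-addressing
    if domain in DOT_INSENSITIVE:
        local = local.replace(".", "")    # these providers ignore dots
    return f"{local}@{domain}"
```

Deduplicating signups on the canonical form, rather than the raw string, removes the "one mailbox, tens of thousands of accounts" multiplier, though it does nothing against genuinely distinct purchased or stolen addresses.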
Make Fake Accounts Harder to Identify and Correlate
The second way in which bots circumvent controls is to avoid as many of the approaches discussed in our previous articles as possible. This involves:
- Avoiding low reputation email domains
- Avoiding clear and common username patterns
- Not using the same password for all fake accounts, but rather using a password manager or Customer Relationship Management (CRM) style system to manage a database of unique fake accounts and passwords
- Using stolen legitimate email addresses/accounts
- Avoiding generic or repeated profile information by using generative AI and large language models (LLMs) to create large populations of distinct fake account profiles, content, and images
- Spreading fake account creation over a long period of time so that accounts have different creation dates
- Making the links between fake accounts and their coordination harder to identify
This last element deserves special explanation. Obfuscation occurs by spreading activity over time so that fake account actions are not simultaneous. More sophisticated bots also divide the fake account population into randomly generated cohorts, each comprising a small percentage of all fake accounts created. This makes it hard to identify coordination, as each account will have coordinated actions with different sets of accounts over time, which will look natural and coincidental.
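The kind of naive coordination check that this cohorting defeats can be sketched as a pairwise overlap comparison over each account's set of actions (function names and the 0.8 threshold are illustrative):

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap ratio of two sets: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def coordinated_pairs(actions_by_account, threshold=0.8):
    """Flag account pairs whose action sets overlap suspiciously."""
    flagged = []
    for (u, au), (v, av) in combinations(actions_by_account.items(), 2):
        if jaccard(au, av) >= threshold:
            flagged.append((u, v))
    return flagged
```

Randomized cohorts keep every pairwise overlap low even though the population as a whole is coordinated, so detection has to look at group-level structure rather than account pairs.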
Make the Use of Automation Harder to Identify and Mitigate
To effectively create and manage large cohorts of fake accounts, it is imperative that automation be used. This was discussed in detail in part 2 of this series. To circumvent detection and mitigation, it is important that fake account creation bots do not display the typical signs of automated traffic.
The bots need to operate at a low volume to stay below the detection thresholds of most controls. The threshold for account creation must be significantly lower than thresholds for other kinds of automated activity, like logging in, as each account only needs to be created once. A low and slow approach to account creation, as well as to the subsequent fake account activity, reduces the risk of detection. A large network of IP addresses, ASNs, and User-Agent (UA) strings makes automated activity harder to detect. Simple, unsophisticated bots will use a single IP, ASN, UA, or header order for all their transactions, making them easy to detect and mitigate. This is especially true if the IPs, ASNs, UAs, and header orders are unique to the bot and not seen in legitimate traffic. Sophisticated bots will use common IPs, ASNs, UAs, and header orders similar to those used by the majority of legitimate customers, e.g. consumer IPs with the latest Chrome UA string and header order. This makes it very difficult to block sophisticated bot activity using traditional basic controls without significant risk of false positives.
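The per-source volume thresholds that low-and-slow bots aim to stay under can be sketched as a sliding-window counter; the window size, limit, and class name are illustrative:

```python
from collections import defaultdict, deque

class CreationRateMonitor:
    """Count signups per source IP in a sliding time window."""

    def __init__(self, window_seconds=3600, max_signups=3):
        self.window = window_seconds
        self.max = max_signups
        self.events = defaultdict(deque)  # ip -> recent signup timestamps

    def record_and_check(self, ip, ts):
        """Record a signup; return True if this IP exceeded the threshold."""
        q = self.events[ip]
        q.append(ts)
        while q and ts - q[0] > self.window:
            q.popleft()  # expire events outside the window
        return len(q) > self.max
```

A bot that spaces its signups wider than the window never trips the check, which is precisely the low and slow strategy described above.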
Develop Capabilities to Defeat Common Anti-Bot Controls
To successfully mitigate fake account creation bots, websites and apps would need to deploy robust and sophisticated anti-bot systems. However, most websites and apps have little to no anti-bot controls, and those that do typically have very basic ones that are easy to bypass. A fake account bot needs the following basic capabilities to bypass most basic anti-bot controls.
Use of Headless Browsers and Automation Tools
Most basic bot defense products on the market rely on simple JavaScript (JS) challenges to identify automation. Using a headless browser, i.e. a browser that does not have a Graphical User Interface (GUI) and is controlled programmatically, like Headless Chrome, or using tools like Selenium that can programmatically control a full web browser, can easily defeat most basic anti-bot controls.
Ability to Solve CAPTCHA/reCAPTCHA
Many websites and apps rely on these controls to mitigate bot activity. It is important that a fake account bot be able to solve and bypass these challenges. This can be done using programmatic solutions like optical character recognition (OCR), computer vision and AI, or by using human click farms and CAPTCHA solving services like 2CAPTCHA and Death by CAPTCHA. All these approaches are relatively inexpensive and easy to deploy, with many bots for sale already having these features built in.
Use of Residential IP Proxies
Most anti-bot controls rely on IP reputation, geo-location and blocking bulletproof hosting ASNs and TOR exit nodes to mitigate bots. Sophisticated bots utilize a large pool of high-quality residential IP addresses from unsuspicious geographic locations.
There are several approaches used to achieve this, including the use of mobile device botnets or compromised routers that use real user IP addresses. Attackers can also use services like Luminati that provide a network of high-quality residential IP addresses, making bots harder to detect using IP reputation and similar approaches.
Use of Phone Farms and Real Devices
Some anti-bot controls rely on the ability to detect emulated or spoofed mobile devices. Sophisticated bots will use hundreds of real mobile devices in phone farms to generate their traffic. Others will go as far as downloading and installing the native mobile app in a coordinated fashion on these hundreds of devices to appear legitimate. This approach will defeat bot defense solutions that detect fake devices or apps running in emulated environments.
Use of Browser Fingerprint Switching, Anti-Fingerprinting Tools, or Privacy Browsers
This approach defeats anti-bot solutions that rely on browser/device fingerprinting as a way of identifying unwanted bot activity. There are several off-the-shelf tools that have these features built in. Privacy-focused browsers like DuckDuckGo and others can be obtained for free and have capabilities that make browser fingerprinting difficult.
Provide User Interaction Data
Some more sophisticated anti-bot controls look at user interaction data (mouse, touch, and keyboard events) to identify and mitigate bots. The more sophisticated bots can bypass these controls by providing high-fidelity user interaction data, including realistic mouse movements and keystrokes with appropriately realistic timings and randomization.
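One simple signal of the kind such controls examine is timing regularity: scripted keystrokes tend to be far more uniform than human typing. This is a toy sketch, and the variance threshold is an illustrative guess:

```python
import statistics

def keystroke_timing_suspicious(timestamps, min_stdev=0.015):
    """Flag input whose inter-key intervals are implausibly regular.

    timestamps: key-press times in seconds. A standard deviation of the
    gaps below min_stdev (illustrative threshold) suggests scripted input.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return False  # not enough data to judge
    return statistics.stdev(gaps) < min_stdev
```

This is exactly the check that sophisticated bots defeat by injecting randomized, human-like timing, which is why interaction-data defenses must combine many such signals rather than rely on any single one.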
Conclusion
There are several controls and approaches used to detect and mitigate the bot-driven creation and activity of fake accounts. They range in cost, technical sophistication, and efficacy. The most efficacious of these, advanced anti-bot defense products, provide superior long-term efficacy but are typically expensive and time-consuming to deploy and maintain. There are also cheap and quick solutions, like honeypot fields and CAPTCHA, that have limited efficacy and are ineffective against sophisticated bots.
Sophisticated bots use techniques to bypass existing controls to create and manage large cohorts of fake accounts, making their detection and mitigation difficult. These techniques range from using high quality email domains, to having capabilities to defeat the most common anti-bot defense methods like JS and CAPTCHA challenges.
To effectively protect against fake account creation bots, an advanced and robust anti-bot solution should be implemented, one that goes beyond the basic anti-bot techniques discussed in this article. Such solutions can be purchased, or developed in-house if the requisite expertise is available.