Web scraping (also known as price scraping, harvesting, mining, mirroring, and scraper bots) refers to the use of automated tools to collect large amounts of data from a target application in order to reuse that data elsewhere.
Scraping can range from benign to malicious, depending on the source, objective, and frequency of the requests. For example, a search engine bot that respects the crawl rates defined in the site’s robots.txt will likely be viewed as acceptable, whereas daily price scraping from a competitor is likely unwanted.
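As an illustration of what "respecting robots.txt" can look like on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, and the `crawl_policy` helper are all assumptions made up for this example. Note that `Crawl-delay` is a widely used but non-standard directive, and not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /search
"""

def crawl_policy(robots_txt, user_agent, url):
    """Return (allowed, delay_seconds) for the given user agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(user_agent, url)
    # crawl_delay() returns None when no Crawl-delay applies to this agent.
    delay = parser.crawl_delay(user_agent)
    return allowed, delay

allowed, delay = crawl_policy(ROBOTS_TXT, "ExampleBot", "https://example.com/about")
blocked, _ = crawl_policy(ROBOTS_TXT, "ExampleBot", "https://example.com/search?q=fares")
```

A well-behaved bot would skip any URL where `allowed` is false and wait at least `delay` seconds between requests. An unwanted scraper simply ignores the file, which is why robots.txt alone does not stop scraping.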
Case Study: International Airline Fights Fare Scrapers
Scrapers were increasing the airline’s infrastructure costs and affecting the airline’s ability to manage revenue, so the security team sought out F5.
Unwanted scraping accounted for 25% of all search traffic on a single URL.
Using automated tools, off-the-shelf scripts, or even scraping-as-a-service providers, attackers can easily create scripts to discover and scrape website content including prices, promotions, articles, and metadata.
A Distinguished VP Analyst at Gartner Research demonstrates techniques attackers leverage to imitate users.
Scraping campaigns can range from brazen to stealthy, depending on the attacker’s skillset and aims. Execution of the scraping script may be distributed among hundreds or thousands of servers in order to blend in with the traffic patterns of the enterprise’s entire user population.
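To see why distribution helps a scraper blend in, consider the following rough sketch. It uses synthetic data and a hypothetical `(client_id, url)` log format; it is not a description of how any particular product detects scraping. It compares the busiest client's request count with the share of traffic landing on the busiest URL.

```python
from collections import Counter

def traffic_summary(requests):
    """Summarize (client_id, url) pairs taken from an access log.

    The pair format is an assumption for this sketch, not a real log schema.
    """
    requests = list(requests)
    if not requests:
        raise ValueError("no requests to summarize")
    per_url = Counter(url for _, url in requests)
    per_client = Counter(client for client, _ in requests)
    top_url, top_hits = per_url.most_common(1)[0]
    return {
        "top_url": top_url,
        "top_url_share": top_hits / len(requests),
        "max_requests_per_client": max(per_client.values()),
    }

# Synthetic traffic: 50 scraper clients send 4 requests each to one URL,
# while 300 ordinary visitors send 1 request each across three pages.
scraper_traffic = [(f"scraper-{i}", "/fares/search") for i in range(50) for _ in range(4)]
pages = ["/", "/check-in", "/deals"]
user_traffic = [(f"visitor-{i}", pages[i % 3]) for i in range(300)]

summary = traffic_summary(scraper_traffic + user_traffic)
```

In this synthetic data set no single client sends more than 4 requests, so a simple per-client rate limit would not trigger, yet one URL receives 40% of all requests. Looking at traffic in aggregate, rather than per client, is what exposes the campaign.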
Your marketing team may be the first to notice the symptoms of scraping attacks, including lower search rankings and reduced conversion rates.
The extracted data may be sold, used for price-comparison sites, or even used to create imitation sites for fraudulent purposes.
Even if the scraper is a partner, enterprises may prefer that the party retrieve data from a specified API, rather than consume expensive resources by requesting data directly from web servers.