The financial data industry is a continual target for web scrapers who take content without paying for it. The team of web anti-scraping experts at ScrapeDefender helped Bondview protect its web content, and we wanted to tell you about it.
Software developers rely on screen scraping to access free content that is published on financial web sites. They use automated tools to crawl those sites, process large numbers of pages, and copy or steal any information they want.
Defending against screen scraping is hard because, to your web server, sophisticated screen scraping can look just like your other traffic, i.e., a typical user on a web browser accessing your site. So what can you do about it?
Web anti-scraping is a relatively new industry, but the most common defense we hear about revolves around IP address blocking. The theory is that if you can see suspicious network activity coming from a particular IP address, it could be a bot. So block the IP address, right? Well, that may work temporarily, but professional scraping consultants are smart enough to bypass this minor hurdle by distributing bot traffic through many IP addresses. So if you don’t want to start playing Whack-A-Mole on a global basis, you will need a better solution for web anti-scraping.
What if the bot traffic you block is traveling through an IP address that belongs to a large cable provider’s ISP and is shared by many other users? You might accidentally block legitimate users. If you aren’t handling this problem internally, then maybe you’ve purchased a Web Application Firewall or a DDoS-based solution. What do these solutions do to control bots that are trying to scrape, copy and steal your data? Not enough, and often what they do is based on simple rate limiting. Many sophisticated bots are designed to emulate human characteristics and therefore won’t be identified or stopped by IP address rate limiting. In addition, smart scrapers routinely rotate through hundreds or even thousands of IP addresses. For example, a leading hosted scraping product offers low-cost “anonymous browsing” for scripts and can rotate through 1,500 IP addresses.
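To make the rate-limiting problem concrete, here is a minimal Python sketch of a per-IP sliding-window limiter. It is only an illustration: the class name, the limit of 100 requests per hour, and the sample addresses are all assumptions, not the behavior of any particular firewall product. It sends the same 3,000 requests twice, once from a single IP address and once spread across 1,500 addresses.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Hypothetical limiter: rejects an IP once it exceeds max_requests per window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent allowed requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def count_blocked(limiter, ips, total_requests):
    """Send one request per second, cycling through ips; return how many were rejected."""
    blocked = 0
    for i in range(total_requests):
        ip = ips[i % len(ips)]
        if not limiter.allow(ip, now=float(i)):
            blocked += 1
    return blocked


# Illustrative addresses only (documentation and private ranges).
single_ip = ["203.0.113.7"]
rotating_ips = [f"10.0.{n // 256}.{n % 256}" for n in range(1500)]

# Same workload, same limit (assumed: 100 requests per hour per IP).
blocked_single = count_blocked(PerIpRateLimiter(100, 3600), single_ip, 3000)
blocked_rotating = count_blocked(PerIpRateLimiter(100, 3600), rotating_ips, 3000)

print("blocked from one IP:", blocked_single)
print("blocked across 1,500 IPs:", blocked_rotating)
```

Under these assumptions the single-IP scraper is stopped after its first 100 requests, while the rotating scraper makes only two requests per address and is never rejected.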
But what if you could automatically identify and block unauthorized bot activity coming through multiple IP addresses simply by using a cloud-based software product? ScrapeDefender tracks all your web traffic and uses intrusion detection techniques to identify patterns and categorize user traffic by many different behavioral metrics. People have unique biometric characteristics such as fingerprints. In the same way, computer characteristics, internet browser characteristics and geolocation can be combined to uniquely identify and block scrapers. Extensive research has been published documenting that a computer, and the software it’s running, can be digitally fingerprinted remotely. No matter how many IP addresses suspicious bot activity is distributed across, when the bots hit your domain, they can be identified and blocked.
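The general idea of fingerprinting can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not ScrapeDefender's actual method: the attribute names (user_agent, accept_language, screen, timezone), the threshold of 50 requests, and the sample traffic are all hypothetical. Requests are grouped by a hash of client characteristics instead of by IP address, so a client that rotates addresses still counts as one visitor.

```python
import hashlib
from collections import Counter

# Hypothetical attributes; a real system would combine many more signals.
FINGERPRINT_FIELDS = ("user_agent", "accept_language", "screen", "timezone")


def fingerprint(request):
    """Hash the chosen client attributes into one identifier, ignoring the IP address."""
    raw = "|".join(str(request.get(field, "")) for field in FINGERPRINT_FIELDS)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def suspicious_fingerprints(requests, max_requests):
    """Return fingerprints seen more than max_requests times, whatever IPs they used."""
    counts = Counter(fingerprint(r) for r in requests)
    return {fp for fp, n in counts.items() if n > max_requests}


# Illustrative client profiles (assumed values).
bot_profile = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) HeadlessBrowser",
    "accept_language": "en-US",
    "screen": "800x600",
    "timezone": "UTC",
}
human_profile = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "accept_language": "en-GB,en;q=0.9",
    "screen": "1920x1080",
    "timezone": "Europe/London",
}

# The bot sends 200 requests, each from a different IP; the human sends 5 from one IP.
traffic = [dict(bot_profile, ip=f"198.51.100.{n}") for n in range(200)]
traffic += [dict(human_profile, ip="203.0.113.7") for _ in range(5)]

flagged = suspicious_fingerprints(traffic, max_requests=50)
print("bot flagged:", fingerprint(bot_profile) in flagged)
print("human flagged:", fingerprint(human_profile) in flagged)
```

Because legitimate users can share a browser configuration, a fingerprint this coarse would produce false positives in practice; it is shown only to illustrate why spreading traffic across IP addresses does not by itself hide a client.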
There are no easy answers to stopping screen scraping, but it is a challenge that most financial web sites may have to deal with. The team of web anti-scraping experts at ScrapeDefender helped Bondview plenty. If there were a gold medal for best anti-scraping product, ScrapeDefender would win.