
Bot management drives ethical data use, curbs image scraping

Bot management tools can help enterprises combat bad bots, prevent web and image scraping, and ensure ethical data use -- all while maintaining a positive end-user experience.

Bots are a force on the internet, accounting for nearly 38% of all web traffic, according to Imperva. While some bots are good bots, such as web crawlers that catalog websites and improve searchability, close to 20% of internet traffic is generated by bad bots, including those dedicated to credential stuffing, card fraud and inventory hoarding.

Implementing bot management tools helps enterprises identify and stop bot attacks, while still ensuring positive UX for end customers. If customers have to jump through hoops to prove they aren't bots, they will simply go elsewhere.

The range of bot attacks

Let's explore a few ways bots frustrate organizations and erode customer trust:

  • Credential stuffing. Most people have had a username/password pair compromised in a breach. Attackers purchase previously stolen credentials and try them against new sites. Since so many people reuse passwords, trying 1 million credentials is likely to yield several successful logins. Sites that can't stop credential stuffing bots risk having customers' accounts compromised, which opens the door to fraud, costs time and money to remediate, and breeds customer mistrust.
  • Card fraud. Bots try to pay for purchases by rapidly attempting gift card number and PIN combinations, hoping to find available card balances. Customers will be furious to have their card balances stolen, and merchants face the cost of refunding gift card funds, not to mention the potential loss of future business.
  • Inventory hoarding. Ever wanted to buy tickets to a popular show or a special edition pair of sneakers and been frustrated to find them sold out? Bots jump on exclusive sales and buy up most of the inventory before humans can. The attackers then resell the tickets or products on other sites at a tidy profit. Potential customers are disappointed to miss out on the sale and may end up spending more to buy from a reseller. Companies, meanwhile, end up with frustrated customers who take their business elsewhere.
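The attacks above share a signature: many rapid, automated requests from a small set of sources. A minimal sketch of one common countermeasure, velocity-based rate limiting, is below. The thresholds and names here are illustrative assumptions, not any vendor's implementation; real bot management products combine velocity with many more signals, such as device fingerprints, behavioral patterns and IP reputation.

```python
import time
from collections import defaultdict, deque

# Illustrative thresholds -- commercial bot management tools tune these
# dynamically per endpoint and per risk score.
WINDOW_SECONDS = 60
MAX_ATTEMPTS_PER_WINDOW = 10

_attempts = defaultdict(deque)  # source IP -> timestamps of recent attempts


def allow_login_attempt(source_ip, now=None):
    """Return False when an IP exceeds the velocity threshold,
    a pattern typical of credential stuffing bots."""
    now = time.monotonic() if now is None else now
    window = _attempts[source_ip]
    # Drop attempts that have aged out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_ATTEMPTS_PER_WINDOW:
        # In practice, sites often serve a challenge (e.g., CAPTCHA)
        # here rather than a hard block, to preserve UX for humans.
        return False
    window.append(now)
    return True
```

Note that the fallback to a challenge rather than a hard block reflects the UX point above: legitimate users behind a shared IP should get a second chance, not a door slammed in their face.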

Enter image theft


And then there's web scraping. Some web scraping bots are good bots, such as ones tied to search engines or partner bots scraping inventory data to help companies extend their sales reach. But web scrapers can be malicious, too -- scraping pricing information from competitors' sites or, as recently reported, monitoring in-demand consumer products in an attempt to hoard them. Bots can scrape images, too. A few months ago, a researcher with New York City's task force on cyber sexual assault discovered bots had scraped 70,000 pictures users had uploaded to their Tinder profiles and fed them to a cybercrime forum. Disturbingly, all the stolen pictures were of women, raising concerns about cyberstalking or worse.

Yet, it's not only cybercriminals engaging in image scraping. Some corporations scrape images to feed AI engines. Clearview AI recently made the news for the facial recognition software it supplies to law enforcement and for training its AI on images scraped from various social media sites, including Facebook and Twitter. Twitter responded with a cease-and-desist letter.

The legality of web scraping is still working its way through the courts. One example is HiQ Labs scraping public LinkedIn data to warn HiQ customers about employees who might be job hunting. Appellate rulings indicate this might be legal. Whether image scraping is ultimately deemed legal is another question, but it's certainly unethical and dangerous.

Bot management for ethical data enforcement

Social media and other sites that collect user data have terms of use that specify how they will or will not use the data their users supply. Customers may not always read those terms carefully when they sign up, but when the data is shared in a way they didn't expect or don't like, they understandably get angry. In the case of web and image scraping, it isn't that the social media sites are misusing customer data; it's that they are failing to stop others from misusing it.

Whether the perpetrator is a career cybercriminal or an unethical corporation -- and whether you think there is a difference between the two -- web scraping threatens citizens' privacy and safety. Even where regulations are murky, organizations dealing in user-supplied data, including images, have an ethical obligation to prevent data misuse. While we don't often talk about bot management as an ethical tool, blocking bots that are trying to scrape your photos is a legitimate use case that demands more attention. Social media companies -- and any other site whose business relies on collecting customer photos -- must take steps to block image scraping bots and protect their customers.

Even if social media sites already spell out in their terms and conditions which data and image scraping they permit, they need to take the next step and prevent unethical use of scraped data. Technical controls, such as bot management, can help them restrict data and image scraping to trusted partners.
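One simple way to express "trusted partners only" for image endpoints is an allowlist check like the sketch below. The token name, paths and helper are hypothetical, and a header check alone is spoofable; real bot management layers this with fingerprinting and behavioral analysis.

```python
# Hypothetical request-filtering sketch: permit image scraping only for
# partners presenting a known token. Header-based checks are spoofable
# on their own, so this is a policy illustration, not a full defense.
TRUSTED_PARTNER_TOKENS = {"partner-inventory-sync-token"}  # illustrative value


def may_fetch_image(path, headers, is_authenticated_user):
    """Decide whether a request for an image resource should be served."""
    if not path.startswith("/images/"):
        return True  # only guard the image endpoints
    if is_authenticated_user:
        return True  # normal logged-in browsing is allowed
    # Anonymous automated access: require a trusted partner token.
    return headers.get("X-Partner-Token", "") in TRUSTED_PARTNER_TOKENS
```

The point of the sketch is the policy shape: default-deny for anonymous automated access to user images, with explicit carve-outs for the partners a site has actually vetted.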

Remember, once images are scraped, getting perpetrators to remove them is almost impossible. Even if a company responds to a cease-and-desist order, it will be difficult to validate compliance. Therefore, make sure bot management systems are configured and updated to block unauthorized image and data scrapers. This helps sites walk the talk on protecting their customers' data.

Sandy Carielli

Sandy Carielli is a principal analyst at Forrester, advising security and risk professionals on application security, with a particular emphasis on the collaboration among security and risk, application development, operations and business teams. Her research covers topics such as proactive security design, security testing in the software delivery lifecycle, protection of applications in production environments, and remediation of hardware and software flaws. Carielli has over 15 years of experience in the security industry, working in software engineering, consulting, product management and technology strategy roles. Her most recent experience was at Entrust Datacard, where she guided the organization's technology strategy and researched the impact of emerging technologies on the business. Carielli is co-author of the Industrial Internet Consortium's IoT Security Maturity Model and has spoken at RSA Conference, Source Boston, Information Systems Security Association International and many other regional security events. Carielli has an ScB in mathematics from Brown University and an MBA from MIT Sloan School of Management.

This was last published in April 2020
