A CAPTCHA is a type of challenge-response system designed to differentiate humans from robotic software programs. CAPTCHAs are used as security checks to deter spammers and hackers from using forms on web pages to insert malicious or frivolous code.
How CAPTCHAs work
CAPTCHAs are a kind of Turing test. Quite simply, end users are asked to perform some task that a software bot cannot easily do. Tests often involve JPEG or GIF images, because while bots can identify the existence of an image by reading source code, they have traditionally been unable to tell what the image depicts. Because some CAPTCHA images are difficult to interpret, users are usually given the option to request a new test.
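The challenge-response flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular CAPTCHA service works: the in-memory `_pending` store and the function names `new_challenge`, `verify` and `refresh` are all hypothetical, and the step that renders the text as a distorted image is left out.

```python
import secrets
import string

# Hypothetical in-memory store mapping a challenge token to its expected answer.
# A real site would more likely keep this in a server-side session or cache.
_pending = {}

ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Create a challenge and return (token, text_to_render).

    The text would be drawn into a distorted image before being shown;
    only the token is sent to the browser alongside that image.
    """
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = secrets.token_urlsafe(16)
    _pending[token] = text
    return token, text


def verify(token, response):
    """Check a response. Each token is removed on first use, so it cannot be replayed."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False
    answer = response.strip().upper()
    return secrets.compare_digest(expected.encode(), answer.encode())


def refresh(token):
    """Discard a challenge the user could not read and issue a new one."""
    _pending.pop(token, None)
    return new_challenge()
```

Removing the token on the first call to `verify` means a correct answer cannot be reused, and `refresh` corresponds to the "request a new test" option.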
Types of CAPTCHAs
The most common type of CAPTCHA is the text CAPTCHA, which requires the user to view a distorted string of alphanumeric characters in an image and enter the characters in an attached form. Text CAPTCHAs are also rendered as MP3 audio recordings to meet the needs of users with visual impairments. Just as with images, bots can detect the presence of an audio file, but the test assumes that only a human can listen and know the information the file contains.
Picture recognition CAPTCHAs, which are also commonly used, ask users to identify a subset of images within a larger set of images. For instance, the user may be given a set of images and asked to click on all the ones that have cars in them.
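As a rough sketch of how a picture recognition test might be graded, the Python below compares the tiles a user clicked with the tiles that carry the target label. The function name `grade_selection` and the idea that each tile has a single known label are assumptions made for illustration; production systems are considerably more involved.

```python
def grade_selection(labels, selected, target="car"):
    """Return True when the selected tile indexes exactly match the target tiles.

    labels   -- list of known labels, one per tile (assumed ground truth)
    selected -- iterable of tile indexes the user clicked
    target   -- the label the user was asked to find
    """
    expected = {i for i, label in enumerate(labels) if label == target}
    return set(selected) == expected


# Example: a four-tile grid in which tiles 0 and 2 show cars.
tiles = ["car", "tree", "car", "bus"]
passed = grade_selection(tiles, [0, 2])
```

Requiring an exact match means that both a missed car and an extra click count as a failure.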
Other types of CAPTCHAs include:
- Math CAPTCHA - Requires the user to solve a basic math problem, such as adding or subtracting two numbers.
- 3D Super CAPTCHA - Requires the user to identify an image rendered in 3D.
- I am not a robot CAPTCHA - Requires the user to check a box.
- Marketing CAPTCHA - Requires the user to type a particular word or phrase related to the sponsor's brand.
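Of the types listed above, the math CAPTCHA is the simplest to illustrate. The following is a minimal Python sketch, assuming single-step addition or subtraction on small positive integers; the names `make_math_captcha` and `check_math_answer` are hypothetical.

```python
import operator
import secrets

# Supported operations; the choice of + and - follows the description above.
OPS = {"+": operator.add, "-": operator.sub}


def make_math_captcha():
    """Return (question, expected_answer) for a basic arithmetic problem."""
    a = secrets.randbelow(10) + 1  # 1..10
    b = secrets.randbelow(10) + 1  # 1..10
    symbol = secrets.choice(list(OPS))
    if symbol == "-" and b > a:
        a, b = b, a  # keep the answer non-negative
    return f"What is {a} {symbol} {b}?", OPS[symbol](a, b)


def check_math_answer(expected, response):
    """Compare the user's typed response with the expected integer."""
    try:
        return int(response.strip()) == expected
    except ValueError:
        return False  # non-numeric input is simply a wrong answer
```

In practice the question would usually be rendered as an image, since a bot can parse and solve a plain-text arithmetic question with little effort.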
Advantages and disadvantages of CAPTCHA
Advantages to using CAPTCHAs include:
- The prevention of spam from automated programs that could send emails, comments or advertisements.
- The prevention of fake registrations or signups for websites.
- Most people know what CAPTCHAs are, so visitors to a website will automatically understand what they are tasked to do.
- CAPTCHAs are also easy to implement in building a website.
Disadvantages to CAPTCHAs include:
- CAPTCHAs are not foolproof, and can only limit spam.
- CAPTCHAs can be time-consuming or annoying to end users.
- To some people, CAPTCHAs may be challenging to read.
- Websites using CAPTCHAs may notice traffic decreases caused by end users' difficulty or annoyance.
How attackers defeat CAPTCHAs
Attackers can get around CAPTCHAs in several ways, and the use of machine learning algorithms is considered a fast and accurate one. One approach trains a deep learning model on a large collection of downloaded CAPTCHA examples so that the model learns how to solve them. Another uses a generative adversarial network (GAN) to create CAPTCHAs, which the attacker's model then learns to solve.
Users who prefer not to solve CAPTCHAs can choose from several browser add-ons that bypass them. Popular browser add-ons include AntiCaptcha, CAPTCHA Be Gone and Rumola.
The AntiCaptcha automatic CAPTCHA solver plug-in for Chrome and Firefox automatically finds a CAPTCHA on a webpage and solves it for the user. This extension is promoted as being helpful for users with vision impairments, as well as for users who prefer to bypass CAPTCHA codes. As of this writing, the cost of the service starts at $0.70 for 1,000 CAPTCHA images.
The CAPTCHA Be Gone extension detects CAPTCHAs on webpages, solves them and copies the result to a user's clipboard. At this time, the utility is available for Firefox, Chrome and Internet Explorer for a $3.50 per month subscription fee.
Because CAPTCHA bypass add-ons are created by third parties, end users should be aware that a browser extension could expose their browsing activity to untrusted sources. Another reason not to use CAPTCHA bypasses is that the performance of the extensions is inconsistent. This is primarily because CAPTCHAs evolve as bots get smarter, and it can be difficult for the add-on programs to keep up.
History of CAPTCHA
The need for CAPTCHAs dates back as far as 1997. At that time, the internet search engine AltaVista was looking for a way to block automated URL submissions to the platform, which were skewing the search engine's ranking algorithms. To solve the problem, Andrei Broder, AltaVista's chief scientist, developed an algorithm that randomly generated an image of printed text. Although computers could not recognize the image, humans could read the message the image contained and respond appropriately. Broder and his team were issued a patent for the technology in April 2001.
In 2003, Nicholas Hopper, Manuel Blum, Luis von Ahn of Carnegie Mellon University and John Langford of IBM perfected the algorithm and coined the term CAPTCHA. The name stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart.
Jason Polakis, a professor in computer science, took credit for an increase in CAPTCHA difficulty in 2016, when he published a paper in which he used image recognition tools to solve Google image CAPTCHAs with an accuracy of 70%. Polakis believes we are at a point where making CAPTCHAs harder for software to solve will simultaneously make them more difficult for humans to solve.