
ReCAPTCHA 101: Everything You Need to Know


In May of 2023, Google expanded the applicability of its reCAPTCHA software to one of the most sensitive and high-stakes industries in the marketplace: payment verification and protection. 

Yahoo! Finance was the first to report on the launch of reCAPTCHA for payment protection, marking a notable peak in its upward trend over the past decade. Today, reCAPTCHA is one of the world's most effective and widely used bot mitigation measures.

So, what is reCAPTCHA, exactly? Let’s walk through its history, functionality, advantages, and potential challenges for your business. 

What is reCAPTCHA?

ReCAPTCHA is a fraud detection technology that seeks to prevent unauthorized, automated (“bot”) access to websites and sensitive resources. It does this by requiring a login test that human users can easily pass but bots cannot. Google acquired reCAPTCHA in 2009, when it was helping 100,000 websites fend off bot traffic. According to Google, it is currently used on over five million sites.

Advancements to CAPTCHA tests have been a focal point of machine learning research as bot traffic has risen over the past decade. Per Imperva’s 2023 Bad Bot Report, bots accounted for 47.4% of all internet traffic, with “bad bot” traffic alone making up 30.2% of the total. These trends have only continued this year, as one report pegs bot traffic at 48% through Q2 2023.

reCAPTCHA is an effective tool to help organizations and website owners combat these threats.

CAPTCHA vs reCAPTCHA

The Completely Automated Public Turing Test to Tell Computers and Humans Apart (CAPTCHA) has been an Internet mainstay for over two decades. Invented in the late ’90s and first popularized as early as 2000, it has existed in one form or another across various websites and programs for the better part of 23 years.

There have been many deployments of the underlying methodology and technology behind CAPTCHA. Google’s acquisition in 2009 would eventually lead to a complete replacement of CAPTCHA with reCAPTCHA across most applications in 2019 due to concerns about the original method’s capacities. reCAPTCHA is now the latest version of CAPTCHA.

ReCAPTCHA’s dual purpose

From the beginning, reCAPTCHA aimed beyond bot protection. Before getting acquired by Google, one of reCAPTCHA's objectives was to utilize old, machine-indecipherable text in CAPTCHA tests so that humans could facilitate digitizing old books while logging in.

Moreover, back in 2007, when most CAPTCHAs relied on text-based formats, users were inadvertently contributing to this initiative, surpassing 60 million digitization CAPTCHAs daily. Every digitized word made old books more accessible to the greater public, rendering them screen-readable for audio and other applications.

Fig: Digitizing the world, one book at a time

While not all CAPTCHAs use the same methodology today, Google has carried over much of the same ethos in optimizing reCAPTCHA for accessibility across screen reader apps.

How does reCAPTCHA work?

The reCAPTCHA challenge-response system presents a test designed to be solvable by humans but challenging for robots.

The basic process has always worked like this:

The user is presented with a test, such as:

  • A piece of distorted media

  • A pattern recognition exercise

  • A seemingly simple checkbox

  • An invisible scan of risk factors

The user will input an answer or otherwise satisfy the test

  • Behavior and risk analysis may happen in the background

The answer is analyzed to produce a score, and:

  • If the score meets a predetermined requirement, the user passes

  • If the score doesn’t meet requirements, further auth may be needed or the user may be blocked

In most current deployments, the traditional distorted media test is not present initially. However, some form of sensory recognition is often available as a backup option if the system cannot determine whether the user is a human or a bot.

What’s unique about reCAPTCHA, relative to earlier versions of the solution, is its capacity for adaptive authentication. Depending on the deployment, the login process may require more or less proof that the user is a human, not a bot. For example, if an initial test is failed, then additional authentication methods (e.g., multi-factor authentication) might be required.
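The adaptive flow described above can be sketched as a small decision function. This is only an illustration: the `Thresholds` class, the `decide` function, and the cut-off values 0.7 and 0.3 are hypothetical names and numbers chosen for the example, not part of reCAPTCHA itself. It assumes the score convention used by score-based reCAPTCHA, where values near 1.0 suggest a human and values near 0.0 suggest a bot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs for illustration; tune them against your own traffic.
    allow_at_or_above: float = 0.7
    block_below: float = 0.3


def decide(score: float, thresholds: Thresholds = Thresholds()) -> str:
    """Map a risk score in [0.0, 1.0] to 'allow', 'step_up', or 'block'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score >= thresholds.allow_at_or_above:
        return "allow"      # confident enough: no extra friction
    if score < thresholds.block_below:
        return "block"      # very likely automated traffic
    return "step_up"        # uncertain: ask for more proof, such as MFA


if __name__ == "__main__":
    for sample in (0.9, 0.5, 0.1):
        print(sample, decide(sample))
```

Keeping a middle band between the two cut-offs is what makes the flow adaptive: uncertain users get a second chance to prove themselves instead of being blocked outright.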

Types of reCAPTCHA

As noted above, almost all CAPTCHA solutions provide the same basic functionality with slight deviations. However, there are distinct versions that developers and adopters can choose from for different purposes. Factors like the sensitivity of protected content, IT literacy in the user base, and the potential for friction need to be considered.

Another consideration is how you implement the type you choose. Out-of-the-box (OOTB) connectors and APIs for authentication solutions make installing and using reCAPTCHA easier on user-facing signup and login screens. For example, Descope’s reCAPTCHA Enterprise connector helps organizations implement the technology in their app and use its risk score to enforce risk-based MFA.

Most deployments have fallen into one of three major categories: legacy reCAPTCHA, checkbox or invisible reCAPTCHA, and NoCAPTCHA.

Legacy reCAPTCHA (v1)

These tests are no longer widely used, having been discontinued by Google in 2018. However, they are the most recent example of the traditional CAPTCHA formula: an optical character recognition (OCR) test. The user would be presented with a decontextualized or otherwise distorted piece of media (e.g., an image) and be asked to identify it accurately.

The intentional distortion of images and audio files aimed to confound the pattern recognition abilities of AI, rendering them unrecognizable. Yet, when implemented correctly, a human observer or listener could effortlessly identify the crucial elements.

Similar OCR-style challenges are still used as step-up tests in checkbox and invisible reCAPTCHA deployments.

Checkbox or invisible reCAPTCHA (v2)

In these tests, the traditional OCR is replaced with a simpler input, or in some cases no input at all, that verifies a user as a human rather than a bot. Two common deployments include:

Checkbox reCAPTCHA v2. In these cases, users simply click on a checkbox with the affirmation "I'm not a robot" or a similar statement. However, the apparent simplicity conceals intricate processes operating beneath the surface.

  • The checkbox may be coded so that it does not appear as checkable to bots. The actual toggle input is hidden in places most AI would not consider looking.

  • Clicking the box will analyze the user’s web activity and other risk factors associated with the login. If no risks are present, they are authenticated.

Fig: reCAPTCHA checkbox

Invisible reCAPTCHA v2. In these instances, the checkbox is forgone in favor of a badge noting that reCAPTCHA protects the login. The same underlying risk analysis operates in the background, triggering an OCR-style challenge if a risk is identified.
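Whichever v2 variant runs in the browser, the page ends up with a response token that your server still has to check with Google. The following is a minimal sketch of that server-side step using only the Python standard library. The helper names (`build_verify_request`, `is_human`, `verify_token`) and the hostname check are illustrative assumptions; the endpoint and the `secret`, `response`, and `remoteip` fields follow Google's documented siteverify API.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret: str, token: str,
                         remote_ip: Optional[str] = None) -> urllib.request.Request:
    """Build the form-encoded POST request for the siteverify endpoint."""
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional field
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=body, method="POST")


def is_human(verify_response: dict, expected_hostname: str) -> bool:
    """Accept only a successful check issued for our own hostname."""
    return (verify_response.get("success") is True
            and verify_response.get("hostname") == expected_hostname)


def verify_token(secret: str, token: str, expected_hostname: str,
                 remote_ip: Optional[str] = None, timeout: float = 5.0) -> bool:
    """Send the token to Google and interpret the JSON reply (network call)."""
    request = build_verify_request(secret, token, remote_ip)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        payload = json.load(reply)
    return is_human(payload, expected_hostname)
```

A login handler would call `verify_token` with the secret key from the reCAPTCHA admin console and the token submitted with the form, and reject the request when it returns `False`. Splitting request building and response parsing into separate functions keeps both testable without network access.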

There are also niche deployments for specific use cases, such as reCAPTCHA for Android. These tests leverage native hardware and software to bolster the risk analysis.

NoCAPTCHA reCAPTCHA (v3)

Similar to the invisible form of v2, the newer version released in 2018 does not require explicit user inputs by default. Instead, its powerful analytical tools monitor users’ behaviors and other risk factors entirely in the background, ready to request additional authentication if suspicious activity presents itself. 

This tool limits the friction caused by traditional CAPTCHA tests and “I’m not a Robot” checkboxes, allowing seamless browsing.
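Because v3 returns a score rather than a pass/fail challenge, the site owner decides what to do with it. Here is a hedged sketch of how a parsed v3 verification response might be evaluated. The function name `evaluate_v3`, the reason strings, and the 0.5 default threshold are assumptions made for this example; the `success`, `action`, and `score` fields are the ones a v3 siteverify response carries.

```python
from typing import Tuple


def evaluate_v3(verify_response: dict, expected_action: str,
                min_score: float = 0.5) -> Tuple[bool, str]:
    """Return (passed, reason) for a parsed v3 siteverify response."""
    if verify_response.get("success") is not True:
        return False, "token rejected"
    # The action name is set by the page when it requests the token;
    # a mismatch suggests the token was minted for a different flow.
    if verify_response.get("action") != expected_action:
        return False, "unexpected action"
    score = verify_response.get("score")
    if not isinstance(score, (int, float)):
        return False, "missing score"
    if score < min_score:
        return False, "score below threshold"
    return True, "ok"


if __name__ == "__main__":
    sample = {"success": True, "action": "login", "score": 0.9}
    print(evaluate_v3(sample, expected_action="login"))
```

A failed result does not have to mean a hard block. It can just as well route the user to an additional verification step.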

NOTE: Sporadic complaints about speed issues with reCAPTCHA v3 stem from issues with badge display. Google’s teams and community members address these issues with solutions regularly, but concerned developers may opt for v2 instead.

Pros and cons of reCAPTCHA

It’s easy to assume that reCAPTCHA is a win-win for all parties involved — and, in many cases, it is. However, there have been legitimate concerns about the technology over the years as well.

When deciding whether to use this technology on its own or as part of a broader fraud prevention system, it’s essential to weigh the potential benefits against the potential downsides and other costs. Baseline deployments are free to use.

However, developers looking to integrate it into client software and organizations thinking about how it fits into their IT ecosystems still need to consider all its implications.

Biggest benefits of reCAPTCHA

  • Effective bot traffic limitation: reCAPTCHA is one of the best ways to limit malicious bot traffic on websites without overly affecting the user experience. As a result, marketing and product data is not distorted, and the website doesn’t suffer degraded performance or downtime.

  • Fraud prevention: Implementing reCAPTCHA is a great way to prevent broken authentication, which happens when an unauthorized user (bot or otherwise) gains illegitimate access to sensitive resources.

  • Marker of trust and security: Like having a “protected by ADT” sign in the front yard, a “protected by reCAPTCHA” badge or checkbox may put users’ minds at ease and imply (correctly) that the site they’re logging in to takes security seriously.

Potential reCAPTCHA shortcomings

  • Skepticism about bot prevention: One of the main reasons CAPTCHA was replaced by reCAPTCHA in 2019 was that studies began illustrating how advanced bots could solve legacy CAPTCHA tests, often faster than humans. To this day, some experts remain skeptical about the capacity of any technology to prevent bot attacks completely.

  • Accessibility concerns: There are ongoing concerns about the accessibility of CAPTCHA and reCAPTCHA deployments, as sensory-based tests may exclude certain users (like the visually impaired). However, Google has made accessibility a priority and reCAPTCHA v2 does meet WCAG requirements.

  • Speed issues with reCAPTCHA v3: Some users have reported sporadic speed issues with v3, primarily related to badge display. While solutions are regularly provided, developers may consider v2 as an alternative.

Should you add reCAPTCHA to your website?

The many benefits reCAPTCHA brings make it a strong option in most situations. That said, it’s most useful in cases where heavier user interaction is expected, for example:

  • If your upcoming development project needs to involve significant user input in things like comment fields or user access to sensitive data.

  • If your website experiences regular spikes in traffic, for example, an online football site or an ecommerce store during a sale.

Risk-based MFA with reCAPTCHA and Descope

If you’ve decided to add reCAPTCHA to your site, great call! Descope makes it very straightforward to add reCAPTCHA to your login process using drag-and-drop workflows. Risk scores ingested from reCAPTCHA can be used to create branching user paths. This way, bots and bad actors are blocked while frequent legitimate users are logged in without additional friction.

Fig: Descope Flow where returning users are funneled through a risk-based MFA process based on the scores provided by reCAPTCHA Enterprise

Don't let spam and abuse compromise your online presence—implement reCAPTCHA with Descope and stay one step ahead. Check out our reCAPTCHA v3 docs or an overview of our reCAPTCHA Enterprise connector to learn more.