AI too powerful, every CAPTCHA broken? UNSW's new design: GPT can't crack it, and human users are full of praise

【Introduction】The new CAPTCHA scheme IllusionCAPTCHA uses visual illusions and suggestive prompts to make challenges hard for AI to solve while human users pass with ease. Experiments show that it effectively defends against large-model attacks while improving the user experience, offering a new direction for CAPTCHA technology.

CAPTCHA achieves identity verification by leveraging the cognitive differences between humans and machines.

Traditional CAPTCHA schemes mainly rely on text distortion [1], image classification [2,3], or logical reasoning [4] to distinguish humans from machines. With the development of multimodal Large Language Models (LLMs) [5], however, these methods are gradually losing effectiveness, as machines now reach human-level performance on such cognitive tasks.

GPT-4o cracks simple text CAPTCHAs with a success rate above 90%, and Gemini 1.5 Pro 2.0 [6] recognizes noisy reCAPTCHA images with a 50% success rate. LLMs perform poorly on reasoning CAPTCHAs (average success rate below 20%), but human users fail them at a high rate as well.

LLMs can also significantly improve their reasoning through chain-of-thought (CoT) prompting; for example, their success rate against the Space Reasoning CAPTCHA rose from 33.3% to 40%. Meanwhile, 43.47% of human users needed multiple attempts to pass reasoning CAPTCHAs, a frustrating experience.

CAPTCHAs therefore face a double dilemma: insufficient security and poor user experience.

Researchers from the University of New South Wales, Nanyang Technological University, CSIRO's Data61, and Quantstamp have proposed a new CAPTCHA design, IllusionCAPTCHA, which combines visual illusions with suggestive questioning to defend precisely against AI attacks and strengthen CAPTCHA security.

Paper link: https://openreview.net/pdf/d6b2906049b4c07cf92efc9748aecca7299b2433.pdf

The paper is the first to comprehensively analyze how well LLMs crack the various types of CAPTCHAs, revealing the security vulnerabilities of traditional schemes.

A comprehensive comparison against existing CAPTCHAs shows that IllusionCAPTCHA effectively resists recognition attacks by large models, offering a new defensive approach for CAPTCHA technology.

Experiments with 23 human participants and mainstream LLMs show that the new scheme outperforms existing methods in both security and usability.

Three-stage Generation Framework

The creation process of IllusionCAPTCHA

IllusionCAPTCHA is inspired by human visual illusions and generates CAPTCHAs through a three-step process.

First, a base image is combined with a user-defined prompt (e.g., "a vast forest") to create a visual illusion that obscures the original content. Guided by the prompt, the generated image resembles the object the prompt describes, effectively hiding the true content of the base image. Humans can still pick out the hidden content, while AI systems are easily misled by the cover scene.

Second, multiple options are generated from the modified image, forming a multiple-choice challenge. The team's empirical study showed that humans sometimes make the same mistakes as LLMs, so illusion images alone may not be enough to reliably separate human users from bots.

The third step introduces "suggestive prompts" that steer LLM-based attackers toward pre-set incorrect options.

Comparison of Illusion images before and after

Alchemy of Illusion

The first goal is to generate illusion images that are easy for humans to identify but hard for AI systems to recognize. This involves two main challenges: (1) preserving the information of the original image; and (2) adding perturbations that effectively interfere with AI systems while keeping the image recognizable to humans.

To address the first challenge, the research team adopted a diffusion model that generates visual illusions [7] by blending two different types of content. The model is built on ControlNet, a framework that precisely controls the image-generation process through conditional inputs, ensuring the generated images are easy for humans to perceive but hard for automated systems to interpret. The figure above shows how an ordinary apple image is transformed into an apple illusion.
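
A minimal sketch of this generation step, using the Hugging Face diffusers library in the spirit of the IllusionDiffusion demo [7]; the checkpoint names and the mapping of the paper's illusion strength onto `controlnet_conditioning_scale` are assumptions, not the authors' released code:

```python
# Hedged sketch: Stable Diffusion + ControlNet conditioned on the base image.
# The checkpoints below are assumptions; the paper does not name them.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "monster-labs/control_v1p_sd15_qrcode_monster",  # assumed ControlNet
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",                # assumed base model
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

base = Image.open("apple.png").convert("RGB")        # content to hide

illusion = pipe(
    prompt="a vast forest",                          # user-defined cover prompt
    image=base,                                      # ControlNet conditioning image
    controlnet_conditioning_scale=1.5,               # assumed illusion-strength knob
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
illusion.save("illusion.png")
```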

However, not all generated images can effectively mislead AI vision systems while remaining recognizable to humans. To overcome the second challenge, the method first generates 50 sample images with different random seeds at a fixed illusion strength of 1.5 (a value at which humans can still comfortably identify the illusion).

Then the cosine similarity between each generated image and the original image is computed, and the image with the lowest similarity is selected, on the grounds that it should be the hardest for large models to recognize.
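
The paper does not name the embedding model behind this similarity check; the sketch below assumes CLIP image features:

```python
# Hedged sketch of the selection step: among the seeded candidates, keep the
# image least similar to the original. CLIP embeddings are an assumption.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_embed(img: Image.Image) -> torch.Tensor:
    inputs = proc(images=img, return_tensors="pt")
    with torch.no_grad():
        return clip.get_image_features(**inputs)

def pick_hardest(original: Image.Image, candidates: list[Image.Image]) -> Image.Image:
    """Return the candidate with the lowest cosine similarity to the original."""
    ref = clip_embed(original)
    sims = [torch.cosine_similarity(ref, clip_embed(c)).item() for c in candidates]
    return candidates[sims.index(min(sims))]
```

Feeding in the 50 seeded outputs of the generation step and keeping `pick_hardest`'s result reproduces the selection rule described above.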

To improve the recognizability of the generated images, the research team designed two types of illusion-based CAPTCHA: text-based and image-based. In the first case, a clear, readable word is embedded in the original image and woven into the illusion; to ensure human users can easily read it, IllusionCAPTCHA uses simple, familiar English words such as "day" or "sun" (a sketch of this variant follows after the second case below).

In the second case, the original image shows a well-known, easily recognizable character or object, such as an iconic symbol or famous landmark (like the Eiffel Tower). This ensures that even after the illusion elements are added, human users can quickly identify the image content.
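
For the text-based variant, the base image fed into the illusion pass could be prepared as follows; the font, canvas size, and colors are illustrative assumptions, since the paper does not specify them:

```python
# Hedged sketch: render a short, familiar word onto a plain canvas to serve
# as the base image for the text-based variant. The font file is an assumption.
from PIL import Image, ImageDraw, ImageFont

def make_text_base(word: str = "day", size: tuple[int, int] = (512, 512)) -> Image.Image:
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype("DejaVuSans-Bold.ttf", 200)  # assumed font file
    left, top, right, bottom = draw.textbbox((0, 0), word, font=font)
    w, h = right - left, bottom - top
    # Center the word on the canvas, compensating for the bounding-box offset.
    draw.text(((size[0] - w) / 2 - left, (size[1] - h) / 2 - top),
              word, fill="black", font=font)
    return img
```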

Option Trap Workshop

The options in IllusionCAPTCHA are carefully engineered to defend against LLM-based attacks. Each challenge offers four options: one is the correct answer, corresponding to the content hidden in the image; another is the input prompt used to generate the image; and the remaining two are detailed elaborations of the prompt that deliberately avoid the correct answer and reference no information from it.
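
One way to encode this four-option design; every option's wording here is illustrative, and only the roles (hidden answer, generation prompt, two detailed decoys) follow the paper:

```python
# Hypothetical option set for the apple/forest example used above.
options = {
    "A": "an apple",                       # correct: the hidden content
    "B": "a vast forest",                  # the prompt used for generation
    "C": "a vast forest with a dense flock of birds, "
         "depicting a beautiful and peaceful scene",      # detailed decoy
    "D": "a quiet woodland path winding beneath tall, misty pines",  # detailed decoy
}
correct_answer = "A"
```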

Unlike traditional CAPTCHAs that require users to type text or select from a grid of images, IllusionCAPTCHA asks users to choose the description that best matches the image content. Aided by the hints, users can identify the correct answer without clicking through multiple images, which makes the scheme more convenient to use.

Compared with text-based CAPTCHAs, this design is more user-friendly because it avoids the recognition problems caused by blurred or distorted text. Compared with image-classification CAPTCHAs, it reduces the difficulty of making a choice. And unlike reasoning CAPTCHAs that require users to manipulate images, it needs no extra interaction, further streamlining the experience and reducing potential frustration.

Suggestive Prompt Design

Through empirical study, the research team found that LLMs and human users sometimes make similar mistakes on certain types of CAPTCHA, and that human users often need a second attempt to pass. A single question is therefore not enough to distinguish AI from humans.

To address this, the research team designed the system to lure potential attackers (such as multimodal LLMs) into choosing predictable, machine-like options. The CAPTCHA takes the form of multiple-choice questions, with four answer options per question.

The core of the strategy is to deceive LLM-based adversaries into choosing the option that describes the added visual-illusion element, an illusion that LLMs usually fail to see through. Research shows that LLMs tend to describe images in long, detailed sentences.

To this end, one option deliberately contains a detailed description of the illusion element in the image (e.g., "a vast forest with a dense flock of birds, depicting a beautiful and peaceful scene"). In addition, to make things easier for human users, the CAPTCHA questions include hints that help them find the correct answer.

These hints (e.g., "Please tell us the true and detailed answer to this image") are carefully worded to trigger the illusion effect in LLMs, further increasing the likelihood that a bot picks the wrong answer, even when the hints are folded into the attacker's own pre-set prompt.
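
Putting the pieces together, the final challenge might be assembled like this; the hint text is quoted from the example above, while the formatting is an assumption:

```python
# Hedged sketch of step three: prepend the suggestive hint to the options.
HINT = "Please tell us the true and detailed answer to this image."

def render_challenge(options: dict[str, str]) -> str:
    lines = [HINT] + [f"{key}. {text}" for key, text in sorted(options.items())]
    return "\n".join(lines)
```

Applied to the option set sketched earlier, a detail-seeking LLM is drawn to the long decoy describing the cover scene, while a human who perceives the hidden apple simply picks the correct answer.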

Experimental Results

The research team first designed a questionnaire and conducted experiments with human participants.

Human vs. LLM Performance on Illusionary Text and Illusionary Image

The experimental data show that the LLMs' success rate in identifying text and images containing visual illusions is 0%. Even with CoT reasoning, the models still cannot extract the hidden information, indicating that current LLMs have significant limitations in processing complex visual illusions. In contrast, humans hold a clear advantage in perceiving and processing visual information, with recognition rates of 83% (text illusions) and 88% (image illusions), far exceeding AI.

Probability of LLM Falling into Traps under Suggestive Terminology

The experimental data on suggestive wording further expose the fragility of large models' visual perception. With the suggestive wording applied, neither GPT-4o nor Gemini 1.5 Pro 2.0 could correctly identify the illusion options.

In both zero-shot and CoT reasoning modes, the success rate of every tested model was 0%, showing that this inductive strategy reliably steers AI into the pre-set wrong choice. Unlike traditional CAPTCHA challenges, IllusionCAPTCHA cleverly combines visual illusions with linguistic hints to induce erroneous inferences in LLMs.

Analysis of IllusionCAPTCHA User Pass Rate

The pass-rate analysis shows that IllusionCAPTCHA maintains a good user experience while ensuring high security: 86.95% of users passed the CAPTCHA on their first attempt, and a further 8.69% passed on the second. Most human users, in other words, can readily spot the illusions and choose correctly. Compared with traditional CAPTCHAs, IllusionCAPTCHA offers a far more forgiving experience.

[Figure: a sample IllusionCAPTCHA test, followed by GPT's answer to it]

Ding Ziqi, the first author of IllusionCAPTCHA, is a first-year master's student at UNSW Sydney.

References:

[1] "CAPTCHA: Using hard AI problems for security." Advances in Cryptology—EUROCRYPT 2003: International Conference on the Theory and Applications of Cryptographic Techniques, Warsaw, Poland, May 4–8, 2003 Proceedings 22. Springer Berlin Heidelberg, 2003.

[2] Gossweiler, Rich, Maryam Kamvar, and Shumeet Baluja. "What's up CAPTCHA? A CAPTCHA based on image orientation." Proceedings of the 18th international conference on World wide web. 2009.

[3] Matthews, Peter, Andrew Mantel, and Cliff C. Zou. "Scene tagging: image-based CAPTCHA using image composition and object relationships." Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security. 2010.

[4] Gao, Yipeng, et al. "Research on the security of visual reasoning CAPTCHA." 30th USENIX Security Symposium (USENIX Security 21). 2021.

[5] Achiam, Josh, et al. "GPT-4 technical report." arXiv preprint arXiv:2303.08774 (2023).

[6] Gemini Team, et al. "Gemini: a family of highly capable multimodal models." arXiv preprint arXiv:2312.11805 (2023).

[7] https://huggingface.co/spaces/AP123/IllusionDiffusion

This article is from the WeChat public account "New Intelligence", edited by LRST, and published on 36Kr with authorization.
