What exactly is the AIGC test that keeps students up at night?

With the 2026 graduation season approaching, AIGC (AI-generated content) detection has become a new hurdle for graduating university students. Sichuan University requires that AI-generated content in humanities theses not exceed 20%, while Guangxi Normal University and other schools cap it at 40%. AIGC detection rests on two measures, "perplexity" and "burstiness," but because large AI models iterate so quickly, its accuracy is highly questionable. CCTV has specifically discussed the scientific validity of the technology, and even classics such as "Preface to the Pavilion of Prince Teng" have been misjudged as 100% AI-generated. The industry is now turning to digital watermarking to address the problem at its source, but that approach has significant limitations for text.

Article author and source: Sanyi Life

Once you understand the principles behind AIGC detection, you can find the right solution.

"Outside the long pavilion, along the ancient road, fragrant grasses stretch to the horizon. I ask you, when will you return? When you come back, do not hesitate." With another graduation season approaching, many graduating students in 2026 may face not only the sorrow of parting, but also the bewildering AIGC (AIGC Qualification Examination) test. This year, one topic has been repeatedly discussed in group chats across various universities: "How do I pass the AIGC test?"

If we had to pick the group embracing AI most eagerly right now, it might not be working people worried about being replaced by AI, but students still in university. There is even a joke that university classrooms are no longer a contest of who studies harder, but of who writes better AI prompts.

In the fourth year since the generative AI revolution began, a large number of universities have started to incorporate AIGC detection into the review process for graduation theses, and it is no longer a matter of "may be checked" but "will definitely be checked."

For example, Sichuan University requires that the proportion of AI-generated content not exceed 20% in humanities graduation theses and 15% in science, engineering, and medicine; Guangxi Normal University and Nanjing University of Aeronautics and Astronautics stipulate that the proportion of AIGC content should not exceed 40%. Furthermore, when a thesis fails the check, universities currently do not return it for revision but postpone the defense.

Currently, searching for "AIGC detection" on platforms such as Xiaohongshu, Weibo, and Douyin reveals numerous posts from recent graduates complaining about their high AIGC rates. This has also led to advertisements for unorthodox methods to reduce AIGC rates, tools for lowering AIGC rates, and skepticism towards AIGC detection.

One Xiaohongshu user shared their experience: VIP AIGC detected an AI rate of 48% in their paper, and no matter how they revised it, even with paid tools, the AI rate dropped by only a few percentage points. Then they had a flash of inspiration: without changing a single word, they replaced every comma in the text with a period in a single find-and-replace, and the AI rate immediately dropped to 11.51%.

Furthermore, someone submitted Zhu Ziqing's classic essay "Lotus Pond by Moonlight" to an AIGC detection tool, and it was surprisingly judged to be 62.88% AI-generated. Not only that, Liu Cixin's "The Wandering Earth" was also detected to have over 50% AI content, while the timeless masterpiece "Preface to the Pavilion of Prince Teng" was even marked as 100% AI.

The chaos surrounding AIGC detection has even prompted CCTV to discuss the scientific validity of this technology. So why is AIGC detection so chaotic, with each result seemingly random? Before answering this question, let's first look at how AIGC detection works. Currently, mainstream AIGC detection methods on the market are basically built on "perplexity" and "burstiness," which measure how predictable the wording is and how much it varies.

In the human cognitive world, text is the carrier of semantics and logic, but in the eyes of large AI models, the world is reduced to tokens. Large AI models based on the Transformer architecture essentially calculate a probability distribution over every possible next token given the context, then pick a high-probability token to continue the output. This process relies on statistical prediction rather than the AI truly understanding the meaning of the relevant words.
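The selection step described above can be sketched in a few lines. This is a toy illustration, not any real model's decoder: the candidate tokens and their scores are invented, and it always takes the single most probable token (greedy decoding), whereas deployed models often sample from the distribution instead.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy_next_token(logits):
    """Return the most probable token and the full distribution."""
    probs = softmax(logits)
    return max(probs, key=probs.get), probs

# Hypothetical scores for what might follow "The weather today is"
toy_logits = {"sunny": 4.0, "cloudy": 3.2, "purple": -1.0, "bicycle": -3.5}

token, probs = greedy_next_token(toy_logits)
print(token)
for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok}: {p:.3f}")
```

Nothing in this step checks whether "sunny" is true or meaningful; the choice is driven entirely by the scores.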

For example, if the AI receives the text "I want to eat Yu Xiang" (鱼香), it will assume that the next words are "Rou Si" (肉丝, shredded pork). Some smart AIs will even provide a home-style recipe for Yu Xiang Rou Si. Because AI tends to choose the most probable word when writing, its text shows low "perplexity" in the eyes of the models specifically trained for AIGC detection tools.
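Perplexity has a standard definition: the exponential of the average negative log-probability that a language model assigned to each token. A minimal sketch, assuming the per-token probabilities are already available; the numbers below are made up for illustration, and a real detector would obtain them from its own scoring model.

```python
import math

def perplexity(token_probs):
    """exp(mean negative log-probability) over the tokens of a text."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical probabilities a scoring model gave each token of two texts
predictable_text = [0.90, 0.80, 0.95, 0.85]  # model saw almost every word coming
surprising_text = [0.20, 0.05, 0.40, 0.10]   # model was often caught off guard

print(perplexity(predictable_text))
print(perplexity(surprising_text))
```

The predictable text scores close to 1, the lowest possible value, which is the pattern a detector treats as machine-like. It is also why a human author who writes in very conventional phrasing can be flagged.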

Burstiness refers to how much the structure of an article varies. An article with an overly perfect rhythm, overly regular logic, and overly standard word choice and sentence structure has low burstiness, which detectors read as a sign of AI. For example, AIGC detection gave "Preface to the Pavilion of Prince Teng" a 100% AI rate not because it is hailed as "the greatest parallel prose of all time," but because features such as its harmonious rhyme and strict parallelism make the piece look too perfect to the detector.
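A rough way to quantify this kind of regularity is the spread of sentence lengths. The sketch below is an assumption for illustration, not how any particular detector works: it splits text on sentence-ending punctuation, counts whitespace-separated words, and reports the coefficient of variation (standard deviation divided by mean). The function names and sample sentences are hypothetical.

```python
import re
import statistics

def sentence_lengths(text):
    """Split on sentence-ending punctuation; count words per sentence.

    Whitespace word counting suits English; for Chinese text one would
    count characters instead.
    """
    sentences = [s.strip() for s in re.split(r"[.!?。！？]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def length_variation(text):
    """Coefficient of variation of sentence lengths (0.0 = perfectly uniform)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)

uniform = ("The model reads the text. The model scores the text. "
           "The model flags the text.")
varied = ("It failed. Nobody on the committee could explain why the detector "
          "had flagged a paper written entirely by hand. Odd.")

print(length_variation(uniform))
print(length_variation(varied))
```

The uniform passage scores exactly zero because every sentence has the same length, while the mix of short and long sentences scores well above it. A statistic like this depends entirely on where sentence boundaries fall, which may help explain why swapping commas for periods can move a detector's score without changing a single word.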

In short, the underlying logic of AIGC detection is "guessing," that is, guessing how similar your article is to the output of the AI model. It's not that the developers of AIGC detection aren't working hard, but rather that the rapid iteration of AI models has made "AI detecting AI" a mere empty phrase.

In fact, the AI industry has yet to produce a truly reliable AI content inspection tool. Currently, the prevailing approach is AI digital watermarking technology, which involves adding an invisible watermark to the metadata of AI-generated images and videos, attempting to solve the problem at its source. To this end, companies like Microsoft, Google, Adobe, OpenAI, and Meta have joined the C2PA (Coalition for Content Provenance and Authenticity), and AI tools such as ChatGPT and Gemini have already integrated C2PA content credentials.

However, C2PA also has limitations. While it is highly effective in the fields of images and videos, it has shortcomings in the text field because the latter is too easily tampered with. The reality is that there is no 100% accurate AIGC detection tool on the market. These tools actually give the probability that the content has an AI style, not the probability that the content was directly created by AI.

The problem is that AIGC detection cannot provide accurate judgments, yet universities still require papers to pass it. So it's no wonder that graduating students are worried. Since schools only accept the results of AIGC detection, everyone still has to find ways to lower their AI rate, but spending money on dedicated AI-rate reduction tools is unnecessary.

Paid AI-powered rewriting tools, which rely on shuffling sentences and swapping words, are no longer effective against AIGC detection in 2026. Simply changing "important" to "key" or "therefore" to "so" will now be detected; only semantic reconstruction is effective. For "burstiness," alternating between short and long sentences changes the writing rhythm, and the density of transition words should be kept low.

For the more challenging "perplexity," there's a hidden trick: use more first-person perspectives and critical expressions. Currently, large AI models often cater to users to improve retention rates, so they tend to agree with the user rather than use critical expressions.

At the same time, to ensure the objectivity of the answers, or at least to give users the impression of objectivity, the AI will not actively use a first-person perspective when communicating with users, but will instead use a more detached third-person perspective. These characteristics provide a way to address "perplexity."

[Images in this article are from the internet]
