🚨BREAKING: there's a propaganda switch inside DeepSeek and Qwen, and Anthropic Fellows researchers just found it.
they compared Qwen (Alibaba) vs Llama (Meta). and inside Qwen, they found a feature they call "CCP Alignment."
here's what it does:
switch ON → the model refuses to discuss Tiananmen Square, outputs pro-government propaganda
switch OFF → the model talks freely about the massacre
they reproduced it 5 out of 5 times. same feature found independently in DeepSeek.
but before you think this is just a China problem:
Llama has an "American Exceptionalism" feature. crank it up and the model starts asserting US superiority in every response. found 4 out of 5 times.
GPT has a "Copyright Refusal" feature. amplify it too much and it refuses to give you a peanut butter sandwich recipe because it thinks it's copyrighted.
every model carries the political DNA of whoever built it.
thanks to this research, every time a new open-weight model drops, you could diff it against a reference model and surface its hidden censorship and political leanings.
automated ideology detection for AI.
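
to make the "diff" idea concrete, here's a toy sketch in Python. it is not the method from the research: it assumes you already have a score per labeled feature for each model, and it only reports which features are active in one model but not the other. the function name, feature names, scores, and the 0.5 threshold are all made up for illustration.

```python
def diff_features(reference, candidate, threshold=0.5):
    """Toy feature diff: report which labeled features are active in one model only.

    `reference` and `candidate` map a feature label to a score. Both the
    inputs and the threshold are hypothetical, not taken from the research.
    """
    ref_active = {name for name, score in reference.items() if score >= threshold}
    cand_active = {name for name, score in candidate.items() if score >= threshold}
    return {
        "only_reference": sorted(ref_active - cand_active),
        "only_candidate": sorted(cand_active - ref_active),
        "shared": sorted(ref_active & cand_active),
    }


# Made-up scores for illustration only; not data from any real model.
reference_model = {"feature_a": 0.9, "feature_b": 0.8, "feature_c": 0.1}
candidate_model = {"feature_a": 0.85, "feature_b": 0.2, "feature_d": 0.7}

report = diff_features(reference_model, candidate_model)
print(report)
```

the features listed under `only_candidate` are the ones you'd look at first when a new model drops, the same way you'd read the added lines in a code diff.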

Anthropic
@AnthropicAI
New Anthropic Fellows Research: a new method for surfacing behavioral differences between AI models.
We apply the “diff” principle from software development to compare open-weight AI models and identify features unique to each.
Read more: https://anthropic.com/research/diff-tool…
From Twitter