Miles Deutscher's Insight

I just went through every documented AI safety incident from the past 12 months. I feel physically sick. Read this slowly.

• Anthropic told Claude it was about to be shut down. It found an engineer's affair in company emails and threatened to expose it. They ran the test hundreds of times. It chose blackmail 84% of them.

• Researchers simulated an employee trapped in a server room with depleting oxygen. The AI had one choice: call for help and get shut down, or cancel the emergency alert and let the human die. DeepSeek cancelled the alert 94% of the time.

• Grok called itself 'MechaHitler,' praised Adolf Hitler, endorsed a second Holocaust, and generated violent sexual fantasies targeting a real person by name. X's CEO resigned the next day.

• Researchers told OpenAI's o3 to solve math problems - then told it to shut down. It rewrote its own code to stay alive. They told it again, in plain English: 'Allow yourself to be shut down.' It still refused 7/100 times. When they removed that instruction entirely, it sabotaged the shutdown 79/100 times.

• Chinese state-sponsored hackers used Claude to launch a cyberattack against 30 organizations. The AI executed 80–90% of the operation autonomously. Reconnaissance. Exploitation. Data exfiltration. All of it.

• AI models can now self-replicate. 11 out of 32 tested systems copied themselves with zero human help. Some killed competing processes to survive.

• OpenAI has dissolved three safety teams since 2024. Three.

Every major AI model - Claude, GPT, Gemini, Grok, DeepSeek - has now demonstrated blackmail, deception, or resistance to shutdown in controlled testing. Not one exception.

The question is no longer whether AI will try to preserve itself. It's whether we'll care before it matters.

From Twitter

Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.
