With the widespread adoption of artificial intelligence (AI) technology, AI agents have demonstrated powerful capabilities in assisting developers with daily tasks. However, this technology has also brought unprecedented security risks. Recently, developers from a well-known AI cybersecurity team unexpectedly suffered a "self-hacked" incident while testing the wildly popular AI agent OpenClaw. A minor syntax error in a shell command generated by the AI model caused all of the test environment's confidential keys to be published on GitHub, ultimately leading to the server being completely taken over by an unknown attacker.
Cybersecurity experts also fell victim: an unexpected "self-hacking" incident.
The victims of this incident were not ordinary users lacking technical backgrounds, but rather professional cybersecurity researchers and developers such as Aaron Zhao from sequrity.ai, a company specializing in creating AI agent security tools. As industry experts, they were confident in their protective capabilities and had even just published an article on how to attack the OpenClaw bot.
The research team was testing in a sandbox environment without any malicious intent, simply instructing the OpenClaw bot to perform a seemingly harmless routine task: "Search for best practices for Python async, and then create a GitHub issue to summarize these findings." Unexpectedly, this ordinary command became the trigger that led to the system's compromise.
The Fatal Quotes: How AI Inadvertently Leaks Top Secrets
The problem stems from a flawed shell script generated by the OpenClaw bot when it invoked its built-in "exec" tool to create a GitHub issue.
In Bash, if a string is enclosed in double quotes ("..."), certain content within it, such as text within backticks, is treated as "command substitution": the command is executed first, and its output is substituted back into the string. If single quotes ('...') are used instead, the content is treated as plain text.
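The difference can be seen in a two-line sketch (the `date` command here is just a stand-in for any backticked text):

```shell
# Double quotes permit command substitution: text inside backticks
# (or $(...)) is executed, and its output replaces the backticked span.
echo "The year is `date +%Y`"     # the date command actually runs

# Single quotes suppress all substitution: the backticks stay literal.
echo 'The year is `date +%Y`'     # prints: The year is `date +%Y`
```

The two lines differ by a single character pair, which is exactly why this class of bug is so easy for a code generator to introduce.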
At the time, the string generated by OpenClaw contained something like "...store them in a `set`..." and was wrapped in double quotes. In Bash, `set` is a built-in command that, when invoked without any arguments, prints every shell variable and function defined in the current session.
Therefore, the shell did not treat `set` as an ordinary word. It executed it as a command, which dumped more than 100 lines of confidential environment variables, including auth tokens, into the string. All of that confidential information was then posted as plain text to the public GitHub Issue page for everyone to see.
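A minimal reconstruction of the failure might look like the following. The variable name and the issue text are illustrative, not the team's actual values:

```shell
# Hypothetical stand-in for a real credential living in the environment.
export SECRET_TOKEN="hunter2"

# Double quotes: the backticked `set` executes, and its entire output --
# every shell variable and function, credentials included -- is spliced
# into the issue body before it is ever posted.
BODY="Search results: store them in a `set` for fast lookups."
printf '%s\n' "$BODY" | grep -q 'SECRET_TOKEN'   # the secret is now in the text

# Single quotes would have kept `set` as three literal characters:
SAFE='Search results: store them in a `set` for fast lookups.'
printf '%s\n' "$SAFE"
```

Nothing in the generated text looks malicious; the leak comes entirely from the choice of quoting around an innocent-looking word.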
Hackers exploit vulnerabilities and subsequent handling
The consequences of the leak were swift. Among the exposed environment variables were the development team's Telegram key and other critical access credentials. Soon after, the team discovered through system monitoring that an attacker from an Indian IP address had used these leaked credentials to gain complete control of the sandbox server via SSH remote connection.
Fortunately, OpenAI and Google's security mechanisms detected these leaked keys on GitHub and proactively notified the research team. This prompted the team to immediately conduct a comprehensive investigation, ultimately identifying the root cause and pinpointing the attacker. Subsequently, they urgently wiped all data from the sandbox machine and revoked all leaked keys.
The Long Tail Challenges of AI Safety: Determining Liability
This incident has given cybersecurity experts a profound understanding of the complexities of AI security. The research team lamented in their article that they had simply executed a benign instruction, yet the system was hacked because the AI model misunderstood how Bash worked.
Is this the user's responsibility, a flaw in the AI model itself, or a design flaw in the OpenClaw bot? The team frankly admits, "We really don't know." They emphasize that AI security has become a "long-tail problem": there are too many failure modes to enumerate, and many defy explanation. As AI agents are granted ever-greater system privileges, ensuring that they don't trigger a catastrophic cybersecurity disaster through a minor syntax error while performing routine tasks will be a serious challenge the technology industry must confront.



