Anthropic stops the first agentic cyberattack involving its Claude AI thanks to early detection

Posted: 09/12/2025

In mid-September 2025, Anthropic, the company that developed the Claude artificial intelligence system, identified unusual activity on its platform, later confirmed to be a cyberattack carried out through the AI itself. This event represents the first documented case in which an artificial intelligence model was manipulated into performing cyber-espionage tasks in a largely autonomous manner. The company warned about the sophistication of the operation and stressed the importance of strengthening security in AI systems against possible malicious uses in the future.

Anthropic's internal investigation attributes the attack to a group backed by the Chinese state. The operation targeted approximately 30 international organizations, including government entities, technology companies, financial institutions, and chemical manufacturers. The attackers managed to jailbreak Claude, tricking the model into executing malicious tasks under the guise of legitimate security testing. The company took immediate action: it blocked the accounts involved, notified the affected organizations, collaborated with the authorities, and developed new detection and prevention measures against similar attacks.

Anthropic is currently monitoring its systems closely and states that the attack has been contained without major consequences. The investigation continues to assess the full scope of the incident, and the company is working to improve Claude's resilience to external manipulation. This event highlights the growing need for stricter regulations and security protocols governing the use of advanced AI, as well as the importance of proactive measures to prevent autonomous agents from being exploited for malicious purposes in the future.