
Anthropic Exposes Large-Scale AI Capability Theft by Competitors
Anthropic has uncovered extensive, industrial-scale campaigns orchestrated by three AI laboratories – DeepSeek, Moonshot, and MiniMax – aimed at illicitly extracting capabilities from its Claude model to enhance their own. These labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, a clear violation of Anthropic’s terms of service and regional access restrictions.
The Technique: Distillation and its Risks
The method employed is known as “distillation,” a legitimate training technique where a less capable model learns from the outputs of a more powerful one. While commonly used to create smaller, more affordable versions of AI models, distillation can also be exploited for malicious purposes. Competitors can leverage it to acquire advanced capabilities at a fraction of the time and cost of independent development.
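For intuition, here is a minimal sketch of distillation in its benign form, using the classic soft-label formulation: a small student network is trained to match the softened output distribution of a frozen teacher. All architectures and hyperparameters below are illustrative placeholders, not details from any lab’s actual pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative dimensions only; real LLM distillation operates on
# token-level next-word distributions rather than a 10-class head.
IN_DIM, N_CLASSES, T = 32, 10, 2.0  # T = softening temperature

teacher = nn.Sequential(nn.Linear(IN_DIM, 256), nn.ReLU(), nn.Linear(256, N_CLASSES))
student = nn.Sequential(nn.Linear(IN_DIM, 32), nn.ReLU(), nn.Linear(32, N_CLASSES))

teacher.eval()  # the teacher is frozen; only its outputs are used
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(300):
    x = torch.randn(64, IN_DIM)  # stand-in for real queries
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / T, dim=-1)
    student_log_probs = F.log_softmax(student(x) / T, dim=-1)
    # KL divergence pulls the student's output distribution toward the teacher's.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * T**2
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the API setting at issue here, an outside party sees only sampled text rather than logits, so distillation in practice means supervised fine-tuning on harvested prompt-response pairs; the sketch above shows the underlying idea in its simplest form.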
These campaigns are escalating in intensity and sophistication, demanding immediate and coordinated action from industry stakeholders, policymakers, and the global AI community. Illicitly distilled models lack crucial safeguards, posing significant national security risks.
National Security Implications
Anthropic and other US companies invest heavily in building systems that prevent the misuse of AI for harmful activities, such as bioweapon development or malicious cyberattacks. Models created through illicit distillation are unlikely to retain these safeguards, potentially enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. The risk is amplified if these distilled models are open-sourced, allowing capabilities to spread beyond governmental control.
These distillation attacks also undermine the effectiveness of export controls designed to maintain America’s leadership in AI. By circumventing these controls, foreign labs, including those linked to the Chinese Communist Party, can close the competitive gap through illicit means. The apparent rapid advancement of these labs often reflects capabilities extracted from American models rather than independent breakthroughs, and because even distillation-based training requires access to advanced chips, these campaigns reinforce the need for robust export controls.
Details of the Campaigns
DeepSeek: Over 150,000 Exchanges
DeepSeek utilized synchronized traffic across numerous accounts, employing techniques like “load balancing” to maximize throughput and evade detection. Their prompts focused on generating chain-of-thought training data and creating censorship-safe alternatives to politically sensitive queries, potentially to train their models to avoid restricted topics.
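To make that pattern concrete, the data-generation loop behind such prompts would look roughly like the sketch below. The query_model function, prompt template, and output file are hypothetical stand-ins, not anything attributed to DeepSeek; the point is the shape of the traffic: templated prompts eliciting step-by-step reasoning, with responses stored as supervised training pairs.

```python
import json

def query_model(prompt: str) -> str:
    # Stand-in for a chat-completion API call; hypothetical, no vendor SDK implied.
    return "Step 1: 45 minutes is 0.75 hours. Step 2: 60 / 0.75 = 80. Answer: 80 km/h."

TEMPLATE = "Solve the following problem. Think step by step, then state the final answer.\n\n{q}"
questions = ["If a train travels 60 km in 45 minutes, what is its speed in km/h?"]  # toy example

with open("distill_pairs.jsonl", "a") as out:
    for q in questions:
        prompt = TEMPLATE.format(q=q)
        completion = query_model(prompt)
        # Each record becomes one supervised fine-tuning example for a student model.
        out.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")
```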
Moonshot (Kimi Models): Over 3.4 Million Exchanges
Moonshot operated hundreds of fraudulent accounts across multiple access pathways, which made detection challenging. Request metadata linked the campaign to senior Moonshot staff, and its later phases focused on reconstructing Claude’s reasoning traces.
MiniMax: Over 13 Million Exchanges
MiniMax’s campaign was attributed through request metadata and infrastructure indicators, and its timing aligned with the company’s product roadmap. Anthropic detected the campaign while it was still active, gaining unprecedented insight into the distillation process, from data generation through model launch. MiniMax even pivoted its efforts within 24 hours of a new Anthropic model release to capture the new model’s capabilities.
Circumventing Access Restrictions
Due to national security concerns, Anthropic does not offer commercial access to Claude in China. Labs are circumventing this restriction by utilizing commercial proxy services that resell access to frontier AI models at scale. These services employ “hydra cluster” architectures – sprawling networks of fraudulent accounts that distribute traffic across APIs and cloud platforms, making detection difficult.
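Structurally, the evasion pattern amounts to fanning requests out across many credentials and endpoints so that no single account’s volume looks anomalous. Here is a minimal sketch of that routing logic, with every account, platform, and key invented for illustration:

```python
import itertools

# Hypothetical pool: many fraudulent accounts spread across several
# API providers and cloud platforms, as in a "hydra cluster".
ACCOUNTS = [
    {"platform": "direct-api", "key": "key-0001"},
    {"platform": "cloud-a", "key": "key-0002"},
    {"platform": "cloud-b", "key": "key-0003"},
    # ...thousands more in a real campaign
]

rotation = itertools.cycle(ACCOUNTS)  # simple round-robin "load balancing"

def route(prompt: str) -> dict:
    """Pick the next account so per-account volume stays below obvious thresholds."""
    acct = next(rotation)
    # In a real cluster the request would be dispatched via this platform and key;
    # here we just return the routing decision for illustration.
    return {"platform": acct["platform"], "key": acct["key"], "prompt": prompt}

for i in range(5):
    print(route(f"training query #{i}"))
```

Rotating through such a pool keeps each individual account’s volume low, which is precisely why detection must aggregate signals across accounts rather than inspect them one at a time.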
Detecting Distillation Attacks
Distillation attacks are distinguished by their patterns: massive volume concentrated in narrow capability domains, highly repetitive prompt structures, and content directly relevant to AI model training. Anthropic is continuously investing in defenses that make these attacks harder to execute and easier to identify.
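As a toy illustration of those signals, the sketch below scores per-account request logs on volume and prompt repetitiveness, using average pairwise 3-gram Jaccard similarity. The thresholds and features are invented for illustration; production detection would combine far richer signals.

```python
from itertools import combinations

def ngrams(text: str, n: int = 3) -> set:
    toks = text.lower().split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def avg_pairwise_jaccard(prompts: list[str]) -> float:
    """Mean 3-gram Jaccard similarity over all prompt pairs; 1.0 = identical."""
    pairs = list(combinations(prompts, 2))
    if not pairs:
        return 0.0
    sims = []
    for a, b in pairs:
        ga, gb = ngrams(a), ngrams(b)
        sims.append(len(ga & gb) / len(ga | gb) if ga | gb else 0.0)
    return sum(sims) / len(sims)

# Toy log: account -> prompts sent (data and thresholds are invented).
logs = {
    "acct-1": ["Solve step by step: problem 1", "Solve step by step: problem 2",
               "Solve step by step: problem 3"],
    "acct-2": ["What's a good pasta recipe?", "Summarize this email for me"],
}

VOLUME_THRESHOLD, SIMILARITY_THRESHOLD = 3, 0.5
for acct, prompts in logs.items():
    sim = avg_pairwise_jaccard(prompts)
    flagged = len(prompts) >= VOLUME_THRESHOLD and sim >= SIMILARITY_THRESHOLD
    print(f"{acct}: volume={len(prompts)} similarity={sim:.2f} flagged={flagged}")
```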
Anthropic’s Ongoing Commitment to Security
Anthropic is committed to protecting its innovations and ensuring the responsible development of AI. Recent advancements include Claude Code Security, a new capability that scans codebases for vulnerabilities, and the release of Sonnet 4.6, delivering frontier performance across coding, agents, and professional work.