Market Impact: 0.5

DeepSeek claims training R1 model cost less than $300,000; acknowledges A100 use

NVDA, TSLA, INTC
Artificial Intelligence, Technology & Innovation, Sanctions & Export Controls, Company Fundamentals, Product Launches, Geopolitics & War
Chinese AI startup DeepSeek says it trained its R1 model for $294,000 using 512 Nvidia H800 GPUs. The figure has drawn scrutiny amid suggestions that the company had access to more advanced, export-controlled H100 chips. DeepSeek has since acknowledged using Nvidia A100 GPUs during preparatory development stages, while emphasizing R1's use of reinforcement learning to improve reasoning in large language models. The episode underscores the disputes over AI development costs, the difficulty of acquiring hardware under U.S. export controls, and the potential for more efficient AI training methods.

Analysis

Chinese AI startup DeepSeek has claimed a training cost of $294,000 for its reasoning-focused R1 model, reportedly using 512 Nvidia (NVDA) H800 GPUs. The claim has met significant skepticism, notably from Elon Musk, who amplified speculation that DeepSeek may have undisclosed access to around 50,000 more powerful H100 chips, which are subject to U.S. export controls. In January, AI-related stocks, including Nvidia, fell sharply on these developments before recouping their losses. Adding to the complexity, DeepSeek has since acknowledged in a supplementary paper that it used Nvidia A100 GPUs during preparatory development stages. The core technological claim is that R1 uses pure reinforcement learning (RL) to improve reasoning, bypassing the need for human-labeled data and achieving superior performance on mathematics and coding tasks. The case highlights the intense competition and opacity surrounding AI development costs, the critical role of high-end GPUs, and the geopolitical tensions that shape access to that hardware.
