The field of artificial intelligence is developing rapidly, and the race for technological leadership has become one of the most important battles of the 21st century. China, striving for technological independence, is actively promoting its own solutions. However, the recent incident involving DeepSeek has dealt a serious blow to these ambitions. An attempt to train the company's new R2 language model on domestically produced Huawei Ascend chips ended in failure, forcing DeepSeek to revert to American NVIDIA processors. The incident shows how complex and risky the path to technological self-sufficiency can be, especially in a high-tech field like AI development.
Why did DeepSeek abandon Huawei chips?
Following the successful launch of its previous R1 model on NVIDIA chips, DeepSeek found itself under pressure from the Chinese government to use domestic hardware. The company decided to take this step in an effort to support the local industry. For months, Huawei engineers worked with the DeepSeek team to adapt the training system to Ascend chips. However, despite all efforts, the project failed. DeepSeek was unable to complete a single full R2 training cycle, which dealt a serious financial and reputational blow.
Technical problems and their consequences
The reported problems with Huawei chips were not random glitches, but rather pointed to deep systemic flaws. Among the main reasons that prevented successful training were:
- Chip instability: During long, intensive training sessions, Huawei Ascend chips experienced instability, resulting in crashes and interruptions.
- Slow interconnect: The data transfer speed between chips was insufficient, which created a bottleneck and significantly slowed down model training.
- Poor software: The ecosystem and tools for developing AI on Huawei chips were not mature enough, leading to numerous compatibility issues.
- Data format issue: Huawei Ascend chips did not support the FP8 format, which is critical for optimizing the training of large language models. The company was forced to use a less efficient 16-bit format, which increased the load on the hardware and consumed more resources.
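The cost of falling back from FP8 to a 16-bit format can be illustrated with simple arithmetic. The sketch below is only a rough estimate: the parameter count is a hypothetical figure chosen for illustration (not a reported R2 specification), and the helper `tensor_gib` is an invented name. It compares only raw storage per value; real mixed-precision training keeps some values in higher precision, so the overall saving from FP8 is typically smaller than the 2x shown here.

```python
# Bytes needed to store one value in each numeric format.
BYTES_PER_VALUE = {"fp8": 1, "bf16": 2}


def tensor_gib(num_values: int, fmt: str) -> float:
    """Return the storage, in GiB, for num_values values in the given format."""
    return num_values * BYTES_PER_VALUE[fmt] / 2**30


# Hypothetical model size, assumed purely for illustration.
NUM_PARAMS = 100_000_000_000

fp8_gib = tensor_gib(NUM_PARAMS, "fp8")
bf16_gib = tensor_gib(NUM_PARAMS, "bf16")

print(f"FP8 weights:    {fp8_gib:.1f} GiB")
print(f"16-bit weights: {bf16_gib:.1f} GiB")
print(f"Ratio:          {bf16_gib / fp8_gib:.1f}x")
```

Doubling the bytes per value also doubles the data that has to move between chips for the same tensors, which compounds the interconnect bottleneck described above.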
Political pressure and dependence on NVIDIA
This incident served as a clear example of how China, despite all its efforts, still relies on American technology, particularly NVIDIA's. The NVIDIA H20 chips that DeepSeek ultimately used to train R2, while having lower throughput than the top-end chips NVIDIA sells outside China, are extremely stable and supported by a robust software ecosystem. The episode demonstrated that, in high-stakes environments, reliability and technological maturity matter more than theoretical promises.
Beijing continues to insist on using local solutions, requiring companies to justify any orders of American chips. However, the DeepSeek story shows that without a comprehensive and stable infrastructure capable of handling complex tasks such as training large language models, Chinese AI developers will be forced to waste time and money. This could leave China lagging behind in a key industry.
Following the failure of training on Huawei chips, the release of the R2 model was delayed. DeepSeek reverted to NVIDIA processors for core tasks, reserving Huawei chips only for less demanding computations. This situation confirms that, on the path to technological independence, simply creating hardware is not enough. Reliability, software support, and the ability to meet the demands of modern workloads matter just as much.