DeepSeek V3.1 Rivals GPT-5 With 685B Parameter Model

In January 2025, DeepSeek, a Chinese AI startup, launched R1, an AI model that rivaled top-tier LLMs from OpenAI and Anthropic despite being built at a fraction of the cost and with fewer Nvidia chips. The company has now released V3.1, an update to its flagship V3 model, priced to undercut OpenAI and optimized for Chinese-made chips.

DeepSeek’s V3.1 was quietly launched via a message on WeChat, a prominent Chinese messaging and social application, and on the Hugging Face platform. This development underscores several key narratives in the current AI landscape. DeepSeek’s efforts are central to China’s ambition to develop and control advanced AI systems independently of foreign technology.

The new V3.1 model is specifically optimized to run effectively on Chinese-made chips, reflecting China’s strategic push toward technological self-reliance. While U.S. firms have been reluctant to adopt DeepSeek’s models, the models have gained considerable traction within China and are increasingly used in other regions globally. Some American companies have even integrated DeepSeek’s R1 reasoning model into their applications. Researchers caution, however, that outputs from these models often align closely with narratives approved by the Chinese Communist Party, raising concerns about their neutrality and reliability.

China’s AI ambitions extend beyond DeepSeek, with other notable models including Alibaba’s Qwen, Moonshot AI’s Kimi, and Baidu’s Ernie. DeepSeek’s release, coming closely after OpenAI’s GPT-5 launch, underscores China’s commitment to keeping pace with, or surpassing, leading U.S. AI laboratories. The rollout of GPT-5 fell short of industry expectations, further heightening the significance of DeepSeek’s advance.

OpenAI CEO Sam Altman acknowledged that competition from Chinese open-source models, DeepSeek included, influenced OpenAI’s decision to release its own open-weight models. During a recent discussion with reporters, Altman stated that if OpenAI had not taken this step, the AI landscape would likely be dominated by Chinese open-source models. He emphasized that this consideration was a significant factor in their decision-making process.

The U.S. government granted Nvidia and AMD licenses to export specific AI chips to China, including Nvidia’s H20. These licenses are conditional on the companies agreeing to remit 15% of the revenue from these sales to the U.S. government. In response, Beijing has moved to restrict purchases of Nvidia chips. This followed Commerce Secretary Howard Lutnick’s statement on CNBC that the U.S. does not sell China its best, second-best, or even third-best technology.

DeepSeek’s optimization for Chinese-made chips indicates a strategic move to counter U.S. export controls and lessen dependence on Nvidia. The company stated in its WeChat announcement that the new model format is optimized for “soon-to-be-released next-generation domestic chips.”

Altman has expressed concern that the U.S. may be underestimating the complexity and significance of China’s advancements in AI, cautioning that export controls alone might not be sufficient to address the challenges posed by China’s rapid progress.

The DeepSeek V3.1 model incorporates technical advances aimed primarily at developers, reducing operational costs and adding versatility compared with many closed, more expensive competing models. With 685 billion parameters, V3.1 ranks among the top “frontier” models. Its “mixture-of-experts” design activates only a fraction of the model for each query, lowering computing costs for developers, as the sketch below illustrates. Unlike earlier DeepSeek models, which separated tasks requiring instant answers from those needing step-by-step reasoning, V3.1 integrates both capabilities into a single system.
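To make the routing idea concrete, here is a minimal, illustrative sketch of a mixture-of-experts layer in PyTorch. The layer sizes, expert count, and top-k value are assumptions chosen for readability, not DeepSeek’s actual configuration; the point is simply that a router selects a small subset of experts per token, so most of the layer’s parameters sit idle on any given query.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Illustrative mixture-of-experts layer (toy sizes, not DeepSeek's).

    A router scores every expert for each token, but only the top-k
    experts actually run, so most parameters stay idle per query.
    """

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (n_tokens, d_model)
        scores = self.router(x)                         # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)            # normalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(16, 64)   # a batch of 16 token embeddings
print(layer(tokens).shape)     # torch.Size([16, 64])
```

In a frontier-scale model the same principle applies with far more experts per layer, only a handful of which are activated for each token, which is what keeps per-query compute well below what the raw parameter count would suggest.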

GPT-5, along with recent models from Anthropic and Google, also features this integrated capability. However, few open-weight models have achieved this level of integration. Ben Dickson, founder of the TechTalks blog, describes V3.1’s hybrid architecture as “the biggest feature by far.”

William Falcon, founder and CEO of Lightning AI, said DeepSeek’s continued progress is noteworthy, even if V3.1 is not as significant a leap as the earlier R1 model, crediting the company with impressive “non-marginal improvements.” Falcon anticipates that OpenAI will respond if its own open-source model begins to lag significantly. He also pointed out that DeepSeek’s model is more challenging for developers to deploy into production than OpenAI’s version, which is comparatively easy to put into service.

DeepSeek’s release highlights the increasing perception of AI as a key component of a technological competition between the U.S. and China. The fact that Chinese companies are claiming to build superior AI models at a reduced cost provides U.S. competitors with reason to carefully evaluate their strategy for maintaining leadership in the field.

