
Chinese translation by 中慧言-王芳 (Fortune China)
Chinese AI startup DeepSeek shocked the world in January with an AI model, called R1, that rivaled OpenAI’s and Anthropic’s top LLMs. It was built at a fraction of the cost of those other models, using far fewer Nvidia chips, and was released for free. Now, just two weeks after OpenAI debuted its latest model, GPT-5, DeepSeek is back with an update to its flagship V3 model that experts say matches GPT-5 on some benchmarks—and is strategically priced to undercut it.
DeepSeek’s new V3.1 model was quietly released in a message to one of its groups on WeChat, China’s all-in-one messaging and social app, as well as on the Hugging Face platform. Its debut touches several of today’s biggest AI narratives at once. DeepSeek is a core part of China’s broader push to develop, deploy, and control advanced AI systems without relying on foreign technology. (In fact, the new V3.1 model is specifically tuned to perform well on Chinese-made chips.)
While U.S. companies have been hesitant to embrace DeepSeek’s models, they’ve been widely adopted in China and increasingly in other parts of the world. Even some American firms have built applications on DeepSeek’s R1 reasoning model.
China’s AI push goes beyond DeepSeek: The country’s AI industry also includes models such as Alibaba’s Qwen, Moonshot AI’s Kimi, and Baidu’s Ernie. Still, the timing of DeepSeek’s new release, just after an OpenAI GPT-5 rollout that fell short of industry watchers’ high expectations, underscores Beijing’s determination to keep pace with, or even leapfrog, top U.S. labs.
OpenAI is concerned about China and DeepSeek
DeepSeek’s efforts are certainly keeping U.S. labs on their toes. At a recent dinner with reporters, OpenAI CEO Sam Altman said that rising competition from Chinese open-source models, including DeepSeek’s, influenced his company’s decision to release its own open-weight models two weeks ago.
“It was clear that if we didn’t do it, the world was gonna be mostly built on Chinese open-source models,” Altman said. “That was a factor in our decision, for sure. Wasn’t the only one, but that loomed large.”
In addition, last week the U.S. granted Nvidia and AMD licenses to export China-specific AI chips—including Nvidia’s H20—but only if they agree to hand over 15% of revenue from those sales to Washington. Beijing quickly pushed back, moving to restrict purchases of Nvidia chips after Commerce Secretary Howard Lutnick told CNBC on July 15: “We don’t sell them our best stuff, not our second-best stuff, not even our third-best.”
By optimizing its model for Chinese-made chips, DeepSeek is signaling resilience against U.S. export controls and a drive to reduce its reliance on Nvidia. In its WeChat post, the company noted that the new model format is optimized for “soon-to-be-released next-generation domestic chips.”
Altman, at that same dinner, warned that the U.S. may be underestimating the complexity and seriousness of China’s progress in AI—and said export controls alone likely aren’t a reliable solution.
Less of a leap, but still striking incremental advances
Technically, what makes the new DeepSeek model notable is how it was built, with a few advances that would be invisible to consumers. But for developers, these innovations make V3.1 cheaper to run and more versatile than many closed and more expensive rival models.
For instance, V3.1 is huge: 685 billion parameters, on the level of many top “frontier” models. But its “mixture-of-experts” design means only a fraction of the model activates when answering any query, keeping computing costs lower for developers. Earlier DeepSeek models also separated two kinds of tasks: those that could be answered instantly from the model’s pretraining and those that required step-by-step reasoning. V3.1 combines both fast answers and reasoning in one system.
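To make the mixture-of-experts idea concrete, the toy sketch below shows how a router can pick a few experts per query so that most of the model stays idle. It is illustrative only and is not DeepSeek’s implementation: the expert count, the number of active experts, the router scores, and the function names (`route_top_k`, `moe_forward`) are all assumptions made for this example, and real systems compute router scores from the input with learned weights.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and blend their outputs."""
    selected = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    return output, [i for i, _ in selected]


# Hypothetical toy setup: 8 experts, 2 active per query.
NUM_EXPERTS = 8
ACTIVE = 2

# Each "expert" is a stand-in function; a real expert is a neural sub-network.
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Fixed scores for illustration; a real router derives these from the input.
router_scores = [0.1, 2.0, -0.5, 0.3, 1.5, -1.0, 0.0, 0.7]

output, used = moe_forward(3.0, experts, router_scores, ACTIVE)
active_fraction = ACTIVE / NUM_EXPERTS
print(f"experts used: {used}, active fraction: {active_fraction:.0%}")
```

In this toy setup only two of the eight experts run for the query, which is the source of the compute savings the article describes: the total parameter count can be very large while the work done per query stays a fraction of it.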
GPT-5 and the most recent models from Anthropic and Google have a similar ability. But few open-weight models have been able to do this so far. V3.1’s hybrid architecture is “the biggest feature by far,” Ben Dickson, a tech analyst and founder of the TechTalks blog, told Fortune.
Others point out that while this DeepSeek model is less of a leap than the company’s R1, the reasoning model that shocked the world in January and was derived from the original V3, the new V3.1 is still striking. “It is pretty impressive that they continue making non-marginal improvements,” said William Falcon, founder and CEO of AI developer platform Lightning AI. But he added that he would expect OpenAI to respond if its own open-source model “starts to meaningfully lag,” and pointed out that the DeepSeek model is harder for developers to get into production, while OpenAI’s version is fairly easy to deploy.
For all the technical details, though, DeepSeek’s latest release highlights the fact that AI is increasingly seen as part of a simmering technological cold war between the U.S. and China. With that in mind, if Chinese companies can build better AI models for what they claim is a fraction of the cost, U.S. competitors have reason to worry about staying ahead.