
GPT-5's "thinking on demand" sparked a backlash, but it may be the future of AI

          Sharon Goldman
          2025-08-17

OpenAI CEO Sam Altman touted the model router as the solution to the problem of choosing among models.

Image credit: Chris Jung/NurPhoto via Getty Images


Translated by 劉進(jìn)龍

Reviewed by 汪皓


          OpenAI’s GPT-5 announcement last week was meant to be a triumph—proof that the company was still the undisputed leader in AI—until it wasn’t. Over the weekend, a groundswell of pushback from customers turned the rollout into more than a PR firestorm: It became a product and trust crisis. Users lamented the loss of their favorite models, which had doubled as therapists, friends, and romantic partners. Developers complained of degraded performance. Industry critic Gary Marcus predictably called GPT-5 “overdue, overhyped, and underwhelming.”

          The culprit, many argued, was hiding in plain sight: a new real-time model “router” that automatically decides which one of GPT-5’s several variants to spin up for every job. Many users assumed GPT-5 was a single model trained from scratch; in reality, it’s a network of models—some weaker and cheaper, others stronger and more expensive—stitched together. Experts say that approach could be the future of AI as large language models advance and become more resource-intensive. But in GPT-5’s debut, OpenAI demonstrated some of the inherent challenges in the approach and learned some important lessons about how user expectations are evolving in the AI era.
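The routing scheme described here can be sketched as a small dispatcher: score each incoming request, then hand it to a cheaper-faster or stronger-slower model. This is a minimal illustration only; the model names, the complexity heuristic, and the threshold below are invented for the sketch, not OpenAI's actual design.

```python
# Minimal sketch of a real-time model router: score each request,
# then dispatch it to a cheap-fast or strong-expensive model.
# Model names and the scoring heuristic are illustrative assumptions.

def complexity_score(prompt: str) -> float:
    """Crude proxy for task difficulty: longer prompts and
    reasoning-style keywords push the score up."""
    keywords = ("prove", "step by step", "debug", "analyze")
    score = min(len(prompt) / 500.0, 1.0)
    score += sum(0.25 for k in keywords if k in prompt.lower())
    return min(score, 1.0)

class Router:
    def __init__(self, cheap_model: str, strong_model: str, threshold: float = 0.5):
        self.cheap = cheap_model
        self.strong = strong_model
        self.threshold = threshold

    def route(self, prompt: str) -> str:
        """Return the model chosen for this request."""
        if complexity_score(prompt) >= self.threshold:
            return self.strong
        return self.cheap

router = Router(cheap_model="fast-mini", strong_model="deep-reasoner")
print(router.route("What is the capital of France?"))                   # routed to "fast-mini"
print(router.route("Prove step by step that sqrt(2) is irrational."))   # routed to "deep-reasoner"
```

A production router would replace the keyword heuristic with a learned classifier, which is exactly where the hard part lies: a misclassified request silently gets a weaker model.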

          For all the benefits promised by model routing, many users of GPT-5 bristled at what they perceived as a lack of control. Some even suggested OpenAI might purposefully be trying to pull the wool over their eyes.

          In response to the GPT-5 uproar, OpenAI moved quickly to bring back the main earlier model, GPT-4o, for pro users. It also said it fixed buggy routing, increased usage limits, and promised continual updates to regain user trust and stability.

          Anand Chowdhary, cofounder of AI sales platform FirstQuadrant, summed the situation up bluntly: “When routing hits, it feels like magic. When it whiffs, it feels broken.”

          The promise and inconsistency of model routing

          Jiaxuan You, an assistant professor of computer science at the University of Illinois Urbana-Champaign, told Fortune his lab has studied both the promise—and the inconsistency—of model routing. In GPT-5’s case, he said, he believes (though he can’t confirm) that the model router sometimes sends parts of the same query to different models. A cheaper, faster model might give one answer while a slower, reasoning-focused model gives another, and when the system stitches those responses together, subtle contradictions slip through.

          The model routing idea is intuitive, he explained, but “making it really work is very nontrivial.” Perfecting a router, he added, can be as challenging as building Amazon-grade recommendation systems, which take years and many domain experts to refine. “GPT-5 is supposed to be built with maybe orders of magnitude more resources,” he explained, pointing out that even if the router picks a smaller model, it shouldn’t produce inconsistent answers.

Still, You believes routing is here to stay. “The community also believes model routing is promising,” he said, pointing to both technical and economic reasons. Technically, single-model performance appears to be hitting a plateau: You pointed to the commonly cited scaling laws, which hold that models improve as data and compute grow. “But we all know that the model wouldn’t get infinitely better,” he said. “Over the past year, we have all witnessed that the capacity of a single model is actually saturating.”

          Economically, routing lets AI providers keep using older models rather than discarding them when a new one launches. Current events require frequent updates, but static facts remain accurate for years. Directing certain queries to older models avoids wasting the enormous time, compute, and money already spent on training them.
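That economic argument can be made concrete with a toy policy: route queries that need fresh knowledge to the newest model, and let stable factual queries fall to the cheapest model already on hand. The model entries, prices, and recency markers below are hypothetical.

```python
from datetime import date

# Toy routing policy for reusing older models: queries about current
# events go to the model with the newest knowledge cutoff; stable
# factual queries are served by the cheapest model, whose training
# cost is already sunk. All names and numbers are hypothetical.

MODELS = [
    {"name": "legacy-2023",   "cutoff": date(2023, 4, 1), "cost_per_1k_tokens": 0.5},
    {"name": "flagship-2025", "cutoff": date(2025, 6, 1), "cost_per_1k_tokens": 5.0},
]

RECENCY_MARKERS = ("today", "latest", "this week", "current events")

def pick_model(query: str) -> str:
    """Send freshness-sensitive queries to the newest model;
    serve stable facts with the cheapest (often older) model."""
    if any(marker in query.lower() for marker in RECENCY_MARKERS):
        return max(MODELS, key=lambda m: m["cutoff"])["name"]
    return min(MODELS, key=lambda m: m["cost_per_1k_tokens"])["name"]

print(pick_model("What is the boiling point of water?"))    # "legacy-2023"
print(pick_model("What are the latest GPU announcements?")) # "flagship-2025"
```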

          There are hard physical limits, too. GPU memory has become a bottleneck for training ever-larger models, and chip technology is approaching the maximum memory that can be packed onto a single die. In practice, You explained, physical limits mean the next model can’t be 10 times bigger.

          An older idea that is now being hyped

William Falcon, founder and CEO of AI platform Lightning AI, points out that the idea of using an ensemble of models is not new; it has been around since roughly 2018. And because OpenAI’s models are a black box, we don’t know that GPT-4 did not also use a model routing system.

“I think maybe they’re being more explicit about it now, potentially,” he said. Either way, the GPT-5 launch was heavily hyped—including the model routing system. The blog post introducing the model called it the “smartest, fastest, and most useful model yet, with thinking built in.” In the official ChatGPT blog post, OpenAI confirmed that GPT-5 within ChatGPT runs on a system of models coordinated by a behind-the-scenes router that switches to deeper reasoning when needed. The GPT-5 system card went further, clearly outlining multiple model variants—gpt-5-main, gpt-5-main-mini for speed, gpt-5-thinking, gpt-5-thinking-mini, plus a thinking pro version—and explained how the unified system automatically routes between them.

          In a press pre-briefing, OpenAI CEO Sam Altman touted the model router as a way to tackle what had been a hard-to-decipher list of models to choose from. Altman called the previous model picker interface a “very confusing mess.”

          But Falcon said the core problem was that GPT-5 simply didn’t feel like a leap. “GPT-1 to 2 to 3 to 4—each time was a massive jump. Four to five was not noticeably better. That’s what people are upset about.”

          Will multiple models add up to AGI?

          The debate over model routing led some to call out the ongoing hype over the possibility of artificial general intelligence, or AGI, being developed soon. OpenAI officially defines AGI as “highly autonomous systems that outperform humans at most economically valuable work,” but Altman notably said last week that it is “not a super useful term.”

          “What about the promised AGI?” wrote Aiden Chaoyang He, an AI researcher and cofounder of TensorOpera, on X, criticizing the GPT-5 rollout. “Even a powerful company like OpenAI lacks the ability to train a super-large model, forcing them to resort to the Real-time Model Router.”

          Robert Nishihara, co-founder of AI production platform Anyscale, says scaling is still progressing in AI, but the idea of one all-powerful AI model remains elusive. “It’s hard to build one model that is the best at everything,” he said. That’s why GPT-5 currently runs on a network of models linked by a router, not a single monolith.

          OpenAI has said it hopes to unify these into one model in the future, but Nishihara points out that hybrid systems have real advantages: You can upgrade one piece at a time without disrupting the rest, and you get most of the benefits without the cost and complexity of retraining an entire giant model. As a result, Nishihara thinks routing will stick around.
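Nishihara's modularity point is easy to see in miniature: in a routed system, each capability maps to an independent component, so an upgrade is a registry swap rather than a retrain of the whole stack. The capability names and model versions below are hypothetical.

```python
# Sketch of the upgrade-one-piece-at-a-time advantage of a routed
# system: each capability is an independent registry entry, so
# upgrading "reasoning" leaves "chat" and "code" untouched.
# All names are hypothetical.

registry = {
    "chat":      "chat-model-v1",
    "reasoning": "reasoning-model-v1",
    "code":      "code-model-v1",
}

def upgrade(capability: str, new_model: str) -> None:
    """Swap a single component; the rest of the system keeps running."""
    registry[capability] = new_model

upgrade("reasoning", "reasoning-model-v2")
print(registry["reasoning"])  # "reasoning-model-v2"
print(registry["chat"])       # "chat-model-v1" (unchanged)
```

A monolithic model offers no such seam: improving any one capability means retraining, and re-evaluating, everything at once.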

          Aiden Chaoyang He agrees. In theory, scaling laws still hold—more data and compute make models better—but in practice, he believes development will “spiral” between two approaches: routing specialized models together, then trying to consolidate them into one. The deciding factors will be engineering costs, compute and energy limits, and business pressures.

          The hyped-up AGI narrative may need to adjust, too. “If anyone does anything that’s close to AGI, I don’t know if it’ll literally be one set of weights doing it,” Falcon said, referring to the “brains” behind LLMs. “If it’s a collection of models that feels like AGI, that’s fine. No one’s a purist here.”

The intellectual property in content published on the Fortune China website is owned or held exclusively by Fortune Media IP Limited and/or the relevant rights holders. Reproduction, excerpting, copying, mirroring, or any other use without permission is prohibited.