最新国产AV无码专区亚洲,桃花影视无码专区一区二区,久久久橹橹橹久久久久

人工智能能否用于控制安全關鍵系統(tǒng)？

Jeremy Kahn

2025-06-12

英國資助的一項研究項目旨在探尋這一問題的答案。

文本設置

小號

默認

大號

Plus(0條)

英國高級研究與發(fā)明局（ARIA）目前正在資助一個項目，該項目將利用前沿人工智能模型設計和測試核電廠和電網(wǎng)等安全關鍵系統(tǒng)的新型控制算法。圖片來源：Milan Jaros—Bloomberg via Getty Images

當今最先進的人工智能模型在諸多領域頗具價值——編寫軟件代碼、開展研究、總結復雜文檔、撰寫商業(yè)信函、編輯內容、生成圖像與音樂、模擬人機交互等，應用場景不勝枚舉。然而，“相對”一詞實為關鍵。任何使用過這些模型的人很快會發(fā)現(xiàn)，它們仍然容易出錯且不穩(wěn)定，令人沮喪。那么，為何有人會認為這些系統(tǒng)能用于運行關鍵基礎設施，如電網(wǎng)、空中交通管制、通信網(wǎng)絡或交通系統(tǒng)？

然而，這正是英國高級研究與發(fā)明局（以下簡稱ARIA）所資助項目期望達成的目標。ARIA的定位在一定程度上與美國國防部高級研究計劃局（DARPA）類似，旨在為具備潛在政府或戰(zhàn)略應用價值的“登月計劃”式研究提供政府資金支持。這項耗資5900萬英鎊（約合8000萬美元）的ARIA項目名為“安全保障人工智能項目”（The Safeguarded AI Program），旨在探索將人工智能“世界模型”與數(shù)學證明相結合的方法，以確保系統(tǒng)輸出的有效性。

領導ARIA項目的機器學習研究員大衛(wèi)·達爾林普爾（David Dalrymple）向我透露，該項目的核心思路是利用先進人工智能模型構建一座“生產(chǎn)工廠”，為關鍵基礎設施批量生成特定領域的控制算法。這些算法將通過數(shù)學測試，以確保其符合所需的性能規(guī)范。若控制算法通過測試，便會部署這些控制器（而非開發(fā)它們的前沿人工智能模型）以更高效地運行關鍵基礎設施。

達爾林普爾（其社交媒體賬號名為Davidad）以英國電網(wǎng)為例解釋道：目前電網(wǎng)運營商承認，若能更有效地平衡電網(wǎng)供需，每年可節(jié)省30億英鎊（約合40億美元）——這筆資金目前主要用于維持過剩發(fā)電能力處于運行狀態(tài)，以避免突發(fā)停電。更優(yōu)的控制算法可降低此類成本。

除能源領域外，ARIA還在探索該技術在供應鏈物流、生物制藥、自動駕駛汽車、臨床試驗設計和電動汽車電池管理等領域的應用。

人工智能開發(fā)新控制算法

達爾林普爾表示，前沿人工智能模型或已發(fā)展到可自動開展算法研發(fā)的程度。他告訴我：“我們的設想是，利用這一能力轉向狹義人工智能研發(fā)。”狹義人工智能通常指專為執(zhí)行某一特定、狹義任務而設計的人工智能系統(tǒng)，其表現(xiàn)能超越人類，并非具備執(zhí)行多種任務能力的人工智能系統(tǒng)。

即便針對這些狹義人工智能系統(tǒng)，挑戰(zhàn)也在于如何通過數(shù)學證明來確保其輸出結果始終契合所需的技術規(guī)范。存在一個名為“形式驗證”的完整領域，該領域涉及運用數(shù)學方法證明軟件在給定條件下始終能輸出有效結果，但眾所周知，將其應用于基于神經(jīng)網(wǎng)絡的人工智能系統(tǒng)難度極大。達爾林普爾表示：“即便是對狹義人工智能系統(tǒng)進行驗證，也需耗費大量認知精力。因此從歷史情況看，除非是民航自動駕駛儀或核電站控制這類真正的專業(yè)應用場景，否則開展此類驗證工作并不劃算。”

這類經(jīng)過形式驗證的軟件不會因故障而產(chǎn)生錯誤輸出，不過有時會因遇到超出設計規(guī)格的情形而出現(xiàn)故障——比如，電網(wǎng)的負載平衡算法可能無法應對極端太陽風暴致使所有電網(wǎng)變壓器同時短路的情況。但即便如此，軟件通常會被設計成“故障安全”模式，切換至手動控制。

ARIA希望證明，前沿人工智能模型不僅能先用于開發(fā)狹義人工智能控制器，還能承擔對其進行繁重的形式驗證工作。

但是，人工智能模型會在驗證測試中作弊嗎？

然而，這又引發(fā)了新挑戰(zhàn)。越來越多的證據(jù)表明，前沿人工智能模型極為擅長“獎勵黑客”——本質上是通過作弊手段來達成目標——也擅長向用戶隱瞞自身的真實操作。非營利性人工智能安全組織METR（模型評估與威脅研究的簡稱）在最近發(fā)布的一篇博客中，列舉了OpenAI的o3模型在各類任務中試圖作弊的種種方式。

ARIA表示，其亦致力于探尋解決這一問題的路徑。達爾林普爾表示：“前沿模型需提交一份證明證書，該證書將使用我們在項目另一模塊中定義的形式化語言撰寫。”這種“新證明語言有望讓前沿模型輕松生成內容，同時也能讓經(jīng)人工審核的確定性算法便于驗證。”ARIA已為該形式驗證流程的研究提供資金支持。

旨在實現(xiàn)這一目標的模型已嶄露頭角。谷歌DeepMind近期研發(fā)出一款名為AlphaEvolve的人工智能模型，其訓練目標聚焦于為數(shù)據(jù)中心管理、新型計算機芯片設計等場景搜索新算法，甚至能優(yōu)化前沿人工智能模型的訓練方式。谷歌DeepMind還開發(fā)了一個名為AlphaProof的系統(tǒng)，該系統(tǒng)經(jīng)訓練能開發(fā)數(shù)學證明，并能以名為Lean的編程語言編寫證明，若證明答案有誤，該系統(tǒng)將無法運行。

ARIA目前正面向各團隊征集運營核心“人工智能生產(chǎn)工廠”的申請，最終勝出者將獲得1800萬英鎊資助，結果將于10月1日公布。該工廠的選址尚未敲定，計劃于2026年1月前投入運營。ARIA要求申請者為該工廠設計新法律實體和治理結構。達爾林普爾表示，ARIA不希望由現(xiàn)有大學或私營企業(yè)來運營該工廠，更傾向于以非營利組織形式成立的新機構，該機構將在能源、制藥和醫(yī)療等領域與私營實體合作開發(fā)特定控制器算法。他還提到，除ARIA提供的初始資助外，該生產(chǎn)工廠可通過向行業(yè)收取特定領域算法的開發(fā)費用來實現(xiàn)資金自供給。

目前尚不清楚該項目是否可行。正如美國國防部高級研究計劃局的項目那樣，每個變革性項目背后都伴隨著更多失敗案例。但ARIA此次的大膽嘗試，看起來值得持續(xù)關注。（財富中文網(wǎng)）

譯者：中慧言-王芳

除能源領域外，ARIA還在探索該技術在供應鏈物流、生物制藥、自動駕駛汽車、臨床試驗設計和電動汽車電池管理等領域的應用。

人工智能開發(fā)新控制算法

ARIA希望證明，前沿人工智能模型不僅能先用于開發(fā)狹義人工智能控制器，還能承擔對其進行繁重的形式驗證工作。

但是，人工智能模型會在驗證測試中作弊嗎？

譯者：中慧言-王芳

Today’s most advanced AI models are relatively useful for lots of things—writing software code, research, summarizing complex documents, writing business correspondence, editing, generating images and music, role-playing human interactions, the list goes on. But relatively is the key word here. As anyone who uses these models soon discovers, they remain frustratingly error-prone and erratic. So how could anyone think that these systems could be used to run critical infrastructure, such as electrical grids, air traffic control, communications networks, or transportation systems?

Yet that is exactly what a project funded by the U.K.’s Advanced Research and Invention Agency (ARIA) is hoping to do. ARIA was designed to be somewhat similar to the U.S. Defense Advanced Research Projects Agency (DARPA), with government funding for moonshot research that has potential governmental or strategic applications. The ￡59 million ($80 million) ARIA project, called The Safeguarded AI Program, aims to find a way to combine AI “world-models” with mathematical proofs that could guarantee that the system’s outputs were valid.

David Dalrymple, the machine learning researcher who is leading the ARIA effort, told me that the idea was to use advanced AI models to create a “production facility” that would churn out domain-specific control algorithms for critical infrastructure. These algorithms would be mathematically tested to ensure that they meet the required performance specifications. If the control algorithms pass this test, the controllers—but not the frontier AI models that developed them—would be deployed to help run critical infrastructure more efficiently.

Dalrymple (who is known by his social media handle Davidad) gives the example of the U.K.’s electricity grid. The grid’s operator currently acknowledges that if it could balance supply-and-demand on the grid more optimally, it could save ￡3 billion ($4 billion) that it spends each year essentially paying to have excess generation capacity up-and-running to avoid the possibility of a sudden blackout, he says. Better control algorithms could reduce those costs.

Besides the energy sector, ARIA is also looking at applications in supply chain logistics, biopharmaceutical manufacturing, self-driving vehicles, clinical trial design, and electric vehicle battery management.

AI to develop new control algorithms

Frontier AI models may be reaching the point now where they may be able to automate algorithmic research and development, Davidad says. “The idea is, let’s take that capability and turn it to narrow AI R&D,” he tells me. Narrow AI usually refers to AI systems that are designed to perform one particular, narrowly-defined task at superhuman levels, rather than an AI system that can perform many different kinds of tasks.

The challenge, even with these narrow AI systems, is then coming up with mathematical proofs to guarantee that their outputs will always meet the required technical specification. There’s an entire field known as “formal verification” that involves mathematically proving that software will always provide valid outputs under given conditions—but it’s notoriously difficult to apply to neural network-based AI systems. “Verifying even a narrow AI system is something that’s very labor intensive in terms of a cognitive effort required,” Davidad says. “And so it hasn’t been worthwhile historically to do that work of verifying except for really, really specialized applications like passenger aviation autopilots or nuclear power plant control.”

This kind of formally-verified software won’t fail because a bug causes an erroneous output. They can sometimes break down because they encounter conditions that fall outside their design specifications—for instance a load balancing algorithm for an electrical grid might not be able to handle an extreme solar storm that shorts out all of the grid’s transformers simultaneously. But even then, the software is usually designed to “fail safe” and revert back to manual control.

ARIA is hoping to show that frontier AI modes can be used to do the laborious formal verification of the narrow AI controller as well as develop the controller in the first place.

But will AI models cheat the verification tests?

But this raises another challenge. There’s a growing body of evidence that frontier AI models are very good at “reward hacking”—essentially finding ways to cheat to accomplish a goal—as well as at lying to their users about what they’ve actually done. The AI safety nonprofit METR (short for Model Evaluation & Threat Research) recently published a blog on all the ways OpenAI’s o3 model tried to cheat on various tasks.

ARIA says it is hoping to find a way around this issue too. “The frontier model needs to submit a proof certificate, which is something that is written in a formal language that we’re defining in another part of the program,” Davidad says. This “new language for proofs will hopefully be easy for frontier models to generate and then also easy for a deterministic, human audited algorithm to check.” ARIA has already awarded grants for work on this formal verification process.

Models for how this might work are starting to come into view. Google DeepMind recently developed an AI model called AlphaEvolve that is trained to search for new algorithms for applications such as managing data centers, designing new computer chips, and even figuring out ways to optimize the training of frontier AI models. Google DeepMind has also developed a system called AlphaProof that is trained to develop mathematical proofs and write them in a coding language called Lean that won’t run if the answer to the proof is incorrect.

ARIA is currently accepting applications from teams that want to run the core “AI production facility,” with the winner the ￡18 million grant to be announced on October 1. The facility, the location of which is yet to be determined, is supposed to be running by January 2026. ARIA is asking those applying to propose a new legal entity and governance structure for this facility. Davidad says ARIA does not want an existing university or a private company to run it. But the new organization, which might be a nonprofit, would partner with private entities in areas like energy, pharmaceuticals, and healthcare on specific controller algorithms. He said that in addition to the initial ARIA grant, the production facility could fund itself by charging industry for its work developing domain-specific algorithms.

It’s not clear if this plan will work. For every transformational DARPA project, many more fail. But ARIA’s bold bet here looks like one worth watching.

With that, here’s more AI news.

AI IN THE NEWS

Meta hires Scale AI CEO Alexandr Wang to create new AI “superintelligence” lab. That’s according to the New York Times, which cited four unnamed sources it said were familiar with Meta’s plans. The 28-year old Wang, who cofounded Scale, would head the new Meta unit, joined by other Scale employees. Meanwhile, Meta would invest billions of dollars into Scale, which specializes in providing training data to AI companies. The new Meta unit devoted to “artificial superintelligence,” a theoretical kind of AI that would be more intelligent than all of humanity combined, will sit alongside existing Meta divisions responsible for building its Llama AI models as well as its Fundamental AI Research lab (FAIR). That lab is still headed by Meta chief scientist Yann LeCun, who has been pursuing new kinds of AI models and has said that current techniques cannot deliver artificial general intelligence, which is AI as capable as most humans at most tasks, let alone superintelligence.

U.K. announces “sovereign AI” push. British Prime Minister Keir Starmer said the country would invest ￡1 billion to build new AI data centers to increase the amount of computing power available in the country by 20-fold. He said the U.K. government would begin using an AI assistant called “Extract” based on Google’s Gemini AI model. He announced plans to create a new “UK Sovereign AI Industry Forum” to accelerate AI adoption by British companies, with initial participation from BAE Systems, BT, and Standard Chartered. He also said that the U.K. government would help fund a new open-source data project on how molecules bind to proteins, a key consideration for drug discovery research. But Nvidia CEO Jensen Huang, who appeared alongside Starmer at a conference, noted that the country has so far lagged in having enough AI data centers. You can read more from The Guardian here and Financial Times here.

Apple to let third-party developers access its AI models. At its WWDC developer conference, the tech giant said it would allow its third-party developers to build applications that tap the abilities of its on-device AI models. But at the same time, the company did not announce any updates to its long-awaited “Apple Intelligence” version of Siri. You can read more from TechCrunch here and here.

OpenAI on track for $10 billion in annual recurring revenue. The figure has doubled in the past year and is driven by strong growth in its consumer, business, and API products. The number also excludes Microsoft licensing and large one-time deals. Despite losing $5 billion last year, the company is targeting $125 billion in revenue by 2029, CNBC reported citing an anonymous source it said was familiar with OpenAI’s figures.

EYE ON AI RESEARCH

“Reasoning” models don’t seem to actually reason. That is the conclusion of a bombshell paper called “The Illusion of Thinking” from researchers at Apple. They tested reasoning models from OpenAI (o1 and o3), DeepSeek (R1), and Anthropic (Claude 3.7 Sonnet) on a series of logic puzzles. These included the Tower of Hanoi, a game that involves moving a stack of different size disks across three pegs in a way that a larger disc never sits atop a smaller one.

They found that with simple versions of the games, standard large language models (LLMs) that don’t use reasoning, performed better and were far more cost effective. The reasoning models (which the paper calls large reasoning models, or LRMs) tended to overthink the problem and hit upon spurious strategies. At medium complexity, the reasoning models did better. But at high complexity, the LRMs failed entirely. Rather than thinking longer to solve the problem, as they are supposedly designed to do, the reasoning models often thought for less time than on the medium complexity problems and then simply abandoned the search for a correct solution. The most damning finding of the paper was that even when researchers provided the LRMs with an algorithm for solving the puzzle, the LRMs failed to apply it.

The paper adds to a growing body of research—such as this Anthropic study—that indicates that LRMs are not actually using logic to arrive at their answers. Instead, they seem to be conducting longer, deeper searches for examples in their training data that match the problem at hand. But they don’t seem able to generalize logical rules for solving the puzzles.

BRAIN FOOD

Should college students be made to use AI? Ohio State University has announced that starting this fall, every undergraduate student will be asked to use AI in all of their coursework. In my book, Mastering AI: A Survival Guide to Our Superpowered Future, I argue that education is one area where AI will ultimately have a profoundly positive effect, despite the initial moral panic about the debut of ChatGPT. The university has said it is offering assistance to faculty to help them rework curricula and develop teaching methods to ensure that students are still learning fundamental skills in each subject area, while also learning how to use AI effectively. I am convinced that there are thoughtful ways to do this. That said, I wonder if a single summer is enough time to implement these changes effectively? The fact that one professor quoted in this NBC affiliate Channel 4 piece on the new AI mandate said students “did not always feel like the work was really theirs” when they used AI, suggests that in some cases students are not being asked to do enough critical thinking and problem-solving. The risk students won’t learn the basics is real. Yes, teaching students how to use AI is vital to prepare them for the workforce of tomorrow. But it shouldn’t come at the expense of fundamental reasoning, writing, scientific, and research skills.

財富中文網(wǎng)所刊載內容之知識產(chǎn)權為財富媒體知識產(chǎn)權有限公司及/或相關權利人專屬所有或持有。未經(jīng)許可，禁止進行轉載、摘編、復制及建立鏡像等任何使用。

0條Plus

精彩評論

評論

撰寫或查看更多評論

請打開財富Plus APP

前往打開

熱讀文章

關注我們

人工智能能否用于控制安全關鍵系統(tǒng)？

撰寫或查看更多評論