You make great points about the trade-offs here. I agree with each individual point yet I conclude the opposite.
I believe that fundamentally we have to have regulatory authority to stop unsafe releases for frontier risks. Enshrining transparency even with independent audit is insufficient.
And given the Trump EO is at pains to say:
Nothing in this section shall be construed to authorize the creation of a mandatory governmental licensing, preclearance, or permitting requirement for the development, publication, release, or distribution of new AI models, including frontier models.
I am skeptical that pre-empting state laws will get to a regime that actually stops unilateral unsafe actions as frontier AI races ahead.
I was founding VP AI Engineering at @VectorInst in 2020. It's good that Canada's new AI strategy puts another $50M into the Canadian AI Safety Institute to study risks and evaluate models, but studying risk is not requiring proof of safety. "Middle powers" should go further: regulation that makes frontier developers show evidence of safety before release, not after. That's the gap as the race accelerates.
It's more urgent than ever that we require strong safety cases to show acceptable risk from increasingly autonomous AI.
I wrote about this: https://t.co/qwlESjlvUN
Anthropic's data in this thread shows how close RSI may be: an 8x increase in code, and increasingly sophisticated AI engineering/research that handles open-ended problems and improves on humans at hard research problems.
@deanwball To be clear, I am concerned about highly autonomous AI well before fully autonomous companies, but I'm curious at what point there's too much autonomy. Is it just not having a human accountable as CEO even if all decisions are made by AI, or is there an earlier point?
Thanks for the comprehensive analysis.
I very much agree on the welcome parts. It's great OpenAI is highlighting RSI and loss of control as major risks. I also appreciate flagging autonomy and alignment as raising concerns. The return to transparent communication about these threats and the style of engagement is real and welcome.
My disagreement is structural, and it comes from the paper contradicting itself. The diagnosis is candid: it admits policymakers have limited visibility into "whether safeguards are keeping pace," and it calls RSI possibly the defining governance challenge of the decade. Then the recommendations quietly assume the opposite. They presume mitigations are already adequate:
* Companies "implement appropriate safeguards; and explain why any residual risks are appropriately managed." That is an explanation standard. The developer narrates why it is fine and nobody has to be convinced.
* CAISI's job is to "recommend mitigations, not to approve or block deployments."
* "developers should remain responsible for deployment decisions."
* If CAISI runs out of bandwidth, "developers should be permitted to deploy without penalty."
Every default points at release. The burden falls on anyone trying to stop a deployment, never on the developer to prove the risk is limited. That only makes sense if you already believe the mitigations are adequate and will remain so.
They are not. There's significant basic science left before we can do reliable risk analysis, let alone reliable mitigation.
The burden of proof needs to be with the developer to show safety. We need safety cases that provide structured arguments that these major risks are contained within defined levels, reviewed by an independent body with the power to hold a release.
This is why the bar has to go up and why frontier safety is far from solved. It is also why the state laws still matter. They are doing real work as laboratories, and preemption now would be a serious mistake, as you noted.
So building up and investing in CAISI is important. But I very much disagree with "CAISI's role should be to conduct evaluations and recommend mitigations—not to approve or block deployments." We absolutely need an institution (whether CAISI or also delegated to private auditors) that has the authority to do exactly this. And an annual audit cadence is insufficient - public deployments should be audited. As risks increase, training or private deployments will need to be audited too.
I think that becoming aware of risks from AI and the massive increase in data center buildout represented learning new facts and a change in conditions. That said, you can certainly characterize the political movement that ensued as a “fashion” change (arguably all three factors are at play).
Some personal thoughts on President Trump's new executive order on AI --
1. It's really great to see President Trump taking these risks seriously. It's a vindication of the idea that the government will respond to risks as they emerge.
2. This is important because this is not a narrow cyber issue. The EO focuses too much on cyber risks to the exclusion of other national security concerns. Mythos wasn't built to do cyber - it was trained in a general-purpose way and just happened to get superhuman cyber capabilities. And Mythos is just the beginning. Companies are clear we are building towards superintelligent AI that outclasses all human experts combined at all tasks. We have no plans to be able to control such a superintelligence. The framework being started by the EO needs to be built to consider far more risks than just cyber.
3. Also, evaluations themselves won't be enough - the US government also has a national security interest in wider-ranging visibility into what is happening in AI companies. The main risks from AI systems do not first arise 30 days before commercial release. Risks will occur first and foremost from AI systems that are only available internally within an AI company.
For example, it makes sense that the Air Force would want to test a fighter jet before they fly it, because if you fly it and the fighter jet crashes because it is built incorrectly, then many people will die. However, as long as the fighter is just sitting on the runway, nothing bad can happen. But now imagine you had a fighter that could just take off and fly itself without human authorization and launch missiles and crash before anyone realized what had happened. That kind of fighter jet would need a very different kind of security measures. This may sound crazy for a fighter jet but it is already beginning to happen with the most advanced AI. AI systems can take actions, including unintended and unauthorized actions, and are increasing in their sophistication to do so. The government deserves to know what capabilities AIs have at the same time companies know, not just 30 days before commercial deployment.
4. We also need to focus on the security of the AI models themselves, including internally. What happens if an adversary steals the AI model and then can use it against us? An employee or contractor with privileged access, possibly in collusion with an external actor such as a foreign intelligence service, could steal an internally-deployed AI model. We don't have good defenses against this yet, and the government isn't putting enough pressure on AI companies to build them.
Surely China, Russia, or North Korea would want access to Mythos. The facts that Mythos has been illicitly accessed by random people on Discord, and that Mythos was first learned about on the internet via an unauthorized leak, do not inspire confidence.
5. We also have the question about what to do if evaluations find risks that companies are not mitigating well on their own. For some of these risks, we have no plan for how to mitigate them at all. Will it be possible, in these ultimate scenarios, for the government to be in a position to tell the companies that some aspects of their development may be too dangerous and get them to halt or change practice? Currently we have no framework for this.
6. The ideal response to all of the above is Congressional action. It's great to see the White House leading where they can, but so much of this can only come from Congress. So far Congress is way behind, and that's unfortunate.
@OpenAINewsroom It's great to see progress on improving AI safety laws as AI advances - requiring audits is an important step, and it's good that OpenAI endorsed this law. As AI advances we will need to keep advancing regulations to have robust safety cases against catastrophic risks.
Agreed. The mechanism is what matters: the essence is telling apart a model that's consistent because it internalized the objective from one that's consistent by inductive bias or parroted CoT patterns. Glad that's the direction you're taking, and I appreciate your foundational contributions on measurement.
@peterwildeford Indeed their plan is out in the open.
Needless to say this is reckless. We should require strong safety cases about alignment and control before advancing along this line.
I wrote more about this: https://t.co/Qcs1J19Ilk
METR recently published an important report on risks for losing control of advanced AI. Their CEO Beth just wrote her perspective on what this means (spoiler alert - risks are unacceptably high!)
Our report focuses on claims that are (1) solidly defensible and (2) generally agreed within METR. Here I’ll give some personal opinions on how we should feel about the state of AI risk, and the IMO most important limitations of the report.
Thanks for the great work at METR and emphasizing the risks in the current situation.
We need to move from voluntary self-governance, where capable third parties like METR are dependent on labs, to a system with mandatory safety cases and required independent audit before release.
I wrote more about this at https://t.co/ZmNynpgMc2