📷📷📷New paper! (with @OpenAI) 📷📷📷
We trained weight-sparse models (transformers with almost all of their weights set to zero) on code: we found that their circuits become naturally interpretable! Our models seem to learn extremely simple, disentangled, internal mechanisms!
Here is a re-post of an internal post:
We have been working with the DoW to make some additions in our agreement to make our principles very clear.
1. We are going to amend our deal to add this language, in addition to everything else:
"• Consistent with applicable laws, including the Fourth Amendment to the United States Constitution, National Security Act of 1947, FISA Act of 1978, the AI system shall not be intentionally used for domestic surveillance of U.S. persons and nationals.
• For the avoidance of doubt, the Department understands this limitation to prohibit deliberate tracking, surveillance, or monitoring of U.S. persons or nationals, including through the procurement or use of commercially acquired personal or identifiable information."
It’s critical to protect the civil liberties of Americans, and there was so much focus on this, that we wanted to make this point especially clear, including around commercially acquired information. Just like everything we do with iterative deployment, we will continue to learn and refine as we go.
I think this is an important change; our team and the DoW team did a great job working on it.
2. The Department also affirmed that our services will not be used by Department of War intelligence agencies (for example, the NSA). Any services to those agencies would require a follow-on modification to our contract.
3. For extreme clarity: we want to work through democratic processes. It should be the government making the key decisions about society. We want to have a voice and a seat at the table where we can share our expertise and fight for principles of liberty. But we are clear on how the system works (because a lot of people have asked: if I received what I believed was an unconstitutional order, of course I would rather go to jail than follow it). But
4. There are many things the technology just isn’t ready for, and many areas where we don’t yet understand the tradeoffs required for safety. We will work through these, slowly, with the DoW, with technical safeguards and other methods.
5. One thing I think I did wrong: we shouldn't have rushed to get this out on Friday. The issues are super complex, and demand clear communication. We were genuinely trying to de-escalate things and avoid a much worse outcome, but I think it just looked opportunistic and sloppy. Good learning experience for me as we face higher-stakes decisions in the future.
In my conversations over the weekend, I reiterated that Anthropic should not be designated as a SCR, and that we hope the DoW offers them the same terms we’ve agreed to.
We will host an All Hands tomorrow morning to answer more questions.
This is what I currently believe to be the case, and I am advocating internally for more information about it to be released as soon as feasible. If we later learn this is not the case, then I will advocate internally to terminate the contract.
dawg you are not going to be part of the permanent underclass. that underclass already exists and it does not live in a studio apartment in San Francisco, it's making bricks in debt slavery in Pakistan
@ChowdhuryNeil wait I think I experienced step function improvement from switching to our harness a month ago
I agree the UI is worse but maybe it’s worth it?
I’ve wanted Claude to have an ad supported tier for years.
Today, thanks to the Opus 4.6 API, Claude with Ads is here.
Please enjoy intelligence too cheap to meter.
All of this debate between the labs makes me so angry I might grab a Heineken™ to relax. Watching friends (who normally kick back and debate at an SF tech party with some Heineken™s) argue over silly differences is such a waste of energy.
[6-Pack of Heineken™ delivered TODAY]
Labs like @OpenAI also hire researchers straight out of undergrad, like @kevin_wang3290, though the bar is high. Kevin was highly recommended by his advisor and was first author on a NeurIPS 2025 paper. There's a lot of bad NeurIPS papers, but we could tell this was a great one. (Indeed, after he joined OpenAI his paper was one of 4 out of 5,290 to receive a Best Paper award.) His advisor's recommendation counted for a lot because it can be hard to evaluate a researcher just based on a resume or even a paper.
https://t.co/ovWii2DYQy
Artificial Intelligence is enabling us to construct high-fidelity models of the imagination, excreted over the past couple hundred years into the material plane through media, now returning back to its rightful place as the Stuff of Dreams. This is the Human Soul, Exteriorized.
@garrytan @robertwiblin Another way to argue against free markets here is that ASIs “goodhart capitalism”
Productivity doesn’t have to be correlated with human wellbeing in these extreme circumstances.
It’s up to us to shape the competitive landscape to make ASIs help people.
@garrytan @robertwiblin Notably this is also what the “good” scenarios look like!
Human irrelevance and post-scarcity utopia are fairly close together. Intelligent regulation by the government and responsible deployment by the private sector are necessary to thread this needle.