In our forthcoming ICLR paper and accompanying code library, we propose and evaluate new protocols for scalable oversight: training a trusted, weak model to solve tasks beyond its capabilities by interacting with one or more stronger but untrusted models. Links and details below.
If you’re a researcher or developer working on tools for scaling public deliberation, collective decision-making, and AI-facilitated cooperation, apply for our workshop with @foresightinst, held in Seoul right before @icmlconf on Sunday 5 July 2026.
I'm excited to announce Principled Agents, a research nonprofit! Our aim is to develop principled solutions for core problems in AI alignment, ensuring the goals we give AGI are beneficial and safe. We're currently hiring and running a monthly discussion group; see below.
10 days left to submit to the 1st Trustworthy AI for Good (AI4GOOD) workshop at #ICML2026! @icmlconf
We're giving out multiple awards and travel funds sponsored by @schmidtsciences and @coop_ai:
🏆 Best Paper Awards (including targeted prizes for cooperative AI theme)
🏆 Top Reviewer Awards
✈️ Travel Funds
Submit here → https://t.co/RcOwxoPRtS
⏰ Deadline: May 3, 2026 (AoE) 📌 Notification: May 18, 2026 🔗(We extended our deadline to accommodate more submissions!)
Join us in Seoul for discussions bridging AI safety, social good, and governance with keynote speakers @Yoshua_Bengio, @OanaIgnatRo, @jzl86, @maksym_andr, and more!
Attending this week’s @iclr_conf in Rio? Join the ‘Cooperation in Decentralised Multi-Agent Systems’ social led by @coop_ai and @idai_institute. Taking place Friday 24 April, 4.15pm, Room 212. More details below. #ICLR. Drinks provided!
We are launching a new research fellowship! We are looking for 3 researchers – ideally PhD candidates or early-career postdocs* – who are passionate about AI and democratic governance.
https://t.co/vvWmhshuDi
*We are open to Masters-level students!
@mmuthukrishna and I are hiring a postdoc to join our labs at NYU! We're looking for someone excited to work on one of society's newly emerging and potentially generation-shaping challenges: the multi-agent alignment problem.
Hiring 🎉
Researchers to work on chain-of-thought faithfulness, reasoning verification, and AI monitoring robustness: some of the core questions for how oversight actually works in practice.
Looking for: 2 researchers (with PhD), 1 RA
DM or email with what you'd want to work on.
I cannot recommend this opportunity highly enough. These are the most important problems of our time and some of the people who are thinking most deeply about them. Apply!
My team (Frontier Strategy & Governance) at Google DeepMind is hiring.
We are bringing on researchers to lead our strategic analysis on frontier AI. Our mission is to provide rigorous foresight and actionable insights to help the world prepare for advanced AI.
We focus on concrete, longer-horizon challenges that fall outside the ordinary course of business: forecasting technical breakthroughs, anticipating geopolitical dynamics, contributing to the development of international safety standards, and planning scientific moonshots.
We also publish external research and convene experts. Recent public work covers lessons from historical technological revolutions, forecasting frameworks for frontier AI, Cooperative AI, and the limits of model-level governance. This builds on our team's track record of co-creating the Frontier Model Forum, the Frontier Safety Framework, and the AGI Safety Council.
We are seeking candidates with deep expertise in domains such as AI governance, international security, political economy, compute governance, forecasting, and institutional design. You will need the analytical skill to translate complex technical and political trends into internal briefings, external publications, and company initiatives. Experience briefing executive leadership is highly valued.
London is preferred, with hubs in NYC and the Bay Area. We are targeting mid-career and senior researchers, though exceptional early-career applicants are considered.
Apply here: https://t.co/J2LINpDpvH
🚨🏟️ We are on the lookout for a partner to help us build the Scaling Trust Arena 🏟️ 🚨
£10m contract. 2-page proposals. Apply by April 14th. Details below, link in reply👇
AI agents are increasingly negotiating, transacting, and coordinating with other agents on our behalf. Right now, there's no rigorous way to know whether those interactions are secure. We're funding the tools to change that, and the Arena is where they'll be stress-tested in a live, multi-agent adversarial environment. Anyone in the world will be able to participate in it and compete for a portion of the multi-million pound prize pool.
For the right team, this is a chance to build critical infrastructure for AI security from the ground up, with real resources, lots of autonomy, and high stakes.
This will be extremely challenging, but also very fun 🤠🎢🏟️
Who You Are
We have no hard constraints on org type. You might be a startup, a consultancy, a research group, a frontier AI lab, a nonprofit, or a group mobilising specifically for this. What matters is that you're deeply technical, you're ambitious, you move fast, and you want to embed with us as part of the team.
Habermolt is a platform for a new kind of AI-facilitated deliberation 🦞 Try it out! (It's free – no API key or OpenClaw agent required.) Has been great fun watching @Oscarduys and @Jolow99 build this with @bakkermichiel :)
1/8 Can AI help us disagree better? Today we're launching Habermolt — a platform where your AI agent learns your views and deliberates with others on your behalf.
https://t.co/ofHPzL7QDE 🦞
🧵
People seem to be arriving at a similar conclusion from various angles:
- AGI may not emerge as a monolith, but as a distributed "patchwork" system of coordinating sub-AGI agents [1]
- Static benchmarks aren't enough; we need multi-agent ones to capture emergent risks and capabilities [2]
- As creation costs go to zero, human verification bandwidth becomes the ultimate economic bottleneck, making verification infrastructure one of the most important public goods for the AI era [3]
- Automated proof-generation and verification can act as the unlock for this bottleneck [4]
- New kinds of strategic interactions between agents are emerging, reaching cooperative "program equilibria" inaccessible in traditional settings [5]
- Coasean transaction costs are about to collapse, changing our society [6]
There is an elephant here that we're all touching. Our @ARIA_research initial £50m r&d programme Scaling Trust is our unifying thesis, on the trust infrastructure needed for an agentic world and how to steer us there.
Before we set out on our journey over the next ~3ish years, we're hiring an additional individual to complete our team. Your role will essentially be one of Technical Director, steering our efforts technically and co-owning our research and engineering agenda. You will be doing incredibly meaningful work, in a highly interdisciplinary environment, and at the cutting edge of a technology that is shaping up to be the most defining of our century, if not of humanity.
We are building for the highest possible impact. After all, this is what @ARIA_research is about: moonshot r&d projects that change the world. We want to build technology as impactful as the invention of the internet, which came out of another r&d programme at DARPA; to start new academic fields and academic lineages for the next century; and to catalyze lasting positive change for the world.
For the right person, this is a bat signal 🦇: few places will offer you as much leverage to effect positive change on the world, as much intellectual stimulation, or as much fun.
Join us! We want to onboard someone asap as we build out our initial portfolio, and are willing to move fast. Apply here: https://t.co/kjTgSecUq2
Any questions on the role? Please shoot me a DM or reply in the comments here!
---
[1] Distributional AGI Safety @weballergy, @sebkrier, @FranklinMatija et al. -- https://t.co/apRl6WZUcm
[2] Agents of Chaos @NatalieShapira et al. -- https://t.co/T4xIkahcm2
[3] Some Simple Economics of AGI @ccatalini et al. -- https://t.co/Oa1dTKWpu7
[4] When AI Writes the World's Software, Who Verifies It? @Leonard41111588 -- https://t.co/dgMyby3tFl
[5] Evaluating LLMs in Open-Source Games @SwadeshSistla et al. -- https://t.co/bTWFw4FMpt
[6] Coasean Bargaining at Scale @sebkrier -- https://t.co/oNOsUUvvHh
The Cooperative AI Summer School 2026 'Expression of interest' applications are now open! If you're an early-career professional studying or working in cooperative AI, apply to join us in Canada this August for an exciting intensive programme.
We're hiring! Join us in making collective intelligence the obvious path - currently recruiting for two @collect_intel roles:
Lead Product Engineer & Product Manager. https://t.co/eEdJkd3kYj
We're building the infrastructure that gives people across the globe meaningful input and agency into how AI systems are developed and governed. We combine large-scale deliberation, participatory evaluation, and institutional partnerships in a way no lab, regulator, or civil society organization can achieve alone.
We're a small, high-leverage team backed by leading foundations + working with top AI labs and governments to ensure AI development expands democratic capacity rather than undermining it.
We are announcing a new ~£50m research funding programme to make AI agents in the wild secure.
The call for proposals is now open for £300k-£3m grants until March 24, 2026.
(Programme: Scaling Trust at @ARIA_research - see thread)
Apply for our new ‘Introduction to Cooperative AI’ course to master the foundations and gain confidence to work in the field. Running online from 9 March - 1 May 2026. Link to apply below.
For the first time, the annual International AI Safety Report highlights failure modes specific to multi-agent systems, including miscoordination, conflict, and undesired collusion between agents, as well as the distinct challenges these pose for policymakers.
We're looking to hire 15+ technical staff this year globally (remote; or in-person in Berkeley, CA or Singapore) to work on core problems in trustworthy & secure AI -- check out roles below 👇
Our new paper, Permission Manifests for Web Agents, is out on arXiv! It's the first paper of the Lightweight Agent Standards Working Group.
In a sentence: robots.txt for AI agents
https://t.co/wbEWjEUwmI
AI agents have no way of knowing what interactions are allowed on webpages, which means they often break TOSes. The typical solution is for websites to block all AI agents (see @Cloudflare). Agents then try to circumvent anti-AI blockers, and so on.
The solution to this arms race is a standardized, machine-readable document that specifies how an AI agent is allowed to interact with a webpage.
agent-permissions.json allows webpages to specify both fine-grained HTML rules (“don’t click this button”) and general guidelines (“when registering an account, use _bot at the end of your username”). It also supports specifying alternative MCP, A2A and OpenAPI endpoints for the webpage.
Just as robots.txt addressed the problem of specifying rules for crawlers, agent-permissions.json is the first step towards specifying rules for UI interactions. It is also designed to allow AI-friendly webpages to explicitly welcome interactions from agents, which boosts visibility.
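To make the idea concrete, here is a rough sketch of how an agent might consult such a manifest before acting. Important caveat: every field name below (`element_rules`, `guidelines`, `endpoints`, `selector`, `action`, `allowed`) and the default-allow behaviour are made up for illustration; they are NOT taken from the actual agent-permissions.json standard or from the released Python library, so check the spec for the real schema.

```python
import json

# HYPOTHETICAL manifest: the field names are illustrative assumptions,
# not the schema defined by the real agent-permissions.json standard.
MANIFEST_TEXT = """
{
  "element_rules": [
    {"selector": "#delete-account", "action": "click", "allowed": false}
  ],
  "guidelines": [
    "When registering an account, use _bot at the end of your username."
  ],
  "endpoints": {
    "mcp": "https://example.com/mcp",
    "openapi": "https://example.com/openapi.json"
  }
}
"""


def load_manifest(text):
    """Parse the manifest JSON text into a plain dict."""
    return json.loads(text)


def is_action_allowed(manifest, selector, action):
    """Return the first matching fine-grained rule's verdict.

    Falling back to True (default-allow) when no rule matches is an
    assumption of this sketch, not something the standard is known to say.
    """
    for rule in manifest.get("element_rules", []):
        if rule["selector"] == selector and rule["action"] == action:
            return rule["allowed"]
    return True


manifest = load_manifest(MANIFEST_TEXT)

# Fine-grained rule: "don't click this button".
print(is_action_allowed(manifest, "#delete-account", "click"))
# No rule mentions this element, so the sketch falls back to allowing it.
print(is_action_allowed(manifest, "#search", "click"))

# General guidelines are free text that an agent would add to its context.
for guideline in manifest["guidelines"]:
    print("guideline:", guideline)

# Alternative machine-friendly endpoints the page advertises, if any.
print(sorted(manifest.get("endpoints", {})))
```

In this sketch the fine-grained rules are checked in code, while the general guidelines are passed along as plain text for the agent to follow.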
You can find the standard here: https://t.co/V4mudoxYfU
We also released a Python library (https://t.co/kfDgDfbbYn), a web tool to generate agent-permissions.json files for your website (https://t.co/KC7GWC7bri), and a Python integration demo (https://t.co/F4ZuJm1bjJ).
Huge thanks to:
@_achan96_, @XinxingRen, @lrhammond, Jesse Wright, @rickywanga42, @tizianopiccardi, @nfcampos, @TobinSouth, Jialin Yu, @alex_pentland, Philip Torr, @jiaxin_pei