I'm grateful to the many technical & non-technical staff who prioritise public service and make AISI a world-leading, not just government-leading, technical group. As the stakes rise, we really need talented researchers working on behalf of the public interest. https://t.co/e51h4ctDnj
claude helped me make a personalized anti-ai-slop magazine. (the irony is not lost on me.)
i was in a cafe reading the terrible very-ai-written vanity fair article (cough cough you know which one) and thought: journalism is going down the drain, i am wasting my time reading this AI-written garbage. the only good thing about this is that it's on physical paper.
yet there IS good writing out there, it's just online. it's written by real humans, many of whom i know personally, and i would love to spend more time deeply and calmly engaging with their work. just. i don't want to do that on my phone and stare at another screen.
i want it in print.
lo, a large number of claude code tokens and 10 days later: a personalized magazine, assembled from my personal subscriptions. includes a section of fashion photos as palette cleanser (my personal interest). shout out daniel roseberry at schiaparelli.
it's a few weeks old now, since printing and shipping takes time, but that's fine: the good posts are evergreen. this issue includes writing from Sasha Chapin, @ajeya_cotra, @_brianpotter, and @NatalieRCargill
claude in particular is obsessed with Natalie's writing. it tried to print nearly every piece she's written for Inkhaven lol. (to be fair: i'm obsessed with Natalie's writing too)
i gave claude some liberties in choosing a cover photo, a name for the series, and then some freedom in choosing the articles (though i did intervene somewhat due to some curation taste).
honestly this was so fun. when i was a kid, i made a coloring book as a christmas gift for a friend, using Microsoft Publisher (!!! i bought the software in a box) and got it printed at Staples. my friend loved it.
this felt like that but on steroids. claude code opens up a world of creative projects to me previously totally inaccessible. this felt so fun and joyful.
second issue already ordered. they're $11 each and shipping is $18 so not cheap, but i WILL keep doing this. catch me at the mill, not on my phone.
We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.
Our post-training at the time wasn’t making it worse—but it also wasn’t making it better.
@davidshor @samhaselby @arindube So the acceleration of further inequality is reducing but inequality itself is still rising? Or is inequality also decreasing since 2019?
📜 Since there’s renewed interest in how AI could help with governance, here are 14 specific government processes where AI agents could make a measurable difference today:
Impenetrable forms and applications: citizens face complex, jargon-filled forms that cause them to miss benefits or fail to comply with regulations. AI can replace forms with plain-language conversations that extract data from documents, calculate eligibility, and only ask relevant questions. In the US, citizens spend an estimated 6.5 billion hours per year on federal tax compliance. The IRS sends you no pre-filled return despite already having your W-2s and 1099s. An AI agent could pull all income data the government already holds, pre-populate a return, flag deductions you're likely eligible for based on your profile, and file - reducing the process from hours to a single review-and-confirm step.
Regulatory bloat: guidance layered on regulations layered on statutes creates thousands of pages of rules that no person or caseworker can realistically navigate. Rules become too complex and get applied selectively by frontline workers. Agents can be used to map entire regulatory regimes, flag redundancies and conflicts, and let policymakers simulate how proposed rules would actually perform before enacting them. Stanford's RegLab built STARA, an AI system that surveyed San Francisco's municipal code and identified hundreds of outdated reporting mandates, which resulted in a 351-page ordinance to eliminate or consolidate more than a third of the city's 528 mandated reports.
Obsolete code and IT systems: the US Social Security Administration runs on 60-million-line COBOL codebases from the 1980s; the IRS processes returns on systems from the 1960s; the World Bank's own internal review found its siloed divisions (IFC, IDA, IBRD) couldn't communicate across systems and its bureaucracy resisted modernisation. In each case, no one internally understands the code, so agencies can't fix a bug without months of waiting and enormous contractor fees. Agentic coding tools let internal teams point an AI at a legacy codebase and start making changes themselves.
Fraud and improper payments: after Hurricanes Katrina and Rita, FEMA distributed $6bn in relief with $600m to $1.4bn in improper/fraudulent payments according to GAO. During COVID, the US lost an estimated $100-200bn to fraudulent unemployment insurance claims alone, many filed by bots. As bad actors adopt AI to generate synthetic identities, forge documents, and file claims at scale, that gap will widen fast. Agencies need their own AI agents doing real-time cross-referencing of claims against income data, identity records, and behavioural patterns.
Siloed, department-centric service delivery: around 600,000 people leave US prisons each year. Each must separately navigate the Bureau of Prisons (release paperwork), SSA (Social Security card), state DMV (ID), Medicaid (healthcare), SNAP (food), HUD (housing), American Job Centers (employment), and a parole office; each with its own application, eligibility rules, and case system. These dependencies are sequential: without ID you can't get benefits, without benefits you can't stabilise housing, without housing you can't hold a job. An AI agent could intake one person's situation at release, determine eligibility across every level of government, and file applications in the right dependency order.
Identity verification as a bottleneck to service access: 800m people worldwide can't legally prove their identity according to the World Bank, mostly in Sub-Saharan Africa and South Asia. Without ID you can't open a bank account, receive a cash transfer, or access most government services. India's Aadhaar is a nice positive example: 1.4bn biometric IDs, 523m new bank accounts, and a claimed $11bn saved by eliminating ghost beneficiaries; but this took a decade of state capacity to build and still fails often enough to lock out legitimate users. AI agents could compress this by cross-referencing whatever documents a person does have (a utility bill, a phone number history, a community attestation etc) against available records and flagging confidence levels for human review.
Benefits eligibility screening: the US has over 80 federal means-tested programmes, each with its own application and documentation requirements. A single mother qualifying for SNAP, Medicaid, CHIP, WIC, EITC, Section 8, and childcare assistance faces what is effectively seven separate bureaucracies. An AI agent could intake one life-situation description, determine eligibility across every programme simultaneously, pre-fill and submit applications in parallel, and flag benefits cliffs (where a small income increase would trigger a sharp loss in support) before they hit.
Building permit approvals: getting a construction permit in many US cities takes 3–12 months of back-and-forth between applicants and planning departments, often over PDF submissions reviewed manually against zoning codes. An AI agent could parse submitted plans against the local zoning and building code, flag non-compliant elements immediately, and return a preliminary approval or specific revision list within hours instead of months. A related case study: DeepMind recently helped the UK government translate mountains of old paper maps, PDFs, and scanned documents into usable data for modern planning systems with the Gemini-powered ‘Extract’ tool.
Public records requests (FOIA): federal agencies have backlogs of hundreds of thousands of FOIA requests, with median response times stretching to months or years. Staff manually search filing systems and redact sensitive information page by page. An AI agent could search document repositories for responsive records, auto-redact exempt information (personal data, classified material, deliberative process content), and draft a release package for human sign-off. However, this only works where records are digitised and searchable, and much of the government still runs on fragmented legacy systems where documents aren't centrally indexed…
Court scheduling and case management: state courts lose enormous time to scheduling conflicts, continuances, and manual case tracking. In many jurisdictions, hearing dates are still set by phone or in-person. An AI agent could manage the full docket — auto-scheduling based on judge availability, attorney conflicts, and case priority, sending reminders, and rescheduling continuances without human clerk intervention. Over time you could also start exploring automating some low-value claims through novel arbitration pipelines, freeing up court capacity for more consequential cases.
Business registration and licensing: starting a business in most jurisdictions requires navigating 5–15 separate registrations: state incorporation, EIN from the IRS, state tax registration, local business licence, zoning compliance, health permits, liquor licences, professional licences, etc. An AI agent could take "I want to open a restaurant serving alcohol at [address] in Brooklyn," query every relevant federal, state, and city database, produce the complete permit checklist in dependency order, pre-fill each application with the business details, and flag the long-lead items (e.g. liquor licence) that need to start immediately.
Social worker caseload documentation: child protective services and adult social care workers spend the majority of their time on paperwork rather than with clients: writing visit notes, filing reports, updating case management systems. For every case, caseworkers complete roughly 400 forms totalling ~2,500 pages (multiplied across the 24–31 cases they typically carry simultaneously). An AI agent could listen to (or read transcripts of) a home visit, auto-generate the structured case note, update the system of record, and flag any safeguarding triggers, giving caseworkers their time back for actual care.
Medicare/Medicaid claims adjudication: CMS processes over 1bn claims per year, with complex rules about covered services, bundling, medical necessity, and provider eligibility. Improper payments run to tens of billions annually, and 77% of these were due to insufficient documentation, not fraud. In parallel, Medicare Advantage denies 17% of initial claims, yet 57% of those denials are overturned on appeal. Agents could adjudicate straightforward claims automatically against the coverage rules, flag anomalous billing patterns in real time, and route only genuinely complex cases to human reviewers.
Public comment synthesis for rulemaking: when a federal agency proposes a new rule, it often receives thousands to millions of public comments (the FCC received 22 million on net neutrality). Staff must read, categorise, and respond to each substantive comment. This may well get worse as people use agents to submit plausible-looking comments multiple times. Agents can help the government filter through these, cluster comments by theme, identify unique substantive arguments, flag form-letter campaigns, and draft the agency's response-to-comments document (a task that currently takes teams of lawyers months).
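Several of the items above (prison re-entry, business registration) come down to filing applications "in dependency order". A minimal sketch of that ordering step, assuming a hypothetical hand-written dependency map; the keys and edges below are illustrative only and simply mirror the chain described in the re-entry example (ID before benefits, benefits before housing, housing before a job):

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: application -> set of applications it depends on.
# These edges are an assumption for illustration, not real eligibility rules.
DEPENDENCIES = {
    "state_id": set(),
    "benefits": {"state_id"},
    "housing": {"benefits"},
    "employment": {"housing"},
}


def filing_order(dependencies):
    """Return applications ordered so every prerequisite comes first."""
    # static_order() yields each node only after all of its predecessors.
    return list(TopologicalSorter(dependencies).static_order())


print(filing_order(DEPENDENCIES))
```

If the map ever contained a circular requirement (two offices each demanding the other's paperwork first), `static_order()` would raise `graphlib.CycleError`, which is the kind of conflict a human would need to resolve.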
Tech companies pay millions of dollars for their employees and then stick them in open-plan offices that make it nearly impossible to get work done. Best strategy for poaching employees is probably to just offer them an office with a door.
Wild. This could change science. In one study, AI agents, the so-called AI Scientist, carried out an entire research project from idea to experiment to publication. Entirely without human oversight. https://t.co/MTeAXkrRPG
Here is the range of credible dates for AGI, across all forecasters at Metaculus.
This is a huge range of uncertainty. The median date is 2033, but their 80% confidence interval is from 2026 to 2067 — between 0.25 years and 41 years.
2/
Our new Google Threat Intelligence Group (GTIG) report breaks down how threat actors are using AI for everything from advanced reconnaissance to phishing to automated malware development.
More on that and how we’re countering the threats ↓ https://t.co/NWUvNeBkn2
i am glad this chart is public now because it is bananas. it is ridiculous. it should not exist.
it should be taken less as evidence about anthropic's execution or potential and more as evidence about how weird the world we've found ourselves in is.
In a world with abundant intelligence, AI agents will not just replace humans in existing workflows, but make new kinds of work possible; work that was previously too slow or expensive to perform at scale.
In cyberdefense, automated digital forensics, i.e. deep security investigations, will become ubiquitous.
Today we’re launching Asymmetric Security, the first full-stack AI Digital Forensics and Incident Response company. We're working with the world's largest global cyber insurers, and our AI platform has already helped incident responders handle hundreds of cyber attacks. Building on this experience, we’re creating realistic training scenarios and evaluations to improve both our AI cyber-defense capabilities and those of frontier labs.
Our mission is to accelerate AI cyberdefense to address the security challenges of the AGI era.