We're excited to share that our agent, Maestro, drafted solutions to all 12 problems from the ICPC 2025 World Finals in ~2 hours - using current models, no human involvement, no internet access. We deeply respect the human teams' extraordinary dedication. Note: these solutions have not been officially validated.
The Rebooting State Capacity Hackathon is complete!
SoTA’s developers, engineers, researchers and technologists worked alongside civil servants to build solutions to critical government challenges this past weekend. Kicking off with interactive sessions led by @PalantirTech and @PUBLIC_Team, there was a huge amount of energy dedicated to unlocking progress and leveraging tech to boost state capacity.
Brilliant testimonials from frontline operators who tried out the prototypes, and meetings are now being booked across UK government to discuss next steps…
Congratulations to the winners:
1. Treasury Busters (£1,000) - Single source of truth to build and approve/reject business cases for major infrastructure projects in months, not decades.
2. DVSA Reimagined (£500) - Streamlined process for most young people’s first touchpoint with the government: getting their driving licence.
3. A1 Voice (£250) - Smart voice-mailbox for NHS hospitals, managing the more than 10,000 phone calls made per day to clinical departments.
And to the winners of the @BasisCapitalLtd Prize (another £1,000) for Outcompeting the Government: A1 Voice again, who managed to engage 17 departments across 6 trusts!
Some honourable mentions:
- SCYLLA - Remote seabed sensing for critical undersea infrastructure using backscatter
- Hacknee - Collective intelligence for local council decision-making via prediction markets
- Fighting Crime, One Query at a Time - Automated analysis and reporting to democratise access to crime data
- Pranav - Visual odometry for mapping GPS-denied environments
- Seaguard - Small-vessel maritime detection using computer vision
Thanks to our partners for Rebooting State Capacity: @10DowningStreet, @i_dot_ai, @delian_ai, @BasisCapitalLtd, @PalantirTech, @elevenlabs, @PUBLIC_Team, and SCOPE, along with all the departments, agencies, and individual civil servants who provided challenges as a foundation for the hackathon.
"Agency > Intelligence"
@karpathy nailed it, and after 18 months building Maestro, we agree. The real AI leap isn’t just smarts—it’s agency: the ability to act independently, turning assistants into partners.
It's always great hosting @AITinkerers London meetups right after a new model drops...
Huge thanks to @rebecca_harbeck from @AnthropicAI, as well as the @iGent_AI team @MSzummer and @samshapley for giving impromptu talks with tons of learnings from their early access to Claude Sonnet 3.7.
We also got to see @HarryCoppock from the @AISecurityInst live demoing 3.7 hacking into a Docker container 🫢
And as always, we had some fantastic product, behind-the-scenes and benchmark-beating agent talks from Emma Burrows, @moeadham and Sergei Petrov.
Huge thanks to team @localglobevc and @ferdisigona for making it happen!
@RichardSocher Charging per hour, per outcome, or per productivity boost relative to a human feels like the natural pricing for an agent. 1 hour of agent time could be worth 1 day, 1 week or even 1 month of human time, depending on the quality of the outcome.