A bunch of startups are trying to sell you a subscription-based code review tool.
Just run `codiff -w <url to pr>`. It's better than any of them.
We started Theorema a few months ago with one vision: autonomous science. Self-driving labs that design, run, and interpret their own experiments, with the AI doing the science.
Today we share our first preprint.
First, the problem it addresses. Enzyme cascades are how much of modern medicine, advanced materials, and green chemistry is produced. Chain several enzymes together and they perform in a single pot what would otherwise take a chemist many separate steps. The difficulty is tuning them. Enzyme ratios, pH, temperature, and buffer composition all interact at once, and the parameter space grows combinatorially. The conventional approach is a scientist running reactions one at a time for months.
Loschmidt Labs spent years building CascadeMAP, a self-driving microfluidic lab. It generates thousands of nanoliter droplet reactions and uses Bayesian optimization to converge on the best conditions without supervision. Over seven days it ran roughly 220,000 reactions across 7,400 conditions, with no one in the room.
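To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes the reactions were spread evenly across conditions and that the platform ran continuously for the full seven days, neither of which is stated above. The variable names are illustrative.

```python
# Rounded figures quoted in the post.
reactions = 220_000
conditions = 7_400
days = 7

# Assumption: reactions are spread evenly across conditions.
reactions_per_condition = reactions / conditions

# Assumption: the platform ran around the clock for all seven days.
reactions_per_day = reactions / days
reactions_per_minute = reactions / (days * 24 * 60)

print(f"{reactions_per_condition:.1f} reactions per condition")
print(f"{reactions_per_day:,.0f} reactions per day")
print(f"{reactions_per_minute:.1f} reactions per minute")
```

Under those assumptions that works out to about 30 reactions per condition and roughly 22 reactions per minute.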
The platform was validated on two very different cascades: (i) a glycerol detection pathway (monitored by fluorescence) and (ii) a 1,2,3-trichloropropane degradation pathway (monitored by label-free Raman spectroscopy), demonstrating its versatility across detection modalities and application domains.
Then we added Theorema on top of it.
If CascadeMAP is the experimental engine, Theorema is the scientist directing it. The meta layer. Our multi-agent system designed the experiments and the optimization strategy, then analyzed 11 GB of raw results across 23 campaigns, reconstructed what had happened and why, and recommended the next round. CascadeMAP ran the fast loop within each campaign. Theorema closed the slower loop between campaigns: design, interpretation, redesign. That is the work that has always required a principal investigator. And the value of that loop was concrete. Theorema saw that the search had concentrated on high-performing pockets rather than mapping the whole space, identified the variables that actually drove performance, and explained why the landscape held so many local optima. That is exactly the read needed to design a sharper next campaign.
That's what I mean by autonomous science. It is not one AI doing everything, but several roles working together. Theorema can supply all of them, or join your existing models and provide the reasoning layer at a scale, speed, and depth no human team can match.
A few months in, the system we set out to build is running real wet labs.
Authors: Michal Vašina, David Kovář, Martin Kizovsky, David Lacko, Pavel Vaňáček, Maximilian Herich, Eduard Volf, Lukas Drdla, Sona Cabalova, Pavlina Sikorova, @MichaelJirasek, Pavel Solansky, Jan Ježek, Ota Samek, @fdousek, Hynek Walner, Pavel Zemanek, Andrew deMello, Zdenek Pilat, @JiriDamborsky, Stavros Stavrakis, Stanislav Mazurenko, Zbynek Prokop
This work is a joint effort of @MasarykUni Masaryk University, St. Anne's University Hospital Brno, @ETH ETH Zurich, @CzechAcademy the Czech Academy of Sciences, and @theorema_ai Theorema. Many thanks to our partners for a fantastic collaboration.
If you're looking to accelerate your R&D program, talk to us.
(preprint link in comments.)
Andrej Karpathy spent 2h showing how he actually uses AI day to day
he's a co-founder of OpenAI and led AI at Tesla, so when he shows how he works, it’s worth watching
and the whole session is just him telling the machine what he wants in simple terms, like he's briefing a coworker
watch what's actually happening the entire time:
> he describes the task in normal words
> it goes off and does the work
> he glances at the result and nudges it with one more sentence
that's the whole skill, and you've had it since you learned to talk
the only gap between that and a worker that runs on its own is handing that sentence a schedule and the tools to act
check his work, then build the version that keeps working when you stop
How do we automate business analytics with Claude?
New blog post covering our best practices for skills, data foundations, and evaluations when building agents to perform data analysis:
https://t.co/mfEJMAQFBU
Excited to share how Anthropic's data team has automated 95% of business analytics queries with Claude. Blog post covers how we approach evals, ablations, and online validation!
VoidZero is joining Cloudflare.
Our mission stays the same: to make JavaScript developers more productive than ever before. Vite, Vitest, Rolldown, Oxc, and Vite+ remain MIT-licensed. Evan and the VoidZero team will continue leading them.
Cloudflare shares our commitment to open source. Together, we can keep investing in the tooling developers rely on every day, while bringing the Vite ecosystem and Cloudflare’s platform even closer together.
I am building a team.
If you're really, really, really good at building stuff, design, filmmaking, writing, pushing the models to their limits, or just making people care about a product at scale, reach out.
Let's collab + make stuff.
Details:
https://t.co/Xo4rlaBNld
We’ve shipped a security-guidance plugin for Claude Code that helps identify and fix vulnerabilities as you’re writing code.
Available for all Claude Code users. Install from the plugin marketplace (/plugins).
why agents need VMs, not containers
with David Crawshaw, ex-CTO & co-founder of Tailscale
now co-founder and CEO of exe - a new up-and-coming cloud provider
Timestamps
(0:00) why build a new cloud?
(2:07) why Docker isn't enough for agents
(12:28) why AI-friendly is developer-friendly
(20:32) why VMs are the right abstraction (and the serendipity of just dropping an idea prompt from your phone)
(28:30) the exorbitant price of IOPS in the cloud
(32:21) cloud discounts
(33:40) the rise of self hosting
(41:25) Shelly and AI ops agents
(48:10) the hard problem with AI SREs
(53:00) parting thoughts and early EC2’s noisy neighbor shenanigans
The industry has seen an unprecedented wave of supply chain attacks over the past few months. That's why we built Bumblebee, a lightweight security scanner that continuously monitors endpoints and hunts for malicious packages.
Bumblebee has been a critical asset in keeping @perplexity_ai secure, and we're thrilled to open source it for everyone.
We're also using Perplexity Computer to monitor public threat intelligence feeds in real time and update the Bumblebee repo as new threats emerge. Excited to share this with the community!
The release candidate for MCP 2026-07-28 is out. The protocol is now stateless: no handshake, no session ID, any request can hit any server instance. Plus first-class extensions (MCP Apps, Tasks), auth hardening, and a proper deprecation policy so we don't have to do this again.
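What "any request can hit any server instance" means in practice can be sketched in a few lines. This is a generic illustration of stateless request handling, not the MCP wire format: the `Instance` class, the `add` method, and the request fields are all hypothetical.

```python
import json
import random


class Instance:
    """One server replica. It keeps no per-client session table."""

    def __init__(self, name):
        self.name = name  # identity only, never consulted when answering

    def handle(self, raw_request):
        # Everything needed to answer travels inside the request itself.
        request = json.loads(raw_request)
        params = request["params"]
        if request["method"] == "add":  # hypothetical method name
            body = {"id": request["id"], "result": params["a"] + params["b"]}
        else:
            body = {"id": request["id"], "error": "unknown method"}
        return json.dumps(body, sort_keys=True)


instances = [Instance("a"), Instance("b"), Instance("c")]


def route(raw_request):
    """Stand-in for a load balancer: pick any replica at random."""
    return random.choice(instances).handle(raw_request)


request = json.dumps({"id": 1, "method": "add", "params": {"a": 2, "b": 3}})

# Every replica gives the same answer, so routing needs no stickiness.
responses = {instance.handle(request) for instance in instances}
print(len(responses), route(request))
```

Because no replica remembers anything between calls, a load balancer can route without sticky sessions and replicas can be added or removed freely.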
https://t.co/XRLTu1BSkB
Today WorkOS is launching auth.md
An open protocol for agents to register for services on the web.
We're partnering with @Cloudflare and @Firecrawl as some of the first providers.
Why did we build this? And why now? 🧵
Cloudsail: Instant Sandboxes for Coding Agents
Create a new Cloudflare Sandbox for each task with a shell, Codex and GitHub access. Tokens are never exposed to the sandbox. Update your deps far away from your laptop.
npm install -g cloudsail
cs
https://t.co/fI0ajfNnLV
Cloudflare's security team spent the last few weeks testing Anthropic's Mythos against fifty of our own repositories. What we learned about offensive AI, why faster patching is the wrong reaction, and what the architecture around vulnerabilities has to look like next. https://t.co/RSrRtIhgaV
Codiff: A beautiful, extremely fast local diff viewer
I review SO MUCH code locally these days. I asked Codex to build it using https://t.co/4P6iuEJ4sq and https://t.co/FhMJ3wfsa6. Thanks @amadeus and @fat. Amazing software.
It took 16 minutes to build this. It's amazing.
https://t.co/ofxTYTM9OG
A preview for Pro users: a new personal finance experience in ChatGPT.
Pro users in the U.S. can securely connect financial accounts, see where their money is going, and ask questions based on the information they choose to connect.
Your full financial picture, now in ChatGPT.