This work was done at @AsariAILabs with Armando Solar-Lezama, @yisongyue, and @StephanZheng.
Paper: https://t.co/KHjo5Uidmj
Blog: https://t.co/4IXXqL3ZYg
MIT News: https://t.co/J9Z19Ljt4q
Code: https://t.co/d0ehsvPBoq (open-source implementation by @nitin966)
As part of last year’s mentorship program with @lmthang, @YiTayML and others, our small team (Dhruv Agarwal, Colin Snyder and myself) worked on improving Sakana AI’s AI Co-Scientist — focusing on reliability, reducing hallucinations, and generating more complete research papers from hypothesis to draft.
We extended the work into a version aimed at biotech researchers and were invited to present it at GStar Summit (the AI & Humanity event in Ho Chi Minh City). Dhruv presented our benchmarks showing clear gains over the base system on code execution, citation quality, and hallucination reduction.
None of it would have been possible without the program’s resources (including significant compute) and feedback from researchers at Yale, Stanford and elsewhere. Grateful for the experience and the people who made it possible.
Honored to win 1st place at South Asia’s biggest AI conference and be among the three teams invited to present our work.
We built an AI Co-Scientist for Biotech Researchers that beat the industry leader by 40% and a Virtual Cell model that is currently top 5 globally in benchmarks.
None of this would have been possible without the mentorship of @lmthang and @HoangDuon, and I am glad to have them as my advisors.
MAI-Thinking-1 is out!
Excited to share what we are building and how climbing from scratch (no distillation) actually works: simple recipes, rigorous science, self-distillation, patience, and great infra.
Check out our tech report, which has the full story of our RL climbs.
https://t.co/aLW40sWz4d
🚀 Excited to release mKernel: a set of fast multi-node, multi-GPU fused kernels.
💻 Code: https://t.co/y2WfdMVTfC
📝 Blog: https://t.co/wGomxmeRxr
mKernel fuses compute + communication into one persistent GPU kernel, covering both intra/inter-node with GPU-initiated communication.
Amazing team: @yangzhouy, Chon Lam Lao, Costin Raiciu, Scott Shenker, @istoica05
Omar Khattab’s lab at MIT strikes again!
Pedagogical RL -
Today, RL relies on pure entropy to sample new trajectories. This is pretty inefficient, and also caps performance at what is already within stumbling range of the current model.
In almost every RL setup, we’re leaving valuable information from the judge or reward signal on the table, information that could drastically improve sampling.
The solution, naturally, is to roll out a teacher model with privileged information. The problem is that these rollouts are so off-policy for the student, and may amount to cheating, that training will collapse.
Thus, to fully achieve this objective, the team defines a new spike-aware learnability reward, which disproportionately penalizes high-surprise tokens from the student’s perspective, to RL-train a teacher. They also apply a surprisal gate in the eventual loss function when teaching the student on these generated trajectories.
Altogether, this leads to significantly faster convergence than the other methods compared (GRPO, OPSD), which is quite impressive.
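A minimal sketch of the two surprisal ideas described above, not the authors' implementation. The threshold, the squared penalty, and both function names are assumptions made for illustration; the paper defines the actual reward and gate.

```python
import math

def spike_penalty(student_logprobs, threshold=4.0):
    """Hypothetical spike-aware penalty for a teacher trajectory.

    Surprisal above `threshold` (in nats) is squared, so a few very
    surprising tokens cost far more than many mildly surprising ones.
    """
    surprisals = [-lp for lp in student_logprobs]
    return sum(max(0.0, s - threshold) ** 2 for s in surprisals)

def surprisal_gated_nll(student_logprobs, threshold=4.0):
    """Mean student NLL over tokens that pass a hypothetical surprisal gate.

    Tokens whose surprisal exceeds `threshold` are masked out of the loss.
    Returns (loss, number_of_tokens_kept).
    """
    kept = [-lp for lp in student_logprobs if -lp <= threshold]
    if not kept:
        return 0.0, 0
    return sum(kept) / len(kept), len(kept)

# Student log-probabilities for four tokens of a teacher rollout;
# the third token is a surprise spike (probability 0.001).
logprobs = [math.log(0.9), math.log(0.5), math.log(0.001), math.log(0.7)]
penalty = spike_penalty(logprobs)
loss, n_kept = surprisal_gated_nll(logprobs)
```

Here the spike token dominates the teacher's penalty and is dropped from the student's loss, while the three unsurprising tokens are trained on as usual.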
Authors: @SOURADIPCHAKR18 @NoahZiems @furongh @Meng_CS @amritsinghbedi3 @lateinteraction
Excited to share our latest work: Autogenesis 🧬
The next agent stack will be evolvable.
Prompts, tools, memory, environments, and agents become versioned, auditable, rollback-safe resources.
A step toward self-improving AI infrastructure. 🚀
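To make "versioned, auditable, rollback-safe" concrete, here is a toy sketch of one way such a resource could behave. The `VersionedResource` class and its methods are hypothetical and are not taken from the Autogenesis code.

```python
from dataclasses import dataclass, field

@dataclass
class VersionedResource:
    """Hypothetical versioned resource (for example a prompt)."""
    name: str
    # Append-only audit log of (version, value, note) tuples.
    history: list = field(default_factory=list)

    def update(self, value, note=""):
        version = len(self.history) + 1
        self.history.append((version, value, note))
        return version

    @property
    def current(self):
        return self.history[-1][1]

    def rollback(self, version):
        # A rollback is recorded as a new version, so the log is never rewritten.
        for number, value, _ in self.history:
            if number == version:
                return self.update(value, note=f"rollback to v{version}")
        raise KeyError(f"{self.name} has no version {version}")

prompt = VersionedResource("system_prompt")
prompt.update("Answer briefly.")
prompt.update("Answer briefly and cite sources.", note="evolved by agent")
new_version = prompt.rollback(1)
```

After the rollback the current value is the first prompt again, and the history still shows all three steps.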
📄 https://t.co/CY0Rl2LYn7
💻 https://t.co/PTZLrVNJj7
The web is a living museum of human curiosity: messy, beautiful, surprising, and constantly changing.
Really enjoyed Museum of the Human Web, organized by @p0. A thoughtful reminder of the people, artifacts, ideas, and accidents that shaped the modern internet and how we share knowledge.
branchfs 0.1.1 is out, and it now runs on macOS.
branchfs is a FUSE filesystem with atomic, copy-on-write branching. You take a workspace, fork it instantly, run something speculative against the branch, and either commit the result into the parent or throw it away. No copying gigabytes, no cleanup scripts, no "let me reset my environment" detour.
This is built for the way AI agents actually work. An agent should be able to try an approach, see if it holds up, and abandon it cheaply if it does not. branchfs makes that branch-and-discard loop a filesystem primitive instead of something you bolt on afterward.
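The branch-and-discard loop can be illustrated with a toy in-memory model. This is a sketch of the copy-on-write idea only: the `Branch` class below is hypothetical and is not the branchfs API or CLI.

```python
class Branch:
    """Toy copy-on-write overlay over a parent branch."""
    _DELETED = object()  # tombstone marking a path removed in this branch

    def __init__(self, parent=None):
        self.parent = parent
        self.changes = {}  # path -> contents, or the _DELETED tombstone

    def fork(self):
        # Forking copies nothing; the child only records its own changes.
        return Branch(parent=self)

    def read(self, path):
        node = self
        while node is not None:
            if path in node.changes:
                value = node.changes[path]
                if value is Branch._DELETED:
                    raise FileNotFoundError(path)
                return value
            node = node.parent
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.changes[path] = data

    def delete(self, path):
        self.changes[path] = Branch._DELETED

    def commit(self):
        # Merge this branch's changes into its immediate parent.
        if self.parent is None:
            raise ValueError("root branch has no parent to commit into")
        self.parent.changes.update(self.changes)
        self.changes = {}

    def abort(self):
        # Discard everything tried on this branch.
        self.changes = {}

root = Branch()
root.write("config.toml", "mode = 'stable'")

attempt = root.fork()
attempt.write("config.toml", "mode = 'experimental'")
attempt.abort()  # the speculative change did not hold up

second = root.fork()
second.write("notes.txt", "kept")
second.commit()  # this one did, so it lands in the parent
```

The aborted attempt leaves the parent untouched, and the committed one becomes visible in it without any files being copied up front.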
Highlights in this release:
- macOS support. branchfs started on Linux, and it now builds and runs on macOS too, so you get the same branching workflow on your laptop. Thanks to @nitin966 for the port.
- FUSE passthrough for near-native read and write I/O.
- Higher throughput from file descriptor caching on the hot read and write paths.
- Symlink and rename support across the FUSE layer, copy-on-write, and branch operations.
- Immediate-parent merge on commit and abort, plus @branch virtual directories for inspecting branch state.
- Storage safety: --max-storage quota enforcement, branch name validation against path traversal, and graceful daemon shutdown.
Prebuilt binaries for Linux x86_64 and macOS arm64 are attached to the GitHub release, and the crate is on https://t.co/dV7mBpdm1Z, so "cargo install branchfs" just works.
I would genuinely like to hear what breaks for you. If you are building agent infrastructure, speculative execution tooling, or just want fast disposable workspaces, give it a spin.
If you are on a Mac, I would especially love for you to try it. The macOS port is fresh, and real-world testing on different setups is the fastest way to shake out the rough edges. Install it, branch a workspace, run something speculative against it, and tell me how it goes.
Repo and releases: https://t.co/oHeYvj0dz3
#rust #filesystem #fuse #aiagents #opensource #macos
We're not done with scaling. The path to million-chip runs is clear.
You need to be distributed across many datacenters.
You need to be resilient to hardware failures.
We simulate extreme levels of hardware failures as if we were using 2M chips and show Decoupled DiLoCo can achieve the same ML performance while having superior infra robustness!
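A toy sketch of the general DiLoCo pattern under simulated failures: workers take several local steps, then the surviving workers' parameter deltas are averaged and applied to the global parameters. The scalar model, the plain SGD outer step, the failure rate, and all names are assumptions; Decoupled DiLoCo's actual algorithm is not reproduced here.

```python
import random

def local_steps(theta, target, steps=5, lr=0.1):
    # Inner optimization: gradient descent on the toy loss (theta - target)^2.
    for _ in range(steps):
        theta -= lr * 2 * (theta - target)
    return theta

def outer_round(theta_global, targets, failure_rate, rng, outer_lr=1.0):
    deltas = []
    for target in targets:
        if rng.random() < failure_rate:
            continue  # simulated hardware failure: this worker contributes nothing
        theta_local = local_steps(theta_global, target)
        deltas.append(theta_global - theta_local)  # this worker's outer gradient
    if not deltas:
        return theta_global  # every worker failed; keep the previous parameters
    return theta_global - outer_lr * sum(deltas) / len(deltas)

rng = random.Random(0)
targets = [2.8, 3.0, 3.2, 3.0]  # each worker's data pulls toward a slightly different optimum
theta = 0.0
for _ in range(30):
    theta = outer_round(theta, targets, failure_rate=0.5, rng=rng)
```

Even with half of the workers dropped in a typical round, the global parameter still settles near the workers' shared optimum, because each round only needs whichever workers survived.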
Big moment for RL!
ART (Agent Reinforcement Trainer) is an open-source framework for training agents with GRPO + RULER (an automatic reward system).
No need to hand-craft reward functions.
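For context, the core of GRPO is a group-relative advantage: each trajectory's reward is normalized against the other trajectories sampled for the same task. The sketch below shows that step only; it does not use ART's API, and the scores stand in for whatever the automatic reward system returns. Implementations differ on details such as population versus sample standard deviation.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of trajectories for the same task."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical scores for four trajectories sampled from the same prompt.
scores = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(scores)
```

Trajectories that score above the group mean get positive advantages and are reinforced; those below get negative ones. Only relative scores within the group matter, so the rewards do not need a hand-calibrated absolute scale.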
GitHub: https://t.co/rONo67QHuG