new in the compound-engineering plugin for Claude Code: /ce-dogfood-beta
hands-off QA for the branch you just built. it diffs against main, builds a full browser test matrix of every change as user journeys, drives the app like a real user would,
then auto-fixes what's broken, adds regression tests, and commits each fix until the matrix is green. you ship. it cleans up after you.
Gemma 4 12B + MTP speculative decoding on mlx-vlm 🚀
We benchmarked MTP on Gemma 4 12B across all 4 modalities in mlx-vlm — and it speeds up everything:
text, image, audio, and combined audio+image, up to 1.72× and 80 tok/s on a single M3 Ultra.
Get started today:
> uv pip install -U mlx-vlm
https://t.co/7BvnEuAikR
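As a rough sanity check on those headline numbers, the sketch below back-computes the baseline throughput they would imply. It assumes, and the post does not confirm, that the 1.72× peak and the 80 tok/s peak come from the same run. `implied_baseline` is a hypothetical helper for illustration, not part of mlx-vlm.

```python
def implied_baseline(mtp_tok_per_s: float, speedup: float) -> float:
    """Baseline throughput implied by an MTP throughput and a speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return mtp_tok_per_s / speedup


# Assumption: both "up to" figures describe the same benchmark configuration.
baseline = implied_baseline(80.0, 1.72)
print(f"implied baseline: {baseline:.1f} tok/s")
```

Under that assumption the non-MTP baseline would sit at roughly 46.5 tok/s; if the two peaks come from different modalities, the real baselines will differ.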
Full podcast episode with @rauchg, @maxhodak_, and @bscholl.
40 minutes of unreleased material.
The AI Industrial Revolution
Part 1: Waste Tokens, Save Time
0:00 Three Frontier Founders
1:27 AI Software Factories
4:15 Waste Tokens, Save Time
5:47 Models Instructing Humans
9:29 Is Pure Software Dead?
12:03 You Don't Get Stuck Anymore
Part 2: Vibe Coding Hardware
14:39 Vibe Coding a Turbine Blade
18:07 Open Source Compounds China's Advantage
20:15 You Always Want the Smartest Model
22:44 Software Still Needs Hands
24:43 Humans Are Becoming Verifiers
Part 3: The Regulatory Frontier
27:53 The Regulatory Red Queen Race
32:32 Why There's No Innovation in Healthcare
36:49 We Need a True 50-State Experiment
40:31 China's FDA Is Beating Ours
43:37 Healthcare Is a Communist Society Inside Capitalism
45:57 Sid's Story: N-of-1 Medicine
Part 4: The Autonomous Company
47:49 Autonomous Infrastructure
51:25 Your Job Is to Train the Agent
54:54 The Next Lord of the Rings
59:08 What's Your Definition of Art?
1:05:00 Can AI Have New Ideas?
1:07:03 A Large Number of Small Teams
Just published in @PNASNews, we resolve a 50-year-old riddle from Richard Feynman's handwritten notes, prove and generalize it, and run a large-scale human study to reveal near-optimal heuristics in sequential decision problems:
https://t.co/4AOM1iDqG2
"MAI-Thinking-1: Building a Hill-Climbing Machine"
Microsoft just did something almost no frontier AI lab has done before
They shared how they engineered the data behind a frontier-scale model in unusual depth.
From data collection and eval decontamination to data mix scaling, this paper lays out how they managed 30T pretraining tokens plus 3.55T midtraining tokens.
Surprisingly, they also used no third-party distillation and no open-source training datasets.
The model itself is not a jaw-dropping release, but the paper might be the best open look yet at a frontier-scale data factory and hill-climbing loop.
Carina Hong just raised $200M to tell frontier labs their math AGI roadmap is a dead end.
* Perfect 120/120 on Putnam -- beats best human (110) and best LLM, DeepSeek (103)
* 99% on CodeMarina (code + proof) vs frontier LLMs at 3.6-22%
* Built it in 7 months with 30 people, $1.6B valuation
* Her thesis: verification scales brilliance; it does not fix lousiness
* Frontier labs cannot focus long enough to match the formal-math substrate
Full breakdown above. Source: Latent Space, @latentspacepod.
YouTube: https://t.co/IYNjBPqXGN
It appears that OpenAI has moved all Codex users to token-based billing using a "credits" system aligned with API pricing.
Some companies received a two-month introductory period and are now receiving a limited number of pooled credits per user.
https://t.co/6gjbPQ5HjH
🦔UC Berkeley's computer science department just posted its worst failure rates in years. 35.3% of CS 10 students got F's in spring 2026, up from under 10% in prior semesters. Professor Dan Garcia says the primary driver is a "vast increase in academic dishonesty" through LLMs. Students use AI to complete assignments, never learn the material, then fail exams. His office hours, once full, are now empty.
My Take
Companies are firing experienced engineers while the pipeline that produces new ones is being gutted by the same technology. Students use AI to bypass the hard part of learning, show up to exams without the understanding, and fail. One professor discovered a student's linear algebra class had an "open AI" policy for homework and exams. That student then couldn't do basic linear algebra in the next course.
Both ends of the workforce are eroding at the same time. Senior engineers are getting cut to fund AI spending. Junior engineers are graduating without the skills because AI did their coursework. And the companies spending trillions on these tools haven't connected those two facts yet.
Hedgie🤗
People are increasingly worried that AI tools make us overreliant.
But how do we actually measure this? We introduce Offloading Score, a measure of reliance based on the fraction of cognitive effort offloaded to AI while completing a task.
In a controlled user study, Offloading Score detects increased reliance under time pressure, while several common alternatives do not.
(1/9)
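The thread does not spell out the formula here, so the following is only an illustrative sketch of a "fraction of effort offloaded" measure. The `Step` record, the effort units, and the `offloading_fraction` function are all hypothetical and are not the authors' definition of Offloading Score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    effort: float     # hypothetical effort units (e.g. seconds or rated difficulty)
    done_by_ai: bool  # True if this step was delegated to the AI


def offloading_fraction(steps: list[Step]) -> float:
    """Share of total task effort that was delegated to the AI (0.0 to 1.0)."""
    total = sum(s.effort for s in steps)
    if total == 0:
        return 0.0  # no measurable effort, so nothing was offloaded
    offloaded = sum(s.effort for s in steps if s.done_by_ai)
    return offloaded / total


task = [Step(3.0, True), Step(1.0, False), Step(4.0, True), Step(2.0, False)]
print(offloading_fraction(task))  # 7 of 10 effort units went to the AI
```

A ratio like this weights steps by effort rather than counting them, so delegating one hard step registers as more reliance than delegating several trivial ones.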
Highlighting recent advances in multi-GPU and tensor parallel support in llama.cpp
Over the last few months, llama.cpp maintainers and engineers from NVIDIA collaborated to improve multi-GPU performance in ggml. This resulted in significant performance gains on RTX systems and laid the groundwork for hardware-agnostic tensor parallelism in ggml.
For more information on this and other advancements in the low-level inference engine of llama.cpp, check the technical blog by @NVIDIARTXSpark below.
NVIDIA Nemotron 3 Ultra is here
We have Day‑0 support for Nemotron 3 Ultra in prime-rl and Lab.
Specialize Nemotron 3 Ultra for your use case.
https://t.co/CrH0gW0oyL
What happens when agents with all possible strategies compete? That's a question for ruliology. With some surprising answers...
https://t.co/5RdL27qQc3
A "compute tax" fails on basically all the basics of optimal taxation.
- Don't tax capital
- Don't tax intermediate goods
- Don't tax easily manipulable things
- Don't tax small tax bases
Overall, a dumb idea.
https://t.co/v3lNfI8anq
🚨I have a new book coming out October 20: Co-Existence!
It is about how we live & work with AIs that are sometimes (but not always) smarter than we are. And it has a cool cover.
You can pre-order: https://t.co/Ti5jo6ksfI
And here is a post with context: https://t.co/YpWvCG4dUD