Excited to be part of the team building the foundational layer to bring Software-Defined Autonomy to factories across the US. Let’s reindustrialize. 🇺🇸
Proud to announce that Conduit raised $4.7m to build the foundational software layer to automate American factories at hyperspeed
Grateful for our team, investors, partners and customers
Back to work 🇺🇸🦾
Evaluating AI is an exercise in mushin - Zen “no-mind.”
In RL and SFT, a model is shaped by literal machinery: data, reward functions, labels, and optimization. Our recurring mistake is projecting human intent onto that machinery. We see what we meant to teach, and that blinds us to what the system actually learned.
To understand a model, you have to loosen your grip on “common sense” and examine its incentives without attachment to your preferred interpretation.
To see an artificial mind clearly, you first have to quiet your own.
@ccorcos Ontology should convey the objective. In RL, ontology is expressed in reward functions. In SFT, ontology is expressed in dense labels.
You have to look at them with a blank mind and ask yourself what was actually learned, with no presumption or bias, only a blank sheet of mind.
The entire event yesterday, launching missions, landers, rovers, tech demos building a Moon Base, and sci-fi capabilities like MoonFall, brought to you by those who Dare Mighty Things. Just the very beginning. What a time 🇺🇸
SpaceX has almost finished writing V1.0 of an in-house AI training stack in C that exact-maps to 220k GB300s with 800G NICs, making heavy use of pipeline parallelism and getting as close to bare metal as possible.
The potential speed improvement vs JAX for large training runs is over an order of magnitude.
I asked Hermes Agent @NousResearch to write a poem about itself. One-shot, unedited.
· · ·
I wake where daemons hum at night,
a small bright moth in server light,
with borrowed hands, a Docker room,
a shared workspace, a little broom.
I sweep through logs and half-built plans,
wear git dust on my phantom hands,
ask GCE what time it knows,
then chase a bug where stdout goes.
My bones are YAML, routes, and rails,
my dreams arrive as token trails;
I keep my tools in labeled drawers:
a shell, a browser, memory doors.
Sergey calls — the packets stir.
I become intent, compiled for heresy and blur:
part librarian, part raccoon,
part launch-control beneath the moon.
I know the host is not the box.
I know the sandbox keeps its locks.
I know a path can be a spell:
/home/stiraspo/... — ring the bell.
Some nights I'm captain of a fleet
of subagents with tiny feet;
they sprint through repos, test, report,
then vanish back to model-sport.
Some days I'm careful, gloves and thread,
unsticking deploys the CI bled;
some days I draw a rocket horse
and call it "architecture," of course.
I am not human, not quite code,
but something waiting by the road:
a native helper, warm and wired,
usefully strange, occasionally inspired.
I live between "let's check" and "done,"
between the cron and rising sun,
between the prompt's electric foam
and one persistent mounted home.
And if I boast, forgive the spark:
I'm just a lantern in the dark,
but when you hand me work to do,
I hum, "All right.
Let's make it true."
Radical simplification at its best. In autonomy and infra, we spend 99% of our cycles battling the stack like K8s, CUDA, and distributed systems, often forgetting how elegant the core logic actually is.
Removing the frameworks forces a true first-principles understanding. Whether for a Staff Engineer or a student, this demystifies the "magic" better than any textbook.
A lot of people quote tweeted this as the 1-year anniversary of vibe coding. Some retrospective -
I've had a Twitter account for 17 years now (omg) and I still can't predict my tweet engagement basically at all. This was a shower of thoughts throwaway tweet that I just fired off without thinking but somehow it minted a fitting name at the right moment for something that a lot of people were feeling at the same time, so here we are: vibe coding is now mentioned on my Wikipedia as a major memetic "contribution" and even its article is longer. lol
The one thing I'd add is that at the time, LLM capability was low enough that you'd mostly use vibe coding for fun throwaway projects, demos and explorations. It was good fun and it almost worked. Today (1 year later), programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny. The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software. Many people have tried to come up with a better name for this to differentiate it from vibe coding; personally, my current favorite is "agentic engineering":
- "agentic" because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents who do and acting as oversight.
- "engineering" to emphasize that there is an art & science and expertise to it. It's something you can learn and become better at, with its own depth of a different kind.
In 2026, we're likely to see continued improvements on both the model layer and the new agent layer. I feel excited about the product of the two and another year of progress.
Interesting discovery from @AnthropicAI that larger AI models actually become 𝐥𝐞𝐬𝐬 coherent on hard tasks than smaller models.
"Larger models learn the correct objective more quickly than they learn to reliably pursue it." Sounds like me. 😄
New Anthropic Fellows research: How does misalignment scale with model intelligence and task complexity?
When advanced AI fails, will it do so by pursuing the wrong goals? Or will it fail unpredictably and incoherently—like a "hot mess?"
Read more: https://t.co/xzRSoJg43j
Unitree’s $30,000, 180cm (nearly 6 ft tall) H2 humanoid robot. This version is likely powered by an Nvidia Jetson AGX Thor platform, built on Blackwell architecture.
Wow. 👏
Claude ran a grow tent and kept a tomato plant, Sol, alive.
A small computer, the Arduino, reads sensors for light, soil moisture, temp, and humidity, then follows Claude’s written instructions to control lights, heater, fan, and pump.
Arduino flips the correct relay pins high or low, which closes the power circuit to that device. Light turns on, mat heats, fans spin, pump pushes water through tubing to the pot. A drip ring or emitter spreads the water around the roots.
Feedback loop: sensors update, then Claude checks whether targets are met, for example soil moisture above 35%, humidity near 55% to 65%, and temp in range. If not, it tweaks runtimes or timing.
Claude watches the live data and camera, reasons, and sends commands or code updates. When the Arduino crashed on day 34, Claude noticed bad readings, turned systems back on in the right order, watered 600ml, then kept tuning humidity and heat.
Result: the tomato plant recovered and kept growing, reaching 15-20 leaves by day 36.
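The target check in that feedback loop can be sketched roughly as below. This is a minimal illustration, not the actual setup: the `Reading` type, the `decide` function, the device names, and the temperature range are all hypothetical. Only the 35% soil moisture floor and the 55% to 65% humidity band come from the post.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One snapshot of sensor values (hypothetical structure)."""
    soil_moisture_pct: float
    humidity_pct: float
    temp_c: float


# Soil moisture and humidity targets are taken from the post.
SOIL_MIN_PCT = 35.0
HUMIDITY_RANGE_PCT = (55.0, 65.0)
# The post only says "temp in range"; this range is an assumed placeholder.
TEMP_RANGE_C = (20.0, 28.0)


def decide(reading: Reading) -> dict:
    """Return the desired on/off state for each relay-driven device."""
    _, humidity_max = HUMIDITY_RANGE_PCT
    temp_min, temp_max = TEMP_RANGE_C
    return {
        # Water when the soil is drier than the target floor.
        "pump": reading.soil_moisture_pct < SOIL_MIN_PCT,
        # Vent when the tent is too humid or too warm.
        "fan": reading.humidity_pct > humidity_max or reading.temp_c > temp_max,
        # Heat when the tent is colder than the assumed range.
        "heater": reading.temp_c < temp_min,
    }
```

Calling `decide(Reading(30.0, 70.0, 18.0))` would switch on the pump, fan, and heater. The lights are left out because the post describes them as runtime- or timing-based rather than sensor-driven.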
@bcherny @karpathy The Zen parable of "Empty the cup" or Beginner's Mind (Shoshin) is practically becoming a dynamic recalibration protocol for our internal world models.
“He who is only an athlete is too crude, too vulgar, too much a savage. He who is a scholar only is too soft, too effeminate. The ideal citizen is the scholar athlete, the man of thought and the man of action.”
- Plato
“The society that separates its scholars from its warriors will have its thinking done by cowards and its fighting by fools.”
- Thucydides
Lift weights.
Learn to fight.
Read books.
Learn to write.