Co-founder & General Partner, The Venture Reality Fund (@TheVRFund): investing in Spatial AI & Physical AI 🤖 | @stanford @carnegiemellon @SJSU @NPS_Monterey
Here’s the book to read for MPC, by Profs. Bryson (“The Father of Modern Control Theory”) and Ho
@stanford @StanfordEng
Applied Optimal Control https://t.co/lO2pwC7JZX
ByteDance has published a paper that should make every NVIDIA investor sweat.
They trained an AI that writes CUDA better than human experts.
They call it CUDA Agent.
And it completely rewrites the economics of AI hardware.
They built a massive agentic reinforcement learning loop. The AI writes a kernel, compiles it, profiles the hardware, analyzes the bottlenecks, and rewrites the code until it's flawless.
It learned how to optimize memory access patterns and hardware tiling strategies that traditional compilers miss.
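The write, measure, rewrite cycle can be sketched in a few lines. This is a toy, not the paper's method: `profile` and `propose` are hypothetical stand-ins for compiling and timing a kernel and for the model's rewrite, and the search is plain greedy hill climbing rather than reinforcement learning.

```python
import random


def profile(tile):
    """Hypothetical stand-in for compile + profile: returns a runtime in ms.

    A real loop would build the kernel and time it on hardware. Here a toy
    cost model with its minimum at tile=32 plays that role.
    """
    return 1.0 + abs(tile - 32) / 32.0


def propose(tile, rng):
    """Hypothetical stand-in for the model's rewrite: halve or double the tile."""
    return max(1, tile // 2) if rng.random() < 0.5 else tile * 2


def optimize(initial_tile, steps, seed=0):
    """Greedy loop: keep a rewrite only if it measures faster than the best so far."""
    rng = random.Random(seed)
    best_tile, best_ms = initial_tile, profile(initial_tile)
    for _ in range(steps):
        candidate = propose(best_tile, rng)
        ms = profile(candidate)
        if ms < best_ms:
            best_tile, best_ms = candidate, ms
    return best_tile, best_ms
```

Calling `optimize(4, 50)` can never return something slower than the starting tile, because a candidate is only accepted when its measured time improves.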
The results are staggering.
On the industry-standard KernelBench, CUDA Agent completely destroyed traditional compilers.
It delivered code that runs up to 3.2x faster than PyTorch's native execution.
On the hardest, most complex models, it beat the strongest proprietary models in the world, including Claude Opus 4.5 and Gemini 3 Pro, by 40%.
It didn't just match human experts. It started discovering optimizations that static compilers literally cannot see.
Here is why this is a massive threat to NVIDIA.
NVIDIA's dominance relies on the fact that CUDA is incredibly hard to master. Developers get locked in because optimizing code for other chips is too painful.
But if an AI agent can autonomously generate hyper-optimized hardware kernels...
You don't need a team of $500k-a-year CUDA engineers to build world-class infrastructure.
And if an AI can autonomously master CUDA, it can master AMD's ROCm. Or custom silicon.
The impenetrable software wall protecting NVIDIA's monopoly just got breached by a reinforcement learning loop.
If anyone can automatically squeeze maximum performance out of any chip...
Hardware becomes a commodity.
🌟Your static 3D world models are now alive and interactable!
🚀Introducing NeuROK, a neural simulation framework that turns any static 3D object into an interactive 4D asset — no per-category physics, no physical annotations for training.
📄 https://t.co/PSAILjHmZb
🧵 1/n
I'm excited to announce that the Morpheus AI team is joining Roblox!
Over the past two years, I’ve focused on developing the foundational architectures behind modern video world models, including Self Forcing and AR-DiT. This work unlocked something unprecedented: the ability to move beyond offline, pre-rendered AI video generation and instead simulate interactive worlds in real time. Realizing the massive potential of this technology is what drove me to found Morpheus in August 2025. In the months since, our incredible team has pushed those boundaries further than we ever thought possible.
We've always believed video world models will reshape how games are created. Roblox Reality is an ambitious bet on that exact future, and it lines up perfectly with what we set out to do: bridging the gap between deterministic game engines and generative world models. Joining Roblox means our technology will help power experiences that reach millions of players every day.
To our team, to @a16z and other investors, and to the advisors, partners, and supporters who believed in this from the very beginning — thank you.
We're just getting started. Excited to build this at scale.
Introducing Project Eden, a world model research preview from @VASTAIResearch
Project Eden is a persistent, multiplayer world model that fundamentally breaks from existing paradigms by decoupling the underlying world state from visual rendering.
Instead of treating the world as a sequence of transient frames, Eden treats it as a structured, evolving environment that runs continuously, can be modified by user actions, and can be consistently observed from any viewpoint.
Introducing Cosmos 3: Our latest frontier model for Physical AI
Cosmos 3 is the world’s first fully open omnimodel with native vision reasoning, world and action generation.
Today we’re releasing Super (32B) and Nano (8B) variants.
🐾 @saturdayrobotic Robotics & World Models Reading Club 10 Recap: From Platform → Instincts → Real-World Learning: Roadmap to 🐱Cat-Level Humanoid Intelligence
Keynote: Bringing Robots to Life — Learning Humanoid Instincts from the Body Up, by @HaochenShi74 (@Stanford PhD, adv. Karen Liu & Shuran Song), presents a full-stack humanoid loco-manipulation program spanning hardware, learning, and real-world deployment. Hosts @junfanzhu98, @aurorafeng_01.
🤖 Stage 0: ToddlerBot Platform (ML-compatible embodiment)
Open-source humanoid designed for learnability, not just capability. 30 DoF full-body design (arms/legs/torso/head), dual grippers, 2× fisheye cams, IMU, mic/speaker, Jetson Orin NX, 2–5h battery. Spur/bevel/linkage transmissions. Core idea: hardware is sufficient; bottleneck is learning. Key enablers: exact URDF digital twin, zero-point calibration, motor system ID capturing friction/backlash/controller response, and full actuation model (torque–velocity limits). Teleop via joystick + VR. Sim2real depends on physics-accurate sysID, not kinematics alone.
🧠 Stage 1: Instincts (survival layer)
Locomotion: keyframes → RL w/ domain randomization → vision skill planner (depth+IMU) 3.1Hz. Policy: 3-layer MLP 50Hz (low natural freq).
Motor Current-based Compliance (MCC): no force sensors. External wrench inferred from motor current/voltage + Jacobians + motor model → spring-damper correction. Works across whole body, any contact. Diffusion policy (200 demos, ~80% success), OCHS servoing (21° vs 2–3°), LEAP hand VLM skills, heart-drawing ablation shows wrench estimator key. Generalizes across robots (Unitree G1 etc.), framing compliance as embodiment-level safety primitive.
Energy autonomy: self-charging docking for continuous operation.
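A rough sketch of the MCC idea from the recap (current → residual torque → wrench via the Jacobian → spring-damper correction), reduced to a planar 2-link arm. Everything here is an assumption for illustration: the link lengths, torque constant `kt`, gains, sign convention, and the static form of the spring-damper law are not from the talk.

```python
import math


def jacobian(q1, q2, l1=0.1, l2=0.1):
    """Planar 2-link Jacobian: joint velocities -> end-point (x, y) velocity."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def estimate_force(currents, tau_model, q, kt=0.5):
    """Infer the end-point force from motor currents, with no force sensor.

    The residual torque kt * i - tau_model is attributed to contact, and
    tau_ext = J^T F, so F = (J^T)^-1 tau_ext (2x2 inverse written out).
    """
    tau_ext = [kt * i - t for i, t in zip(currents, tau_model)]
    J = jacobian(*q)
    a, b = J[0][0], J[1][0]  # first row of J^T
    c, d = J[0][1], J[1][1]  # second row of J^T
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("singular configuration, force not recoverable")
    fx = (d * tau_ext[0] - b * tau_ext[1]) / det
    fy = (-c * tau_ext[0] + a * tau_ext[1]) / det
    return fx, fy


def compliance_offset(force, velocity, k=200.0, damping=5.0):
    """Spring-damper correction: per-axis offset x with k * x + damping * v = F."""
    return tuple((f - damping * v) / k for f, v in zip(force, velocity))
```

The offset would be added to the position target, so the limb yields in the direction it is pushed. A whole-body version would use the full Jacobian and a pseudo-inverse instead of the 2x2 inverse.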
🌍 Stage 2: Real-world learning (RTR)
Robot Trains Robot replaces humans with a robot-arm teacher: reward via F/T sensing, XY compliant support, Z-axis curriculum withdrawal, perturbation, failure detection, and auto-reset. Enables safe real-world RL without humans. Key method: latent dynamics gap z optimized from real rollouts + FiLM-conditioned actor/critic. Demonstrated on walking & swing-up from scratch.
📊 Key insight: highest-value data = real exploration (expensive, scarce). Must survive to collect it → instincts are prerequisite for data flywheel.
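For readers new to FiLM conditioning: a latent `z` produces a per-feature scale and shift that modulate a hidden layer. A minimal sketch in plain Python follows. The layer sizes and weights are made up, and how RTR actually wires `z` into its actor and critic is not specified in the recap.

```python
def linear(weights, bias, x):
    """Dense layer without activation: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film(h, z, w_gamma, b_gamma, w_beta, b_beta):
    """FiLM: h' = gamma(z) * h + beta(z), applied feature by feature."""
    gamma = linear(w_gamma, b_gamma, z)
    beta = linear(w_beta, b_beta, z)
    return [g * hi + b for g, hi, b in zip(gamma, h, beta)]


# Hypothetical example: a 1-D latent modulating a 2-feature hidden vector.
h = [3.0, 4.0]
z = [2.0]
out = film(h, z,
           w_gamma=[[1.0], [0.0]], b_gamma=[0.0, 1.0],
           w_beta=[[0.0], [0.5]], b_beta=[0.0, 0.0])
```

Because only `z` changes between simulation and the real robot, the policy weights can stay fixed while `z` is fitted to real rollouts.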
🧩 Big picture
Platform = learnable embodiment (URDF + sysID + teleop)
Instincts = survival (locomotion + MCC compliance + autonomy)
RTR = self-scaling real-world data engine
⚠️ Reality gap: sim2real still hard for manipulation; locomotion works better. No established robotics scaling law; foundation model form remains unclear.
🐱 Summary: Simulation starts robots. Instincts keep them alive. Real-world experience makes them intelligent.
FWIW, I don't view Ray and Slurm as alternatives to each other; I think of them as solving different problems, e.g.:
Slurm is responsible for sharing compute resources among multiple workloads and multiple users. It provides workload multitenancy, queuing, prioritization, preemption, etc.
Ray is an actor framework and provides a distributed runtime for a single workload. It provides a single-controller programming model for distributed workloads, manages & coordinates processes, handles failures, etc.
It's very natural to run a Ray workload on top of Slurm, similar to how you'd run a Ray workload on top of Kubernetes.
Today Reactor is coming out of stealth. We’ve raised $59M in Seed and Series A funding, led by @lightspeedvp, with participation from @AmplifyPartners, @wndrco, @Sky9Capital, and @FPVventures.
Reactor is the platform for building in the World Model era: the infrastructure that lets developers build with them at global scale for the first time. Stream from a frontier World Model to your app, in real time, all in under 10 lines of code.
World Models represent the next major shift in AI: pixels, audio and actions are generated on the fly, in real time, in response to user inputs and to the environment. Every time computing has made a shift from passive to interactive, entire industries appeared that didn't exist before. We're standing in front of such a moment again.
Over the last 6 months, we’ve assembled an all-star team with alumni from Apple, Meta, Google, Luma AI, Netflix, and Replicate. We're already partnering with some of the biggest names and labs in the world, and hundreds of developers are already building on Reactor.
The World Model era starts now.
🔥 We release Gamma-World from @nvidia — a generative multi-agent world model that finally goes beyond 2 players.
⚡ 24 FPS real-time streaming
🧩 Simplex Rotary Agent Encoding — permutation-symmetric
🌐 Sparse Hub Attention: O(N²) → O(N)
🎯 2 → more players, zero-shot
🤖 Games → real multi-robot worlds
━━━━━━━━━━━━━━━
💥 THE SINGLE-AGENT ERA IS OVER. 💥
━━━━━━━━━━━━━━━
🔗 https://t.co/TuyX2d2XuT
We are back again :) After three weeks of quiet building.
Introducing Genesis World 1.0, our latest simulation platform, the second release in our full-stack suite. Open-sourced.
Robotics is still bottlenecked by the 1× speed of the physical world. Every model, checkpoint, and data recipe eventually needs to be tested on physical hardware, slowly, expensively, and with limited coverage.
One hour in reality can become 100 days in simulation. That is how robotics model iteration moves from a wall-clock bottleneck to a compute problem.
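To put a number on that ratio: 100 days is 2,400 hours, so one wall-clock hour has to cover 2,400 simulated hours. The sketch below does the arithmetic. The per-environment speed passed to `envs_needed` is a hypothetical parameter, not a Genesis figure.

```python
import math


def required_speedup(sim_days, real_hours):
    """Simulated hours delivered per wall-clock hour."""
    return sim_days * 24 / real_hours


def envs_needed(speedup, per_env_realtime_factor):
    """Parallel environments needed if each one runs at the given multiple of real time."""
    return math.ceil(speedup / per_env_realtime_factor)


speedup = required_speedup(100, 1)   # 2400.0
envs = envs_needed(speedup, 1.0)     # 2400 environments at 1x real time each
```

Running each environment faster than real time reduces the count proportionally.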
To make this work, simulation has to be both fast and trustworthy.
Over the past year, we rebuilt the entire stack: a GPU-accelerated cross-platform compiler, penetration-free multi-physics contact solvers, unified rigid and deformable physics, and a photo-realistic renderer purpose-built for physical AI applications.
We built Nyx, a high-performance path-traced rendering engine for robotics applications.
Genesis World 1.0 achieves near-real-time performance with our latest penetration-free IPC solver, which supports various types of deformables in addition to rigid bodies. It supports contact-rich, dexterous manipulation simulation across different embodiments: Unitree, Sharpa, Wuji, Genesis Hand, and various types of grippers.
Under the hood is Quadrants, our effort in pushing forward cross-platform GPU-accelerated computation. Quadrants started as a fork of Taichi, and we rebuilt most of the critical parts for optimizing simulation workloads, giving 10x faster launch time and up to 4.6x runtime performance compared to the initial Genesis release.
Together, they bring us to an unprecedentedly low sim-to-real gap, enabling zero-shot real-to-sim model evaluation and much faster iteration of GENE.
All available today.
Genesis World 1.0: https://t.co/aknCM3eqws
Quadrants: https://t.co/uXqPNI4cb6
Nyx: https://t.co/R8j0djqGnV
🔻Excited to share our new work: TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
🚀Sparse pose-free images → Simulation-ready triangle mesh in 0.57s. No TSDF. No Poisson. Drop into Unity / Isaac Sim.
🎮 Interactive demo: https://t.co/aVF2McLU5a
What if your phone could generate or edit images in under a second?
ByteDance’s Intelligent Creation Lab presents DreamLite, a compact 0.39B parameter model that unifies text-to-image generation and editing in one network. It uses a simple trick: concatenating images side-by-side as input — blank for generation, source for editing.
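The side-by-side trick is easy to picture with nested lists standing in for images. This is only an illustration: which side holds the source, the blank fill value, and the function names are assumptions, not DreamLite's actual interface.

```python
def blank(height, width, value=0):
    """An empty image of the given size."""
    return [[value] * width for _ in range(height)]


def side_by_side(left, right):
    """Concatenate two equal-height images along the width axis."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def build_input(canvas, source=None):
    """One input layout for both tasks.

    Generation: the left half is blank. Editing: the left half is the source image.
    The right half is the canvas the model fills in.
    """
    height, width = len(canvas), len(canvas[0])
    left = blank(height, width) if source is None else source
    return side_by_side(left, canvas)


canvas = [[9, 9], [9, 9]]
gen_input = build_input(canvas)                            # blank | canvas
edit_input = build_input(canvas, source=[[1, 2], [3, 4]])  # source | canvas
```

Since both tasks share one input shape, a single network can serve generation and editing without a task-specific branch.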
DreamLite achieves GenEval 0.72 (generation) and ImgEdit 4.11 (editing), outperforming existing on-device models and rivaling server-side systems. With step distillation, it runs a 1024×1024 image in less than 1s on a Xiaomi 14.
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing
Paper: https://t.co/Pg34jShWhg
Project: https://t.co/RKA1pc7k3O
Code: https://t.co/uecgs0YeiT
Our report: https://t.co/VGlEopobcF
📬 #PapersAccepted by Jiqizhixin
The Center for Decoding the Universe teams up with @APSphysics's Group on Data Science and @NeurIPSConf's Machine Learning and Physical Sciences Workshop team to present the 2026 Conference on Physics and AI (PAI26). Learn more about this year's agenda: https://t.co/eCXeSzBtNo
Major theatrical releases coming to home audiences in 3D via headsets is a win all around for XR. Apple and Vision Pro are leading the charge on this right now, but I really hope it opens the door to 3D movies getting released widely across all major headsets.
Gemini Omni doesn't just build scenes that look real, it reasons about what should happen next. It combines an intuitive understanding of physics with Gemini's knowledge of history, science, and cultural context.
Rolling out today starting with video outputs to Google AI Plus, Pro and Ultra subscribers globally through the @Geminiapp + Google Flow, and @YouTube Shorts this week.
Real-world models are here! Stoked to share how we're bringing real-world locations to life by integrating Street View into Genie. Try it now at https://t.co/j6c1N38tRS and read the blog for more info:
https://t.co/6ZOi9d9rah
You can’t lift a fridge with just your hands. Your whole body needs to conform to its shape, and bear the load between your arms and torso.
Here, @BostonDynamics' Atlas uses proprioception to manage the whole-body interaction and adapt to a shifting 100+ lb load. Enabling this type of high performance manipulation is exactly why we walked away from what was arguably the world’s best implementation of MPC for humanoids, and shifted entirely to RL without looking back.
This level of whole-body control is a fundamental building block of physical intelligence and key to the value proposition of humanoids.
More technical details in:
Blog: https://t.co/oIRjVfh7jJ
Behind the scenes video: https://t.co/LgaImMAyhX
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.