We just launched Sites in Codex!
Software creation was always about more than writing code. Sites in Codex fundamentally gives the power of end-to-end software creation to every user, no matter their technical fluency.
These Sites are fully deployed to a URL, private to workspaces, come with authentication, can have static files, and can store dynamic data in databases.
It is in preview for business and enterprise teams and will be rolling out to all workspaces over the next day. Give it a try by typing @ Sites into Codex and asking it to build anything!
This project took a massive amount of effort across hundreds of people at OpenAI - proud that we were able to get this out and excited to see what you all build with it!
How do you teach a humanoid to assist another person in close contact? 🤖
The hard part: the two bodies are physically coupled — helper & helped continuously shape each other's motion.
Neither can be solved alone.
Meet AssistMimic, our multi-agent RL framework👇🧵 #CVPR2026
Introducing SpatioSim.
From ideas and real-world captures
to editable 3D Worlds.
Generate. Edit. Simulate.
Built for creators🎬, game makers🎮, and robotics teams🤖.
Beta soon 🧪👀
Excited to release τ0-WM: an open-source unified video-action world model for robotic manipulation.
It's a 5B-parameter robotic foundation model trained on 27.3K hours of real-robot teleoperation, UMI-style demonstrations, and egocentric interaction videos.
I printed a patient's discharge summary, packed with names, IDs, phone, email, and address, then scanned it with my iPhone.
31 pieces of PHI masked on-device. Then it pulled the diagnoses, meds, and doses out as clean data.
No server. No Python. Nothing leaves the phone. 👇
Introducing Project Eden, a world model research preview from @VASTAIResearch
Project Eden is a persistent, multiplayer world model that fundamentally breaks from existing paradigms by decoupling the underlying world state from visual rendering.
Instead of treating the world as a sequence of transient frames, Eden treats it as a structured, evolving environment that runs continuously, can be modified by user actions, and can be consistently observed from any viewpoint.
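To make the "state decoupled from rendering" idea concrete, here is a minimal toy sketch. It is not Project Eden's implementation: `WorldState`, `apply_action`, and `render` are hypothetical names, and the world is just 2D positions. The point it illustrates is that user actions mutate one shared state, while each viewer's image is derived from that state on demand.

```python
from dataclasses import dataclass, field


@dataclass
class WorldState:
    """Hypothetical shared world: object name -> (x, y) position."""
    positions: dict = field(default_factory=dict)
    tick: int = 0

    def apply_action(self, name, dx, dy):
        # User actions change the underlying state, not any rendered frame.
        x, y = self.positions.get(name, (0.0, 0.0))
        self.positions[name] = (x + dx, y + dy)

    def step(self):
        # The world keeps advancing whether or not anyone is looking.
        self.tick += 1


def render(state, camera_xy):
    """Pure function of (state, viewpoint): positions relative to the camera."""
    cx, cy = camera_xy
    return {name: (x - cx, y - cy) for name, (x, y) in state.positions.items()}


world = WorldState(positions={"crate": (2.0, 3.0)})
world.apply_action("crate", 1.0, 0.0)  # one player pushes the crate
world.step()

view_a = render(world, (0.0, 0.0))  # first player's viewpoint
view_b = render(world, (1.0, 1.0))  # second player's viewpoint
```

Because both views are computed from the same `world`, the two players cannot disagree about where the crate is; they only see it from different positions.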
NVIDIA announces the first open humanoid robot reference design built for robotics research.
The NVIDIA Isaac GR00T Reference Humanoid Robot combines the @UnitreeRobotics H2 humanoid robot, @SharpaRobotics Wave five-fingered hands for dexterous manipulation, Jetson Thor onboard compute, and Isaac GR00T open software and models, giving researchers a full-stack platform from data capture to model deployment.
Read the #NVIDIAGTC Taipei announcement: https://t.co/ZsT3qQKucb
This is THE moment of Physical AI!
We are officially announcing Cosmos 3: Omnimodal World Models for Physical AI 🚀
- Cosmos 3 is an omnimodal world model: within a unified architecture, it can understand and generate language, images, video, audio, and actions.
- It is not just a VLM, not just a video generator, not just an audio-visual generative model, and not just a physics simulator / world-action model. It can understand images and videos, generate images, videos, and audio, simulate future worlds, predict actions, and generate robot policies—enabling models to truly begin to “touch the world.”
- Cosmos 3 is the #1 open-weight reasoner / T2I / I2V / robot policy across many benchmarks.
Huge thanks to every teammate who fought side by side on this journey—from architecture, data, training, infra, serving, and evaluation to post-training. Every part of this project carries an incredible amount of hard work. This was my first time leading a project as Tech Lead, and I feel truly fortunate.
The future of Physical AI needs models that can not only “see” and “describe” the world, but also “imagine,” “simulate,” and “act”—and eventually close the loop with the real world. I hope Cosmos 3 can become an important starting point for this direction, and I’m excited to push Physical AI into its next stage together with the open-source community.
Welcome to the era of Physical AI.
HuggingFace: https://t.co/QW5h5pIWWM
Project Website: https://t.co/Jppa0gkn16
Code: https://t.co/aJgaLm5BaG
CubePart is a text-to-3D generator for game-ready assets. It lets you specify the exact parts you need, so outputs can be animated, scripted, or used in physics.
Free & open source
https://t.co/Wl6ZKhuQRr
This #CVPR2026 paper from our research team is trending #1 on @HuggingFace 🤗
Meet LocateAnything: a vision-language detection model that rethinks bounding box prediction. For AI agents and robots, “seeing” is only useful if a model can pinpoint where something is fast enough to act.
Trained on 138M high-quality samples, LocateAnything decodes bounding boxes in parallel instead of one coordinate at a time, improving localization accuracy while dramatically increasing throughput for visual grounding and detection.
Project page: https://t.co/O7JMe8tzFM
One of the biggest challenges in video to motion is scale!
Most of the time people think of it as distance to the camera, but when you add multiple people, the scale between them matters almost more!
At Cartwheel, with our newest model, we're really proud to capture distance and character scale accurately.
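A quick sketch of why scale is ambiguous from a single camera, assuming a simple pinhole model (the focal length and body heights below are made-up illustration values, not anything from the model described above): projected height is proportional to real height divided by depth, so a tall person far away and a short person nearby can cover the same number of pixels.

```python
def projected_height_px(height_m, depth_m, focal_px=1000.0):
    """Pinhole projection: image height in pixels for an upright figure.

    focal_px is an assumed focal length expressed in pixels.
    """
    return focal_px * height_m / depth_m


# Hypothetical example: an adult far from the camera, a child close to it.
adult_far = projected_height_px(height_m=1.8, depth_m=6.0)
child_near = projected_height_px(height_m=0.9, depth_m=3.0)

# Same pixel height, so image size alone cannot separate scale from distance.
same_size = abs(adult_far - child_near) < 1e-9
```

With several people in the frame, getting each person's height-to-depth ratio wrong independently also breaks their sizes relative to each other, which is the multi-person issue raised above.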
Made a free Pixal3D demo (Tencent's new image-to-3D model) because I like it a lot 🔥
What's interesting is the pixel-aligned generation: every point in the mesh ties back to a specific input pixel, so silhouettes, textures, and tiny details actually survive into the 3D asset instead of drifting.
⬇️ Sharing the Hugging Face link
‼️UPDATE: Pixal3D is now under the MIT License.
We hope this makes Pixal3D easier to use, build upon, and adopt for broader research and applications.
Thanks again for your support, and feel free to let us know your feedback!
Today, I ported Nanite-style tech to Godot for my Gaussian Splatting standard mesh, and the performance is already very good. I haven’t added DLSS yet, which should make it even smoother and help reduce aliasing.
For the Nanite part, I based my work on the PR that appeared three days ago on the three.js GitHub:
https://t.co/DDv23UjwL8
Big news: I found a way to convert Gaussian Splatting into a standard mesh while preserving all the visual advantages of Gaussian Splatting, such as reflections, shadows, and lighting that changes depending on the viewing angle.
The result is excellent performance, even on an old laptop or a VR headset. And of course, the mesh can be animated like any regular 3D mesh, while still keeping the visual benefits of GS. I’ll soon show a full-body character animated using this technique.