New from SpAItial. You can now:
- Create 3D worlds directly from the Claude desktop app using the new SpAItial AI plugin
- Create worlds with your agents using the SpAItial MCP. Works with Claude Code, Codex, OpenCode, VS Code, Windsurf, etc.
Create, edit, refine your worlds without leaving your chat! Links below 👇
The best part: it's all open source (GPLv2). Repo + local setup docs here 👇
https://t.co/PS29fVwTON
Build a map from your own .spz and run the whole stack locally. Point Cursor/Claude at the README and it'll walk you through setup.
Photoreal walls. Real geometry. Bots. WebRTC and all open source.
Here's how we built a fully multiplayer, browser-playable Quake III arena where every level is a 3D world created from a single image using Echo-2 🧵👇
Enter the Backrooms - in 15 minutes
🪄 Upload a movie still to @SpAItial_AI Echo-2
⏬ Download a 3D Gaussian splat PLY file
🥽 Upload to @PlayCanvas SuperSplat and enter VR
[1 / 3]
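Before uploading the downloaded PLY anywhere, it can help to check what is in it. The sketch below reads only the standard PLY header and reports the element counts and property names. It assumes nothing about Echo-2's export beyond it being a valid PLY file; the demo bytes and the `opacity` property are made up for illustration, and the function name `read_ply_header` is hypothetical.

```python
import io


def read_ply_header(stream):
    """Parse a PLY header from a binary stream.

    Returns (format_name, {element_name: (count, [property names])}).
    Only the header is read; the payload after end_header is left untouched.
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    fmt = None
    elements = {}
    current = None
    for raw in stream:
        parts = raw.decode("ascii").split()
        if not parts or parts[0] == "comment":
            continue
        if parts[0] == "format":
            fmt = parts[1]
        elif parts[0] == "element":
            current = parts[1]
            elements[current] = (int(parts[2]), [])
        elif parts[0] == "property" and current is not None:
            # The property name is always the last token on the line.
            elements[current][1].append(parts[-1])
        elif parts[0] == "end_header":
            return fmt, elements
    raise ValueError("missing end_header")


# Tiny in-memory stand-in for a downloaded file (illustrative data only).
demo = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\nelement vertex 2\n"
    b"property float x\nproperty float y\nproperty float z\n"
    b"property float opacity\nend_header\n" + b"\x00" * 32
)
fmt, elements = read_ply_header(demo)
count, props = elements["vertex"]
print(fmt, count, props)
```

For a real file, pass `open(path, "rb")` instead of the `BytesIO` demo; for a Gaussian splat PLY, the `vertex` count is typically the number of splats.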
We added Quake3-style multiplayer to our 3D world generator and it changes the game, literally 🔥🔥🔥
Runs directly on our model output: fast movement, arena combat, and every single match plays out on a generated world.
Stay tuned for public release and play yourself!
Available at https://t.co/lVStEkoXPq and can be used via the API and in the app.
Take a look at some example scenes we generated:
https://t.co/5Bhg2VybiT
https://t.co/0mliuG0drx
https://t.co/QJFxmG7Z6s
Releasing Echo-2 HQ, an improved model that delivers greater detail and sharper results.
You can zoom in super close and discover remarkable appearance fidelity. Available via API and in the app. Try it out!
Check out some of the scenes below👇
3D world models are mostly static - not anymore🔥
We're building physically-grounded worlds with dynamics.
For now, we use geometry from our model in a physics engine. However, in the future, we will support native physics, where the model itself becomes the physics engine.
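To make "geometry in a physics engine" concrete, here is a minimal sketch of the idea, not SpAItial's or any engine's actual code: a point mass falls under gravity and is stopped by a triangle mesh, using Möller–Trumbore ray/triangle intersection to find the surface below it. The two-triangle floor, the y-up convention, and every function name are assumptions made for illustration.

```python
GRAVITY = -9.81  # m/s^2 along y (assumes y is "up")


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore: distance along the ray to the triangle, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:  # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t >= 0.0 else None


def floor_height(x, y, z, triangles, probe=1.0):
    """Height of the nearest surface below (x, y + probe, z), or None."""
    origin = (x, y + probe, z)  # start slightly above to avoid missing at rest
    hits = []
    for tri in triangles:
        t = ray_triangle(origin, (0.0, -1.0, 0.0), tri)
        if t is not None:
            hits.append(t)
    return origin[1] - min(hits) if hits else None


def simulate_drop(start, triangles, steps=120, dt=1.0 / 60.0):
    """Semi-implicit Euler under gravity; the mesh stops the fall (no bounce)."""
    x, y, z = start
    vy = 0.0
    for _ in range(steps):
        vy += GRAVITY * dt
        new_y = y + vy * dt
        ground = floor_height(x, y, z, triangles)
        if ground is not None and new_y <= ground:
            new_y, vy = ground, 0.0
        y = new_y
    return y, vy


# Hypothetical stand-in for generated geometry: a 2 m x 2 m floor at y = 0.
FLOOR = [
    ((-1.0, 0.0, -1.0), (1.0, 0.0, -1.0), (1.0, 0.0, 1.0)),
    ((-1.0, 0.0, -1.0), (1.0, 0.0, 1.0), (-1.0, 0.0, 1.0)),
]
final_y, final_vy = simulate_drop((0.5, 2.0, -0.25), FLOOR)
print(final_y, final_vy)
```

A real engine adds broad-phase acceleration, swept shapes, friction, and restitution; the point here is only that a static mesh is enough to ground simple dynamics.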
I just one-shotted a 3D world with proper collision physics using the @SpAItial_AI API and @playcanvas
Here's how I did it:
-> Install the spaitial-playcanvas-world skill: npx skills add spaitial-dev/spaitial-playcanvas-world
-> Open Cursor or your preferred coding agent
-> Paste the following prompt or use your own:
"Using /spaitial-playcanvas-world skill and the spaitial api key provided build a PlayCanvas game from this world prompt:
An empty monumental desert sci-fi palace interior with colossal stone arches, sand-dusted floors, bronze industrial machinery, filtered sunlight through high slit windows, carved geometric walls, long ceremonial corridors, immense scale, cinematic warm shadows, ancient-futurist architecture, quiet unoccupied environment. No people, no humans, no characters."
This skill covers world and collision mesh generation using the SpAItial API, and viewer + physics using PlayCanvas.
Introducing the SpAItial API to help you build immersive 3D worlds programmatically.
You can now create worlds from text, an image URL, an uploaded image, or a 360 panorama with code. Track generation status, download results, export meshes and more. ⬇️
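The "track generation status" step usually comes down to a polling loop. The sketch below shows that pattern only and does not use the real SpAItial API: the `get_status` callable, the status strings (`queued`, `running`, `succeeded`, `failed`), the `download_url` field, and the example URL are all hypothetical placeholders to be replaced with whatever the official API docs specify.

```python
import time


def wait_for_world(get_status, world_id, poll_seconds=5.0,
                   timeout_seconds=600.0, sleep=time.sleep,
                   clock=time.monotonic):
    """Poll a generation job until it finishes, fails, or times out.

    get_status is any callable taking a world id and returning a dict with a
    "status" key (hypothetical shape, not the real API response).
    """
    deadline = clock() + timeout_seconds
    while True:
        job = get_status(world_id)
        if job["status"] == "succeeded":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"generation failed for {world_id}")
        if clock() >= deadline:
            raise TimeoutError(f"gave up waiting for {world_id}")
        sleep(poll_seconds)


# Fake backend so the sketch runs offline; a real client would make an
# authenticated HTTP request here instead.
_responses = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "succeeded",
     "download_url": "https://example.invalid/world.ply"},
])


def fake_get_status(world_id):
    return next(_responses)


result = wait_for_world(fake_get_status, "demo-world",
                        poll_seconds=0.0, sleep=lambda seconds: None)
print(result["status"], result["download_url"])
```

Injecting `get_status`, `sleep`, and `clock` keeps the loop testable without network access and makes it easy to swap in the real client later.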
Want to build models that generate amazing 3D worlds? We're hiring at SpAItial AI🚀🚀🚀
Looking for a range of Research Engineer / Research Scientist (RE/RS) roles:
- Diffusion & Transformers
- Large-scale Training / Gen AI
- 3D Graphics / Modeling
- ML & Cloud Infra / GPUs
Reach out to us - more details👇
✨App update: 3D worlds can now be created from 360 panoramas!
Ideal for digital twinning -- makes sure that environments get faithfully reconstructed.
Try it out: https://t.co/kMlQCU6992
We wired 3D Gaussian Splatting into the 1999 Quake 3 engine - it's fully playable!
Echo-2 generates the game levels, and it's rendering spaces directly inside the open source version of id Tech 3.
Anyone want to try it?
We❤️Gaming 💕
AI-generated first-person shooter made easy: input an image -> Echo-2 generates world -> let's play!
We're already having fun playing but it'll be open soon :)
Welcome Echo-2 🚀
I’ve been using this for the past few weeks — honestly pretty wild what the team has built.
Excited for the rest of you to try it.
One of my generations 👇
🚀Echo-2 is here - our new world model!
These aren’t videos. These are 𝟑𝐃 𝐬𝐜𝐞𝐧𝐞𝐬. Generated from a single image.
- Stunning visual quality.
- Real-time rendering.
- Interactive camera control.
- Physically grounded.
🧵More details👇
Large foundation models have made enormous progress in modeling language, images, and video. These systems can generate highly realistic outputs and capture complex statistical structure in data. However, they still operate on projections of the world, text sequences and 2D pixel grids, rather than the world itself.
The real world is not a sequence of text tokens or frames; the real world is inherently anchored in 3D metric space, and dynamics across time. Objects occupy space and persist over time. They interact according to physical laws. Any model that aims to support real-world intelligence, e.g., for robotics, simulation, design, or spatial computing, must capture this structure.
This is where current approaches fall short. While most video models can generate visually plausible frames, they often lack a consistent notion of the underlying scene due to limited context windows. As a result, geometry drifts, scale is ambiguous, objects appear and disappear, and interactions are not physically grounded. The model produces superficial appearance without a persistent world representation.
For many downstream applications, this is not enough.
The first step toward addressing this is modeling 3D space and keeping it consistent. A model should recover a coherent spatial representation of the scene, including layout, geometry, and scale. This not only allows the environment to be rendered from new viewpoints but also, more critically, reasoned about in metric space. If a model cannot produce a stable 3D representation, it is not grounded in the physical world, and it will fail to model the world due to its inefficient contextual memory.
However, 3D is only the beginning.
A truly useful world model must also be temporally and physically consistent. It should not only reconstruct a scene, but also simulate it, predicting how it evolves, how objects interact, and what happens under intervention. Eventually this requires moving beyond static representations toward models that capture dynamics and causality.
I believe that generative approaches are highly compelling in this context, as they can be trained on large-scale data in a self-supervised fashion. In particular, comprehensive 3D world modeling is a highly promising path forward, since richer environmental representations directly enable deeper and more effective learning of physical reality. Crucially, such generation enforces consistency: for instance, to generate a scene across viewpoints, a model must implicitly recover its underlying 3D structure. To generate it over time, it must capture its dynamics. This forces the model to internalize the latent state of the world, including geometry, scale, materials, motion, and physical behavior.
This also highlights a limitation of purely abstract representations. High-level embeddings or action-centric models can be effective for specific tasks, but without the ability to model and simulate the world, they will ultimately remain incomplete. They compress observations, but do not fully model the underlying process that generates them.
The next generation of AI systems should therefore move beyond text and pixels, and toward physically-grounded world models: models that represent space, maintain consistency over time, and enable simulation and interaction.
This is the missing layer between the physical and digital world, which will ultimately enable AI systems not just to observe the world, but to understand and operate within it.
📢Lookalike3D: Seeing Double in 3D
@chandan__yes enables holistic, instance-consistent 3D object reconstruction & part segmentation by detecting identical and near-identical objects from multiview images.
Built on a dataset of 76k curated object pairs
https://t.co/0wqvUG2cld