Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Introducing Vibe Coding XR, a new rapid prototyping workflow that empowers Gemini Canvas w/ the XR Blocks framework to turn user prompts into interactive, physics-aware WebXR applications, allowing creators to quickly test intelligent spatial experiences → https://t.co/suwxBMoMvD
@xAI has acquired @X in an all-stock transaction. The combination values xAI at $80 billion and X at $33 billion ($45B less $12B debt).
Since its founding two years ago, xAI has rapidly become one of the leading AI labs in the world, building models and data centers at unprecedented speed and scale.
X is the digital town square where more than 600M active users go to find the real-time source of ground truth and, in the last two years, has been transformed into one of the most efficient companies in the world, positioning it to deliver scalable future growth.
xAI and X’s futures are intertwined. Today, we officially take the step to combine the data, models, compute, distribution and talent. This combination will unlock immense potential by blending xAI’s advanced AI capability and expertise with X’s massive reach. The combined company will deliver smarter, more meaningful experiences to billions of people while staying true to our core mission of seeking truth and advancing knowledge. This will allow us to build a platform that doesn’t just reflect the world but actively accelerates human progress.
I would like to recognize the hardcore dedication of everyone at xAI and X that has brought us to this point. This is just the beginning.
Thank you for your continued partnership and support.
Wow. Recreating the Shawshank Redemption prison in 3D from a single video, in real time (!)
Just read the MASt3R-SLAM paper and it's pretty neat. These folks basically built a real-time dense SLAM system on top of MASt3R, which is a transformer-based neural network that can do 3D reconstruction and localization from uncalibrated image pairs.
The cool part is they don't need a fixed camera model -- it just works with arbitrary cameras -- think different focal lengths, sensor sizes, even handling zooming in video (FMV drone video anyone?!). If you've done photogrammetry or played with NeRFs you know that is a HUGE deal.
They've solved some tricky problems like efficient point matching and tracking, plus they've figured out how to fuse point clouds and handle loop closures in real time.
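To make the "fuse point clouds" bit concrete, here is a minimal sketch of one common way to do it: a confidence-weighted running average per point. This is an illustration only and not necessarily what the paper does. It assumes the new observation has already been transformed into the keyframe's coordinate frame and lines up index-for-index with the stored points. The names `FusedPoint`, `fuse`, and `fuse_pointmaps` are hypothetical.

```python
from dataclasses import dataclass

Point = tuple[float, float, float]


@dataclass
class FusedPoint:
    """A stored 3D point plus the total confidence accumulated so far."""
    xyz: Point
    weight: float


def fuse(prev: FusedPoint, obs_xyz: Point, obs_conf: float) -> FusedPoint:
    """Blend one new observation into a stored point (weighted running mean).

    Assumption: obs_xyz is already in the same coordinate frame as prev.xyz.
    """
    total = prev.weight + obs_conf
    if total <= 0.0:
        # Nothing trustworthy to blend; keep the stored estimate unchanged.
        return prev
    xyz = tuple(
        (prev.weight * p + obs_conf * o) / total
        for p, o in zip(prev.xyz, obs_xyz)
    )
    return FusedPoint(xyz, total)


def fuse_pointmaps(
    canonical: list[FusedPoint],
    observed: list[Point],
    confs: list[float],
) -> list[FusedPoint]:
    """Fuse a whole frame of observations, point by point."""
    if not (len(canonical) == len(observed) == len(confs)):
        raise ValueError("canonical, observed and confs must have equal length")
    return [fuse(c, o, w) for c, o, w in zip(canonical, observed, confs)]
```

With this scheme a high-confidence observation pulls the stored point toward itself more strongly than a low-confidence one, and the accumulated weight means long-observed points get harder to move over time.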
Their system runs at about 15 FPS on a 4090 and produces both camera poses and dense geometry. When they know the camera calibration, they get SOTA results across several benchmarks, but even without calibration, they still perform well.
What's interesting is the approach -- most recent SLAM work has built on DROID-SLAM's architecture, but these folks went a different direction by leveraging a strong 3D reconstruction prior. Seems to give them more coherent geometry, which makes sense since that's what MASt3R was designed for.
For anyone who cares about monocular SLAM and 3D reconstruction, this feels like a significant step toward plug-and-play dense SLAM without calibration headaches -- perfect for drones, robots, AR/VR -- the works!