Microsoft just dropped Trellis.2 — a 4B model that converts any image into a 3D asset in 3 seconds.
Textured mesh under 100ms on CUDA, outputs a GLB file ready for Blender, Unity, and Unreal.
Open source. Any image. 3 seconds.
THE 3D ASSET PIPELINE JUST CHANGED FOREVER.
This #CVPR2026 paper from our research team is trending #1 on @HuggingFace 🤗
Meet LocateAnything: a vision-language detection model that rethinks bounding box prediction. For AI agents and robots, “seeing” is only useful if a model can pinpoint where something is fast enough to act.
Trained on 138M high-quality samples, LocateAnything decodes bounding boxes in parallel instead of one coordinate at a time, improving localization accuracy while dramatically increasing throughput for visual grounding and detection.
Project page: https://t.co/O7JMe8tzFM
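A toy sketch of why decoding a box in one step, rather than one coordinate at a time, raises throughput. This is not LocateAnything's architecture: `toy_head`, `decode_sequential`, and `decode_parallel` are hypothetical names, and the step counter simply assumes one forward pass per decoding step.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def toy_head(box_idx: int, coord_idx: int) -> float:
    """Hypothetical stand-in for a model head; returns a fixed coordinate value."""
    return 10.0 * box_idx + (0.0, 1.0, 5.0, 6.0)[coord_idx]


def decode_sequential(num_boxes: int) -> Tuple[List[Box], int]:
    """One decoding step per coordinate, as in token-by-token generation."""
    boxes: List[Box] = []
    steps = 0
    for b in range(num_boxes):
        coords = []
        for c in range(4):
            coords.append(toy_head(b, c))
            steps += 1  # assumed cost: one forward pass per coordinate
        boxes.append((coords[0], coords[1], coords[2], coords[3]))
    return boxes, steps


def decode_parallel(num_boxes: int) -> Tuple[List[Box], int]:
    """One decoding step per box: all four coordinates are emitted together."""
    boxes: List[Box] = []
    steps = 0
    for b in range(num_boxes):
        boxes.append((toy_head(b, 0), toy_head(b, 1), toy_head(b, 2), toy_head(b, 3)))
        steps += 1  # assumed cost: one forward pass per box
    return boxes, steps


seq_boxes, seq_steps = decode_sequential(3)
par_boxes, par_steps = decode_parallel(3)
print(seq_steps, par_steps)
```

Both functions return the same boxes; only the number of steps differs (four per box versus one per box under these assumptions).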
One of the more interesting and thought-provoking research papers I've seen in a while. A system for reading and reimplementing NeRF papers, and it seems to work very well. Pretty easy to extrapolate out from here to what CVPR 2027 papers will look like. https://t.co/gokzG27mIT
Our part-aware 3D generation work, OmniPart, has been accepted to SIGGRAPH Asia 2025. Code and model released!
Paper: https://t.co/vEAyV5kqD2
Project page: https://t.co/ovnAysSa7I
Code: https://t.co/dToyRki7R8
Demo: https://t.co/9gcBmo2NdP
In this iteration, I’ve added @AIatMeta’s SAM 2 to our workflow to segment people. Lisa and I had fun testing it in our lab—striking poses in sequence, with our bodies precisely isolated for a refined Stable Diffusion effect.
#MR #AI #Passthrough @MetaforDevs
Mixed Reality + Diffusion prototype as a tool for exploring concepts, styles, and moods by transforming real-world surroundings into alternate realities.
#MixedReality #MR #AI #StableDiffusion #Quest3
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Contributions:
• We introduce Diffuman4D, a novel diffusion model that generates spatio-temporally consistent and high-resolution (1024p) human videos from sparse-view video inputs.
• We propose a sliding iterative denoising mechanism that enhances both the spatial and temporal consistency of generated long-term videos while maintaining efficient inference.
• We design a human pose conditioning scheme to enhance the appearance quality and motion accuracy of generated human videos.
• We plan to release our processed version of the DNA-Rendering dataset, which we believe will benefit future research in this area.
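A minimal sketch of the general sliding-window idea behind iterative denoising over a long sequence. It is not the paper's algorithm: the circular wrap-around, the window and stride values, and the scalar "noise level" per sample are all assumptions made for illustration, and `sliding_windows` and `sliding_denoise` are hypothetical names.

```python
from typing import List


def sliding_windows(length: int, window: int, stride: int) -> List[List[int]]:
    """Return index windows over a circular sequence of `length` samples."""
    return [[(start + k) % length for k in range(window)]
            for start in range(0, length, stride)]


def sliding_denoise(noise: List[float], window: int, stride: int, step: float) -> List[float]:
    """One sweep: every window visit lowers the noise of its samples by `step`."""
    out = list(noise)
    for idxs in sliding_windows(len(out), window, stride):
        # A real model would denoise these samples jointly, each conditioned
        # on the others in the window; here we only track a noise level.
        for i in idxs:
            out[i] = max(0.0, out[i] - step)
    return out


levels = sliding_denoise([1.0] * 8, window=4, stride=2, step=0.25)
print(levels)
```

Because the windows overlap, each sample is visited in more than one context per sweep, which is the intuition for how a sliding scheme can keep neighboring samples consistent without processing the whole sequence at once.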
What if you could rebuild London from scratch? @EpicGames has teamed up with @ZHA_News to create a photorealistic version of the city that anyone can explore and edit 👉 https://t.co/02XbpW8iUi
Built as an island in Fortnite, Re:Imagine London lets players adapt the urban fabric of the city and even build new developments, rewarding them for good design and use of space.
#AD #architecture #construction
Gordon Ramsay in the kitchen making a mess... AI-generated video 😂 For the first few seconds, I thought it was really him, and then the AI hallucination kicked in.
This is from the new Minimax model.
Credit: r/u/love1008
Sora is our first video generation model. It can create HD videos up to 1 min long. AGI will be able to simulate the physical world, and Sora is a key step in that direction. Thrilled to have worked on this with @billpeeb at @openai for the past year https://t.co/p4kAkRR0i0
Put powerful CFD analysis into the hands of the whole team. With Automate, empower everyone to contribute at an expert level. Join our closed beta programme today: https://t.co/j8VHdSe3cF