📢 We introduce SurgicaL-CD, a diffusion model to generate realistic, labeled surgical images in one step
I'm excited to share that our work has been accepted at the ECCV 2024 workshop on Synthetic Data for Computer Vision
Paper: https://t.co/vErSfHDxPi
1/5 🧵
@arnie_hacker Reading mp4 files directly during training worked better for me. You can try decord or torchvideo to load the video, or to decode it directly on the GPU. I'm not sure how well that works for hour-long videos, though. For those, create smaller snippets with ffmpeg.
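For the ffmpeg part, here is a rough sketch of one way to do the splitting, using ffmpeg's segment muxer. The helper name `build_split_command`, the file names, and the 120-second snippet length are all made up for illustration. The script only builds and prints the command, so you can check it before running it.

```python
import shlex


def build_split_command(src, out_pattern="snippet_%03d.mp4", seconds=60):
    """Build an ffmpeg argument list that cuts `src` into snippets.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    return [
        "ffmpeg",
        "-i", src,                      # input video
        "-c", "copy",                   # copy streams, no re-encoding
        "-map", "0",                    # keep all streams from the input
        "-f", "segment",                # use the segment muxer
        "-segment_time", str(seconds),  # target snippet length in seconds
        "-reset_timestamps", "1",       # each snippet starts near t=0
        out_pattern,                    # numbered output files
    ]


# "surgery.mp4" is a placeholder file name.
cmd = build_split_command("surgery.mp4", seconds=120)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Because the streams are copied rather than re-encoded, cuts land on keyframes, so snippet lengths are only approximately the requested duration.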
How can we make sure this phenomenon is not happening? Maybe it does not happen at a larger scale? Happy to hear thoughts from diffusion folks.
#diffusionmodels #NeurIPS2025
I'm Shawn, founder of https://t.co/DLaNyXWuUn, former researcher at Meta and CS PhD at University of Cambridge.
Today we're launching https://t.co/DLaNyXWuUn: we built the world's first Large Visual Memory Model, to give AI human-like visual memories.
Why visual memory?
AI to date is chat-based, which has great applications, but humans are not only chat-based. Humans have visual memories and are visually driven. https://t.co/DLaNyXWuUn enables AI agents, software, and even robots to see and remember the world the way humans do.
We just raised an $8m seed round led by @susaventures to build the visual-memory layer for AI.
Why is https://t.co/DLaNyXWuUn so groundbreaking?