What if every image in your training set is corrupted (masked, blurred, or compressed) and you don’t have a single clean data point? This is common in areas like MRI scans, satellite imagery, and many other real-world datasets.
Can you still train a diffusion model and recover the clean distribution?
Yes, as long as the corruption channel is known and invertible at the distribution level. We introduce DiffEM, a framework to do this with diffusion models. 🧵 (1/n)
FLUX.2's @bfl_ml text tokens aren't just holding your prompt.
During image editing, they absorb reference image content, and some of that absorbed content, like color and style, causally drives the output appearance.
New paper 🧵👇
Impressive work by @danialgorithm building upon DiEM. In DiEM, approximate posterior sampling is elegant but a bottleneck both in efficiency and accuracy. Replacing posterior sampling with conditional diffusion models is a clever and effective solution!
Our work is heavily influenced and inspired by the prior EM-based diffusion methods that you can read about below:
📄 Paper by Rozet et al.: https://t.co/WGGnj38lGd
📄 Paper by Bai et al.: https://t.co/K1C3Oj9TSB
There is also an exciting and closely related paper that came out recently from the amazing NYU folks [Chirag Modi, @JiequnH, Eric Vanden-Eijnden, Joan Bruna]. They use flow matching, instead of conditional diffusion, to perform the reconstruction step, and they establish similar theoretical results.
📄 Paper: https://t.co/sx48VnBFHj
🧵(n/n)