TL;DR - we improve text-to-image output quality by tuning an LLM to predict ComfyUI workflows tailored to each generation prompt
Project page: https://t.co/mo6Rdpktlf
Paper: https://t.co/p2JbQv7QKH
[1\4]
LazyDiffusion is accepted to #ECCV2024!
Traditional image editing methods regenerate unchanged pixels, wasting time and computation. LazyDiffusion generates only novel pixels while respecting the full image context, and does so up to x10 faster!
https://t.co/NkOLfLacz5
Been playing with ✨B-LoRA✨, and IMO it deserves more attention
key insights-
① 2 unet blocks are crucial for encoding content & style
② LoRA can be used for *implicit* style-content separation, by optimizing these blocks
③ ↑ can be done w/ 1 img
▶️ https://t.co/59uFufZXzH
Dynamic Typography
Bringing Words to Life
Text animation serves as an expressive medium, transforming static communication into dynamic experiences by infusing words with motion to evoke emotions, emphasize meanings, and construct compelling narratives. Crafting
Announcing our new work: https://t.co/jDqurfU4wz 🥳
We improve IP-Adapter by training with a new identity-lookahead loss and synthetic, consistent data.
The key idea is to propagate image-space losses (ID, CLIP) through LCM-LoRA instead of single-step DDPM approximations.
1/5
@YinonMagal שלום ינון. אני בחור בן 31 חילוני. אחרי פסיקת בג״ץ אני יותר שלם לשלוח את ילדי לצבא. לא הגיוני שילדי יסכנו את חייהם ויגנו עלינו ובחורי ישיבה ישבו בבני ברק על חשבוני
Be Yourself
Bounded Attention for Multi-Subject Text-to-Image Generation
Text-to-image diffusion models have an unprecedented ability to generate diverse and high-quality images. However, they often struggle to faithfully capture the intended semantics of complex input