Grok Imagine is not just a family of models; it is an ecosystem for entertainment and intelligence.
👇👇👇Generated by the latest Grok Imagine image and video model
Grok-Imagine-Video-1.5-Preview (720p) has landed #1 in the Image-to-Video Arena!
This is a massive +52 pt improvement over Grok-Imagine-Video (720p), surpassing the best video models Seedance-2.0 and HappyHorse.
Congrats to @xAI and @elonmusk on this big achievement!
🚀 Does high-quality audio generation really require latent compression?
Excited to share our new paper🚀🚀🚀
WavFlow: Audio Generation in Waveform Space.
WavFlow removes the audio tokenizer / VAE bottleneck and performs flow matching directly in raw waveform space.
Instead of generating compressed latent tokens, WavFlow directly models waveform patches end-to-end, achieving competitive or better performance than latent-based methods on both Video-to-Audio and Text-to-Audio benchmarks.
Key ideas:
• waveform patchify
• x-prediction flow matching
• amplitude lifting for stable raw-space training
• large-scale 5M video-text-audio training pipeline
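To make the first two ideas concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function names, the patch length, and the linear noise-to-data path (t=0 is noise, t=1 is data) are all assumptions chosen for illustration, and amplitude lifting is left out because the post does not describe it.

```python
import numpy as np

def patchify(wave, patch_len):
    """Split a 1-D waveform into non-overlapping patches, zero-padding the tail."""
    wave = np.asarray(wave, dtype=np.float64)
    pad = (-len(wave)) % patch_len          # samples needed to fill the last patch
    return np.pad(wave, (0, pad)).reshape(-1, patch_len)

def unpatchify(patches, length):
    """Flatten patches back to a waveform and drop the padding."""
    return patches.reshape(-1)[:length]

def interpolate(x, noise, t):
    """Assumed linear path: pure noise at t=0, clean data at t=1."""
    return (1.0 - t) * noise + t * x

def velocity_from_x_prediction(x_pred, x_t, t, eps=1e-5):
    """Turn a predicted clean sample into a velocity along the linear path.

    With x_t = (1 - t) * noise + t * x, the true velocity is x - noise,
    which equals (x - x_t) / (1 - t).
    """
    return (x_pred - x_t) / max(1.0 - t, eps)

# Example: one second of a 440 Hz tone at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
wave = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
patches = patchify(wave, 256)               # shape (63, 256)
noise = rng.standard_normal(patches.shape)
x_t = interpolate(patches, noise, t=0.3)
# A real model would predict `patches` from (x_t, t); here we use the ground truth.
v = velocity_from_x_prediction(patches, x_t, t=0.3)
```

In this setup the network's target is the clean waveform patch itself rather than the velocity, and the velocity needed by the sampler is recovered afterwards from that prediction.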
Paper: https://t.co/bvWuPxdH9c
Demo: https://t.co/i5uDLisXpx
Code: https://t.co/xsGxtx8xR3
#AI #AudioGeneration #DiffusionModels #FlowMatching #MetaAI
Pop your headphones on 🎧, turn on the sound 📣, and listen to what our model can do!
Grok Imagine Agent Mode (Beta) was just released on Grok web
This is not just basic image generation... it’s a full creative agent on an infinite open canvas
Tell it:
- “Generate a 1-minute cinematic film”
- “Create a complete manga set”
- “Build UGC product stories”
…and it plans, generates, edits, and iterates everything in one workspace
No tab-switching... no starting over
It's just pure agentic flow
This is the biggest leap yet for Grok Imagine
Introducing Quality mode on Grok Imagine – powered by our most advanced image generation model.
Quality mode gives you enhanced details, stronger text rendering, and higher levels of creative control.
Now available on web and mobile.
Try it at https://t.co/zGhs9czkC5
Today we’re launching the Video Edit Arena to evaluate the frontier capability of video models!
- #1 Grok-Imagine-Video, @xAI
- #2 Kling-o3-pro, @Kling_ai
- #3 Kling-o1-pro, @Kling_ai
- #4 Gen4-aleph, @Runwayml
The leaderboard is powered by thousands of real-world community votes. Click the Edit button in Video Arena to edit any video and compare top model outputs.
More models coming soon!
At @xai we are making media creation more accessible, enjoyable and useful. Lots more creative features coming soon!
Thanks to the amazing team that worked around the clock to ship this, and our captains @Guodzh @imhaotian ❤️
When it debuted, Nano Banana truly redefined what's possible with image generation models, pushing the boundaries of people's imagination.
Today, we're excited to introduce Grok-Imagine-Image: a new model that's both faster and better than Nano Banana.
Through this journey, we've built many of the essential building blocks needed to unlock the next generation of models and to keep fueling the growth and prosperity of the visual AI community.
Stay tuned... something incredible is coming very soon! But today, hello world, grok-imagine-image!
Understanding requires imagining. Grok Imagine lets you bring what’s in your brain to life, and now it’s available via the world’s fastest and most powerful video API: https://t.co/tqQwQVgCEI
Try it out and let your imagination run wild.