I'm happy to share that I have joined @Apple in Seattle for my summer internship, working on Image Editing.
Looking forward to a great summer of research and building exciting things!
🚀 Hello, Kimi K2 Thinking!
The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200–300 sequential tool calls without human intervention
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns.
K2 Thinking is now live on https://t.co/YutVbwktG0 in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live: https://t.co/EOZkbOwCN4
🔗 Tech blog: https://t.co/n7xxaszqzF
🔗 Weights & code: https://t.co/4ukcXB0iP6
🚨FLUX.1 Kontext [dev], the open image editing model, is now available at fal with training capabilities!
✨4x faster inference (2s vs 7s)
💰 Ultra-affordable at $0.025/megapixel
🔧 Full LoRA training support
🖌️Game-changing image editing capabilities
https://t.co/yX8wxdFlD4
Today’s @CVPR research is proving that precision isn’t just about better metrics—it’s about building AI that can truly understand complexity, context, and the unknown.
In part 3 of our #CVPR series, we’re highlighting 4 papers that push the boundaries of segmentation, detection, retrieval, and prediction in powerful new ways. 🧵
https://t.co/LzwK6FiNlE
🚨 BREAKING: GPT Image Killer is here. FLUX Kontext [pro] & [max] models are available on @fal day 0!
This isn't just your usual image editing model – it preserves identity AND maintains consistency with the rest of the image.
Try it today here -> https://t.co/00qzr3FZWR
Congrats 🎉 to @SanghaniCtrVT core faculty member @anujkarpatne, newly named Faculty Fellow @VTEngineering, an award given in recognition of "extraordinary performance in research." With him at the Dean’s Awards Celebration are @vtdeanross (L) and Christine Julien, head @VT_CS.
🌟 Thrilled to share that our paper, “A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations”, has been accepted at #ICLR2025! 🎉
Grateful to my co-authors, Naveen Gupta, @arkadaw_ , @YouzuoLin1 and @anujkarpatne, for their support.
Introducing Blendbox: a fresh way to create with AI
No more wrestling with long prompts or random results. Blendbox Alpha brings simplicity & control to AI art, so you can shape your vision directly.
Don't just generate. Create. https://t.co/CbMvhkI2HS
PhD then:
“We validate our method on a large-scale dataset of hundreds of images.”
PhD now:
“We validate our method on 10 different modalities, 15 domains, 20 scenarios, 25 tasks, and 200 languages, each with millions of testing examples.”
By 2030, I guarantee we'll all be wearing Zuckerberg's new AR glasses.
Not because they're cool.
Because they're the biggest tech revolution since the iPhone: 🧵
So, this is what we were up to for a while :)
Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio
One of the most exciting projects I got to tech lead during my time at Meta!
.@huggingface is hosting a demo site for #ECCV2024 🤗 Authors can claim their papers and discuss the papers on dedicated pages.
Demo page: https://t.co/LqpwVsGgD5