🚀 CVPR, here we come!
The MHUG team is bringing a massive lineup of 17 papers this year, and we couldn't be more excited to share what we've been building! 🧠💻
🧵👇
Excited to present our #NeurIPS2025 paper on "λ-Orthogonality Regularization for Compatible Representation Learning" today! 🎉
Come chat with me at Poster #2503 in Exhibit Halls C, D, E
📅 Fri, Dec 5 | 🕚 11am – 2pm PST
#MachineLearning #AI #Compatibility #RepresentationLearning
λ-Orthogonality: a tiny near-orthogonal adapter with a controllable λ "knob" to balance geometry preservation and adaptation. Enables backward-compatible embeddings (no full backfill) with near new-model performance and <50% gallery updates. @NeurIPSConf #NeurIPS2025
Encouraging to see compatibility discussed in this NeurIPS workshop. The theoretical understanding of the problem’s structure is still limited, with room for useful progress. @NeurIPSConf #NeurIPS2025
https://t.co/rvR943BM5o
LeJEPA: a novel pretraining paradigm free of the (many) heuristics we relied on (stop-grad, teacher, ...)
- 60+ arch., up to 2B params
- 10+ datasets
- in-domain training (>DINOv3)
- corr(train loss, test perf)=95%
Paper: https://t.co/NpfB9G1pOP
Code: https://t.co/BsK5wmNEHc
New in-depth blog post - "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in-depth explanation of how LLM inference engines, and vLLM in particular, work!
Took me a while to get this level of understanding of the codebase and then to write this one up - I quickly realized I underestimated the effort. 😅 It could have easily been a book/booklet (lol).
I covered:
* Basics of inference engine flow (input/output request processing, scheduling, paged attention, continuous batching)
* "Advanced" stuff: chunked prefill, prefix caching, guided decoding (grammar-constrained FSM), speculative decoding, disaggregated P/D
* Scaling up: going from smaller LMs that can be hosted on a single GPU all the way to trillion+ params (via TP/PP/SP) -> multi-GPU, multi-node setup
* Serving the model on the web: going from offline deployment to multiple API servers, load balancing, DP coordinator, multiple engines setup :)
* Measuring perf of inference systems (latency (ttft, itl, e2e, tpot), throughput) and GPU perf roofline model
Lots of examples, lots of visuals!
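A minimal sketch of the latency metrics listed above, assuming we log the time a request was sent and the arrival time of each output token. This is not taken from the blog post or from vLLM; the names `LatencyMetrics`, `latency_metrics`, and `throughput` are hypothetical and only illustrate the usual definitions of TTFT, ITL, TPOT, and end-to-end latency.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LatencyMetrics:
    ttft: float        # time to first token, in seconds
    itl: List[float]   # inter-token latencies (gaps between consecutive tokens)
    tpot: float        # mean time per output token after the first one
    e2e: float         # end-to-end latency: request sent -> last token received


def latency_metrics(request_time: float, token_times: List[float]) -> LatencyMetrics:
    """Compute per-request latency metrics from raw timestamps (hypothetical helper)."""
    if not token_times:
        raise ValueError("need at least one output token timestamp")
    ttft = token_times[0] - request_time
    e2e = token_times[-1] - request_time
    # One gap per pair of consecutive output tokens.
    itl = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    # TPOT excludes the first token, whose cost is dominated by prefill.
    tpot = (e2e - ttft) / len(itl) if itl else 0.0
    return LatencyMetrics(ttft=ttft, itl=itl, tpot=tpot, e2e=e2e)


def throughput(total_output_tokens: int, wall_time: float) -> float:
    """Output tokens per second over a measurement window (hypothetical helper)."""
    return total_output_tokens / wall_time


m = latency_metrics(request_time=0.0, token_times=[0.5, 0.75, 1.0, 1.25])
print(m.ttft, m.tpot, m.e2e)
```

TTFT mostly reflects queueing plus prefill, while ITL and TPOT reflect the decode loop, which is why the two are usually reported separately.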
---
I realize i've been silent on social - many of you noticed and thanks for reaching out! :) --> I'm so back! lots of things happened.
Also, in general, I'm a bit sick of superficial content, it really is an equivalent of junk food (h/t @karpathy).
I want to do the best/deepest technical work of my life over the next years and write much more in depth (high quality organic food ;)) so I might not be as frequent around here as i used to be (? we'll see). I'll make it a goal to share a few paper summaries a week or stuff that's relevant / in the zeitgeist.
If there are any topics from the past few weeks/months you'd like covered, drop them in the comments - I might focus on some of those in my next posts.
---
Huge thank you to @Hyperstackcloud for giving me an H100 node to run some of the experiments and analysis that i needed to write this up. The team there led by Christopher Starkey is amazing!
Also a big thank you to Nick Hill (who did a very thorough review of the post - basically a code review lol; Nick's a core vLLM contributor and principal SWE at Red Hat) and to my friends Kyle Krannen (NVIDIA Dynamo), @marksaroufim (PyTorch), and @ashVaswani (goat) for taking the time during the weekend when they didn't have to!
Calling all AI researchers & comic lovers! 🚀
Can your model pick the right comic panel? 🤖
Join the #ICDAR2025 Comics Understanding Competition & submit your results by 15/04/2025! 📆
Challenge: https://t.co/G5kHwUtkVi
Dataset: https://t.co/7HGYY0MNiU
Submissions are open!
Are you a professional working on video in Milan? Come to the Milan Video Tech Meetup this Thursday (Nov. 7, 6:00 PM CET) at the NTTData headquarters, Via Ernesto Calindri, 4.
I'll talk about how to use GenAI to improve the performance of video codecs.
https://t.co/SZT3hQ4Ors
🚀 Heading to Melbourne for #ACMMM2024!
Join our Learning Backward-Compatible tutorial 🧩 on Oct 28th, where we survey compatibility methods with a fresh interpretation in the era of LLMs & VLMs.
💻 Packed with code examples and practical insights!
See you there @ACMMM24 🇦🇺🎉!
🔔 MICC News from #ECCV2024!
The 18th European Conference on Computer Vision is live in Milan, Italy, from Sept 28 to Oct 4.
👏 Huge congrats to our MICC researchers showcasing their work in workshops, posters, and tutorials! #computervision #research #miccunifi #unifi #milan
💥 But that’s not all! We'll be in Milan for #ECCV2024 with our work, "ComiCap: A VLMs Pipeline for Dense Captioning of Comic Panels" 🚀
This paper presents a VLM-based approach to generate dense captions for comic panels (no 🚂), and a new metric!
https://t.co/XmYYyRFxBt
🚀 Exciting news! Our paper "CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding" has been accepted at #NeurIPS2024 (B&D)!
It sets a new standard for multi-task comic analysis. Dive into the details here: https://t.co/EI3DXqQcpu and see you in Vancouver!
Apple🍏shows interest in strategies for evolving🔄compatible🤝language models that minimize inconsistencies with previous versions:
https://t.co/506wcW7pUD
📢 Interested in low-level vision? Join our UHD-IQA challenge, held in conjunction with the AIM 2024 workshop at #ECCV2024 @eccvconf 🇮🇹
🚀 The top two contenders will win a PlayStation 5! 🎮
Challenge: https://t.co/kdxGQMowsK
Workshop: https://t.co/5hm99Wdl78
Stop by board #447 in Arch 4E today, Friday, June 21st, to see our 🌟 highlight 🌟 poster 🖼️:
"Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements".
📌Join us from 5:15-6:45 PM PDT! #CVPR2024