🎉 Our paper on interpretable model mergeability prediction has been accepted to ICML 2026!
Excited to share this joint work on understanding when model merging succeeds, with amazing co-authors @bozhao, @yuqirose, and @EmanueleRodola 🙏
Happy to chat in Seoul! 🇰🇷
Can a sentence carry a sound?
In Communicating Sound Through Natural Language, we introduce lexical acoustic coding (LAC): a way for LLM agents to transmit short sounds as structured English, then re-render the same audio back from that text.
(1/6)
Even SOTA vision-language models struggle with grounded reasoning on time series.
They fail at precise numeric & temporal understanding and often don’t properly use the visual signal.
“CaTS-Bench: Can Language Models Describe Time Series?”, recently accepted to Findings of ACL 2026, introduces a large-scale, real-world multimodal benchmark for context-aware time series captioning and reasoning (combining numeric signals, metadata, and plots).
Find out more 👉 https://t.co/9umcAB1usr
🚀Our paper “CaTS-Bench: Can Language Models Describe Time Series?” has been accepted to ACL 2026 Findings!
It introduces a new multimodal benchmark for time series captioning and reasoning across various domains.
💡We find that current VLMs:
1) Struggle with numeric and temporal grounding
2) Largely ignore visual cues when reasoning
📷Check out more at: https://t.co/zPUJrmC1Df
My first ever conference attendance at @NeurIPSConf was incredibly fulfilling. From discussions on model merging to the latest breakthroughs in deep learning, it was the perfect environment to sharpen my thinking and explore new directions for my research. 🚀
Want to make model merging ⚡️ fast and effective?
Our new paper, accepted to the UniReps Workshop @NeurIPSConf, reveals a surprising insight: Merging models after just one epoch of fine-tuning is often as good as merging fully converged ones!
Why? We provide the first theoretical proof that a task vector is essentially a scaled gradient. This reframes task arithmetic as a form of approximate multitask learning.
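A toy sketch of that intuition, not the paper's proof or setup: assume "one epoch" is a single full-batch gradient descent step from shared pretrained weights, and use a made-up quadratic loss per task. The names `grad`, `gd_step`, `task_vector`, and all the numbers are hypothetical. Under those assumptions each task vector is exactly the negative learning rate times that task's gradient, so adding task vectors matches one gradient step on the summed multitask loss.

```python
# Toy illustration only: the losses, names, and numbers are assumptions,
# not taken from the paper.

def grad(theta, target):
    """Gradient of the toy loss L(theta) = 0.5 * ||theta - target||^2."""
    return [p - c for p, c in zip(theta, target)]

def gd_step(theta, g, lr):
    """One full-batch gradient descent step."""
    return [p - lr * gi for p, gi in zip(theta, g)]

def task_vector(theta_ft, theta_0):
    """Fine-tuned weights minus pretrained weights."""
    return [a - b for a, b in zip(theta_ft, theta_0)]

lr = 0.1
theta_0 = [1.0, -2.0, 0.5]                      # shared "pretrained" weights
targets = [[0.0, 0.0, 0.0], [2.0, 1.0, -1.0]]   # one toy task per target

# Fine-tune each task for a single step and collect its task vector.
task_vectors = []
for target in targets:
    theta_ft = gd_step(theta_0, grad(theta_0, target), lr)
    task_vectors.append(task_vector(theta_ft, theta_0))

# Task arithmetic: add every task vector to the pretrained weights.
merged = [p + sum(tv[i] for tv in task_vectors) for i, p in enumerate(theta_0)]

# Multitask baseline: one step on the sum of the task losses.
joint_grad = [sum(gs) for gs in zip(*(grad(theta_0, t) for t in targets))]
multitask = gd_step(theta_0, joint_grad, lr)

max_gap = max(abs(a - b) for a, b in zip(merged, multitask))
print("largest gap between merged and multitask weights:", max_gap)
```

With more steps or mini-batches the match is only approximate, which is where the "approximate multitask learning" reading comes in.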
I know you're probably thinking, "Yeah, these neuron-permutation-based model merging methods are cool... but are they cycle-consistent (CC)?"
Say no more!
It just so happens that our new #NeurIPS24 paper covers exactly this!
Huh? No idea what I'm talking about? Read on!
(1/6)
↗️Task vectors? More like gradients🔽!
We show that, under certain assumptions, they’re actually deeply related.🔗
ATM: A game-changing framework for multi-task model merging with no hidden fees💵!
🔍What’s the paper about?
◾Discovered fascinating relations between task vectors and multi-task gradients.
◾Proposed ATM: an efficient SOTA framework for multi-task model merging.
◾Mathematically and empirically motivated the effectiveness of ATM.