I will attend #EMNLP2024 in Miami next week! If you are interested in LLM explainability, formal reasoning, and/or multilingual NLP, please DM me and let's connect. I'm ready for a ☕ talk every day! Also, please find me on Nov 13th, 10:30-12:00, at poster session 6!
Frontier models have become excellent at understanding videos. But what happens when we test them outside the comfort zone of Western, English-centric data? In our #CVPR2026 (Highlight) work, we pushed these models to their limits to see if they can function effectively in diverse global contexts. The results? They are struggling. Work done with @NagraniArsha @skawshik11 @Harman26Singh @dinesh_tewari1 @0xtob @CordeliaSchmid Anelia Angelova @shachi_dave (1/7)
Check out Proactive Co-Creator on @GoogleAIStudio, a human-AI belief alignment demo I vibe coded: https://t.co/BEcchP0K89
See & edit the AI's uncertainty via belief graph. It asks clarifying questions before creating!
Try Image → Story → Video. You can even remix it!
Optimize pretraining not just for loss, but for robustness to future updates.
The "best" base model does not always make the best final model.
More in the paper: scaling results, Hessian analysis, and practical recipes https://t.co/yiYMIwOblM
Huge thanks to my collaborators: @CatherineL11638 @goyalsachin007 @jacspringer @AdtRaghunathan
9/9
#EACL2026 Sneak Peek Alert
We're excited to share a paper that we are presenting at #EACL2026 in #Morocco!
Can LLMs Reason over Extended Multilingual Contexts? Towards Long-Context Evaluation Beyond Retrieval over Haystacks
@AmeyHengle @prasNLP Soham Dan @Tanmoy_Chak
Thrilled to see our paper accepted at AISTATS 2026!
Grateful to my co-authors, this was a fun deep dive into interpretability, control, and causal prompt edits.
Thrilled to share that our paper on "Interpreting and Controlling Model Behavior via Constitutions for Atomic Concept Edits" has been accepted at AISTATS 2026!
Read more about how input mutations can be mapped to interpretable behavioral insights.
https://t.co/iRPRJoyAso
🧵
Thrilled to note that we are keeping the tradition of the awesome AI residency program alive in a new avatar: the pre-doc researcher program at GDM-Blr -- with some amazing work done by our recent predocs including @gautham_ga_ @pranamyapk @puranjay1412 @sahilgo6801 @swaroopnath6
If you want to join this program, please apply here: https://t.co/xaL0cnv3ub
@SuryaDoesIt @GoogleDeepMind Hi Surya! Usually the applications roll out near year end. Following initial screenings, there is a series of interviews and evaluations.
NAACL 2025
Presenting “Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models”
Paper link: https://t.co/2OzMVvVWbT
Kicking off the year with a bang -- 4 papers accepted in prestigious venues this month!
#ICLR2025 -- LLM Compression:
We introduce PruneNet, a novel, dataset-free policy learning approach to model pruning, achieving high compression efficiency and performance retention, demonstrated by compressing LLaMA-2-7B with over 80% zero-shot accuracy retention at a 30% compression ratio. @iclr_conf
URL: https://t.co/dMlPbrT9Ju
#NAACL2025 -- Investigating multilingual long-context behavior in LLMs:
We introduce MLNeedle, the first systematic evaluation of multilingual long-context retrieval in LLMs, revealing significant performance variations across languages and context positions, with insights to guide future evaluations. @naaclmeeting
Preprint: https://t.co/7VSCn57vxn
NAACL'25 -- Counterspeech evaluation benchmark and metrics:
We introduce CSEval, a dataset for evaluating counterspeech across four dimensions and a prompt-based framework using auto-calibrated CoT, offering better alignment with human judgment than traditional metrics. @naaclmeeting
Nature Machine Intelligence:
In collaboration with AIIMS (All India Institute of Medical Sciences, New Delhi), NIMHANS, Bangalore and other NGOs, we wrote about how GenAI can potentially empower multisectoral suicide prevention efforts, particularly in resource-constrained settings like India. @NatMachIntell
A New Textbook -- Introduction to Large Language Models
I am excited to share the release of my new textbook, Introduction to Large Language Models (#LLMs) -- perhaps the first textbook on LLMs.
Target Audience:
Students/beginners looking for a structured starting point to learn LLMs
Teachers planning to offer a course on LLMs
Industry professionals seeking to deepen their understanding of LLMs
Explore the Book:
Book Website: https://t.co/tLQDfkhKOp
Table of Contents: https://t.co/RyxBENqYEw
Available on Amazon: https://t.co/PX2i3OLmiG
Enhance Your Learning Experience:
Slides & Lecture Videos: Chapter-wise resources -- https://t.co/9bd9lhhfAj
Exercises & Solutions: Practice with detailed chapter exercises (solutions available on request).
Upcoming @nptel_official Course: Starting January 2025! Preview here: https://t.co/i71bgfVne8
Book Endorsement:
Foreword by Prof. Tim Baldwin @eltimster
Endorsements from Prof. Iryna Gurevych @IGurevych and Prof. Pushpak Bhattacharyya
#LLMs #Textbook @iitdelhi @WileyIndiaPL @lcs2lab
We also find that LLMs struggle to give proper attention to the parts of queries that are grounded in highly popular entities.
Check out the full paper for more key insights, real-world implications and detailed methodology: https://t.co/FxhQmXqHeQ
We also assess this critical limitation through the lens of sensitivity to lexical variations of the queries. We unveil a key weakness in modern LLMs: they are internally sensitive to lexical perturbations when retrieving highly popular facts from their memory.