Associate Professor, CMU. Researcher, Google. Evaluation and design of information retrieval and recommendation systems, including their societal impacts.
the information retrieval community has long known that structured queries provide more precise and surgical interaction with a corpus compared to keywords. together with folks at umass, led by @SalemiAlireza7, we show the effective and efficient use of the same tools by agents.
1/n
Do search agents always need index-based retrievers to work efficiently & effectively? 🤔 Maybe not, if you TEACH them to interact with the corpus by shell! 🤯
GrepSeek, a paradigm for training fast & practical Direct Corpus Interaction Search Agents!🚀
https://t.co/EqR3M5aVtX
🎉 Excited to share a new agent evaluation perspective with our newly accepted #SIGIR2026 paper:
"Evaluation of Agents under Simulated AI Marketplace Dynamics."
https://t.co/463TyQasHa
Our paper “Diversification as Risk Minimization” received a Best Paper Award at WSDM 2026!! (1/799) 🏆
We tackle the problem of minimizing missing relevant information in a search ranking.
Huge thanks to my collaborators!
For a deeper dive, check out my original thread: 👇
Congratulations to @rikiyatakehi on this well-deserved #WSDM2026 Best Paper Award.
He provided the vision and carried the work forward with remarkable maturity.
I very much valued the collaboration with him and @tetsuyasakai.
Sincere thanks to the selection committee.
"Professors definitely deserve to have their names on the papers."
I think this take is completely wrong. Financial support does not warrant co-authorship.
Bob Gallager (a legendary information theorist who retired from MIT) did not co-author any papers with many of his students because he did not believe that he made an intellectual contribution that warranted co-authorship.
The screenshot is from Erdal Arıkan's PhD thesis work that was published in IEEE Trans. Information Theory. Both Erdal and Bob have been honored with the Shannon Award (highest honor in information theory) and they have not co-authored any papers.
Happily surprised to see OpenAI curating cultural benchmarks, especially focused on India.
BUT, cultural knowledge != culturally aligned generations.
My work for 2+ years focuses on cultural competence in generative tasks, like creative writing.
Sharing some papers in LONG 🧵
Ever trusted a metric that works great on average, only for it to fail in your specific use case?
In our #NAACL2025 paper (w/ @841io), we show why global evaluations are not enough and why context matters more than you think.
📄 https://t.co/IuLsoqRSnV
#NLP #Evaluation
(🧵1/9)
If you're interested in OpenAI including shopping results, you might also be interested in @TEKnologyy's paper relating retrieval diversity/fairness and generation by downstream RAG models. This has implications for individuals selling products online.
https://t.co/kbDRdE3gP4
Today we'll be presenting the Tutorial on Retrieval-Enhanced Machine Learning (REML). Come by to learn about the emerging design patterns in this space and see how retrieval can be used beyond RAG.
In collaboration w/ the amazing @841io @TEKnologyy @SalemiAlireza7 @HamedZamani
The AI Interdisciplinary Institute at @UofMaryland (AIM) is hiring
40 new faculty members
in all areas of AI, particularly:
- accessibility,
- sustainability,
- social justice, and
- learning;
building on computational, humanistic, or social scientific approaches to AI.
I'm on the job market for tt faculty positions! I conduct research to resist harmful tech, data and AI and empower communities to imagine and build better data and AI futures. Please let me know if there are any positions I might be a good fit for!
What a coincidence that we released announcements about LLM bias on the same day! But our conclusions were different - OpenAI found minimal bias while we found significant bias. 👀
Why is this? 🧐
🧵
Do models personalize results when we ask them to and avoid stereotypes otherwise? No.
Well, are they at least transparent about it? Also no…
⚠️ If the model can infer your race you might get racially biased recommendations!
📄 Preprint: https://t.co/vzCadLiG1D
🧵1/8