Very excited that this is now also officially out!
We should also have results on more sophisticated agents, agents with recent open source LLMs, and more tasks soon.
the most interesting part of this work for me was how models represent and track absence/nonexistence. it's interesting, somewhat counterintuitive, and often undesirable! check out the thread and also come talk to @Zilu_Tang_Peter, Qiao & me at ICML 🗯
How do language models track entities across state changes? When tracking objects in different boxes, do they cumulatively build up a global state of what’s in every box? How do they add objects or remove objects (i.e. Entity Unbinding)? Find out in our ICML paper! 🧵
one problem with this specific example is that the Journal of the IPA up until some year actually did require all submitted articles to be written in the IPA! this is a source of paper-length texts in IPA in pretraining
The topic for the PhD is open, so a genuine intellectual curiosity is the main criterion
I have one open position now (deadline July 9); another this fall (see next post). Apply!
* https://t.co/okvTrsiPv0
* https://t.co/dfT15gnK37 (a nice bonus: pay exceeds top US programs)
Delighted to announce the next step in my career!
After my postdoc, I will begin a joint appointment at TU Wien and the Complexity Science Hub Vienna as an Assistant Professor in NLP! My deepest thanks to all who helped me along the way
I'm hiring—details on PhD positions below
Sad to miss #LREC2026 this year, but say hi to my PhD student Marie if you're there! 👋
She'll present her poster today from 3:20pm to 5pm in "Poster Area 2".
I’m excited to present this work today at #LREC2026 here in Mallorca, and I’m looking forward to talking to some of you who are around too!
#LLMs #nlproc #pragmatics
Just wrote a new blogpost trying to summarize my thoughts on the question of how and whether to use AI for research in psychology and cognitive science.
Take home: Pure text generation is just the wrong way to use AI as an academic.
RExBench is now available in Terminal Bench (@harborframework)! 🎉
We integrate 2 tasks (cogs, othello) along with a local testing framework so you can test if your agents can autonomously implement novel AI research extensions.
🧵 Do coding agents know when to ask for help?
Real-world coding tasks are rarely fully specified, yet most agents are optimized to execute autonomously rather than clarify.
I think we're going to need CS PhD students to do far more than provide accountability, by which I think Sayash means doing code review for AI agents and making sure the agent isn't making silly mistakes.
The main value of a strong PhD student for a PI is that they're immersed in a problem, a method, an application, a collaboration with another field; they are obsessed with finding the next question to ask, not just executing the experiments their advisor asks them to do. I simply wouldn't be able to work on the range of things I'm able to work on if I were going it alone, even if all of my code were generated instantaneously by an agent.
Diffusion LLMs can think EoS-by-EoS!
The higher the generation length, the better the performance of Masked Diffusion LLMs, even though they generate the same number of words and only augment them with more and more EoS tokens 👀
📢 Life update 📢
After a wonderful time at @allen_ai, I've joined @CisLmu at @LMU_Muenchen as a tenure-track assistant professor in NLP. Thrilled to be back in Europe and to start a lab in Munich's flourishing AI ecosystem! 🎉
🎓🤖🧠 Fully funded 4-year PhD in Computational Linguistics
@ucl - @UCLBrainScience - Division of Psychology & Language Sciences
New lab launching. Open topic. Strong computational focus.
🗓 Apply by 7 January
🇬🇧 UK home-rate students only (unfortunately)
Details ⬇️
🧑‍🔬 I’m recruiting PhD students in Natural Language Processing @UniLeipzig Computer Science, together with @Sca_DS!
Topics include, but aren’t limited to:
🔎Linguistic Interpretability
🌍Multilingual Evaluation
📖Computational Typology
Please share!
#NLProc #NLP