How are AI tools changing our writing process? In our #chi2026 paper, we find a shift in the way writers write and engage with ideas: from generating ideas to reacting to the ones AI surfaces. We call it reactive writing.
This paper would not have been possible without the collaboration and guidance of @m_jakesch, @marianneaubin, and @informor.
The paper is now available in open access! #chi2026
https://t.co/vOxDbl039P
This is algorithmic agenda-setting at the moment of ideation. As Bernard Cohen wrote of the press in 1963: it may not tell people what to think, but it is stunningly successful in telling them what to think about. AI writing assistants may now do the same for writers.
Today, we’re announcing the next chapter of Terminal-Bench with two releases:
1. Harbor, a new package for running sandboxed agent rollouts at scale
2. Terminal-Bench 2.0, a harder version of Terminal-Bench with increased verification
True intelligence = reasoning about new information, not memorized facts.
How can we scalably create benchmarks that are completely novel yet have known answers?
Meet SynthWorlds, an eval & data-gen framework to disentangle reasoning and knowledge⬇️🧵
📄 https://t.co/ITwP4YdtDG
I’m excited to share our new @Nature paper 📝, which provides strong evidence that the walkability of our built environment matters a great deal to our physical activity and health.
Details in thread.🧵
https://t.co/omO3YcHrvG
Terminal-Bench and @warpdotdev @zachlloydtweets in TechCrunch today :) (link in replies)
I firmly believe that the future of LLM-Computer interaction is through something that looks like a terminal interface. Great to see this picking up steam.
Congratulations to @UW #UWAllen Ph.D. grads @sharma_ashish_2 & @sewon__min, @TheOfficialACM Doctoral Dissertation Award honorees! Sharma won for #AI tools for mental health; Min received honorable mention for efficient, flexible language models. #ThisIsUW https://t.co/R2b1r3wxUP
Many agents (Claude Code, Codex CLI) interact with the terminal to do valuable tasks, but do they currently work well enough to deploy en masse?
We’re excited to introduce Terminal-Bench: An evaluation environment and benchmark for AI agents on real-world terminal tasks. Tl;dr lots of room for improvement! https://t.co/qEczwCmyoQ
📢 Excited to present our poster at #ICLR2025!
Agent S: An Open Agentic Framework that Uses Computers Like a Human.
Come explore how Agent S leverages Experience Augmented Planning to interact with computers like humans do!
📍Hall 3 + Hall 2B, Poster #408
🗓️ April 26th, 10 AM