You can easily save up to 65% of compute while improving performance on reasoning tasks 🤯
Meet EAGer: We show that monitoring token-level uncertainty lets LLMs allocate compute dynamically - spending MORE on hard problems, LESS on easy ones.
🧵
personal update: I'll be starting an internship at @cohere, working on code agents, one of the most interesting things happening in AI right now. feeling really grateful for this one and genuinely excited to see where it goes. here we go!
Presenting this tomorrow at EACL 2026, poster session at 9am
If you’re around, come say hi, happy to chat about the work and ideas
More details in the thread
📢 New paper: Applied interpretability 🤝 MT personalization!
We steer LLM generations to mimic human translator styles on literary novels in 7 languages.
SAE steering can beat few-shot prompting, leading to better personalization while maintaining quality.
🧵1/
@GoodfireAI Nice work! I wonder, probe trained on answer choices needs known options. What if you probe model confidence and early exit there regardless of the answer it's thinking? I feel like after some t the model already knows and the rest is just overthinking
@paradigmainc Ok I was trying to cook something to improve model’s scientific creativity, throwing the repo into flywheel feels like the next logical step
Happy to announce I will be mentoring a SPAR project this Spring! ✨ Check out the programme and apply by Jan 14th to work with me on understanding and mitigating implicit personalization in LLMs, i.e. how models form hidden beliefs about users that shape their responses.
Want models to translate in the style you actually like?
Our paper just got accepted at EACL Main, check out our work on using interpretability for MT personalization!
And, see you in Morocco! 🇲🇦
@Turn_Trout @GladiaLab SVs are approximate directions in the latent space. They look for exact matches in the latent space. This could make things harder, but I’m still curious to know!
@andy_peng05 @Cohere_Labs @UvA_Amsterdam Hi, thank you very much! Good catch on the p@1, it was meant to be p@k (pass@k). We’ll fix it asap in the preprint!
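For anyone unsure about the p@1 vs p@k distinction in the reply above, here is a minimal sketch of the commonly used unbiased pass@k estimator from the code-generation literature. Whether the preprint computes pass@k exactly this way is an assumption; the function name `pass_at_k` and its arguments are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled generations, c of which are correct.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    subset of k samples (out of the n drawn) contains at least one
    correct generation.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k incorrect samples: every subset of size k has a hit.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 samples, 5 correct: pass@1 is the plain success rate.
    print(pass_at_k(10, 5, 1))
    # Same samples, larger k: the chance of at least one hit goes up.
    print(pass_at_k(10, 5, 3))
```

With k = 1 this reduces to c / n, which is why reporting p@1 where p@k was meant changes the reading of a result.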
Takeaway: EAGer shows we can be MORE efficient & MORE effective by letting models focus compute where it matters most.
Paper: https://t.co/JCHBWnKhUX
💻 Code: https://t.co/Cp5nr38DLk
✨ Huge thanks to my mentors and collaborators @ZotosLeo @FersiniE @MalvinaNissim @ahmetustun89
How can we make reasoning models more efficient without sacrificing performance?
Introducing EAGer, our new entropy-aware generation method, saving compute by up to 65% while lifting Pass@k by up to 37% on benchmarks like AIME.
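To make "token-level uncertainty" concrete, here is a rough sketch of the kind of signal an entropy-aware method could monitor: the Shannon entropy of the next-token distribution at a decoding step. This is not the EAGer implementation (see the linked code for that); the function names, the `threshold` value, and the "branch when entropy is high" rule are all assumptions made for illustration.

```python
import math


def token_entropy(logits: list[float]) -> float:
    """Shannon entropy (in nats) of softmax(logits) for one decoding step."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_branch(logits: list[float], threshold: float = 1.0) -> bool:
    """Hypothetical rule: spend extra compute only when the model is uncertain.

    `threshold` is an illustrative value, not one taken from the paper.
    """
    return token_entropy(logits) > threshold


if __name__ == "__main__":
    confident_step = [10.0, 0.0, 0.0, 0.0]  # one token dominates
    uncertain_step = [0.0, 0.0, 0.0, 0.0]   # uniform over four tokens
    print(should_branch(confident_step))
    print(should_branch(uncertain_step))
```

The idea is that easy problems mostly produce low-entropy steps and so trigger little extra sampling, while hard problems hit high-entropy steps more often and receive more of the budget.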