Check out our new preprint introducing STATe-of-Thoughts, a new approach supporting diversity, controllability and interpretability in LLMs. Led by @ZacharyBamberg1 and @trsaenger
Tree-of-Thoughts & Best-of-N use high-temperature sampling to elicit diverse responses, but the sampled responses are often very similar.
STATe-of-Thoughts replaces high-temperature sampling with "branching" through pre-filled textual interventions
🔼Diversity
🔼Control
🔼Interpretability
1/11🧵👇
Our paper "From Actions to Words: Towards Abstractive-Textual Policy Summarization in RL" was accepted to AAMAS 2026: https://t.co/2wTxybyJR4 (led by @saharad1)
We study how to explain reinforcement learning policies at a global level by summarizing agent behavior patterns
Across MiniGrid and Crafter, expert evaluations show close alignment with human analyses, and a user study indicates a preference for textual summaries over demonstration-based explanations.
@OdedRechavi If you haven’t seen this, maybe relevant
https://t.co/xIYrvhQNXZ Can Large Language Models Provide Useful Feedback on Research Papers? A Large-Scale Empirical Analysis | NEJM AI