🤖 LLMs can reason… but can they explore?
Excited to share my first paper with @Kenneth_Marino and @aplycaebous at @UUtah :
DORA Explorer: a training-free, inference-time framework for better exploration in LLM agents.
🧵1/4
If you've been curious about all the AI agent hype lately, you should definitely check out this course. It's a great starting point for CV/NLP/RL people who want to get into this domain.
Hi all. I just put up my course materials for my Multimodal LLM Agents course at @UUtah online so it can be used by the broader community. I really tried to tie together LLMs with RL and CV to give students the full picture of how MLLMs can be incorporated into embodied agents.
UI robustness could be the next frontier for web agent evaluation. We built TimeWarp to start that conversation!
TimeWarp is open and integrated with BrowserGym.
📄Paper: https://t.co/wkA7UroCO1
💻Code: https://t.co/dRhLAwEQTe
🌐Project Page: https://t.co/e5BvhXHtTA
🧵(5/5)
First paper of my PhD at the University of Utah, with Prof @Kenneth_Marino.
Super excited to finally share what we've been working on at SparkLab. Meet TimeWarp ⏳, a benchmark that tests web agents by sending them back in time through 6 eras of web UI design.
Thread 🧵(1/5)
Key Findings:
• VLM agents are more vulnerable to UI changes.
• Behavior cloning on actions alone doesn’t generalize well across versions, but TimeWarp-BC does!
• Continual training on new versions causes catastrophic forgetting.
• Training on non-action tokens helps.
🧵(4/5)
It’s been less than a year since I started my lab (SPARK Lab) at @UUtah, and we already have a ton of new stuff that I can’t wait to talk about. Stay tuned for more.
I’ll start today by sharing that our updated Computer Use Survey blog has been accepted to ICLR Blogposts 2026.
Collaboration with my student @aplycaebous and Utah colleague @anmarasovic.