My paper with @ReshabhSharma01 and @shraddha_96, “Willful Disobedience: Automatically Detecting Failures in Agentic Traces”, will appear this month in the First ACM Conference on AI and Agentic Systems (CAIS 2026). https://t.co/xQMO55b7q6
It explores verifying the behavior of agents and proposes an AI-based approach to extracting specifications from agent system prompts and using them to evaluate agent trajectories. We have also released our project as open source here: https://t.co/3B4wUwrU6M
With Reshabh K Sharma, Peli de Halleux, and Shraddha Barke, we just released "PromptPex: Automatic Test Generation for Language Model Prompts" (https://t.co/h3RJ9YEGiT).
Repo: https://t.co/tiCLgQBEAG. PromptPex is a tool to generate and evaluate unit tests for an AI model prompt.
@senderPath @UCBerkeley I'd love to learn more about Vision Pro. If you are interested in scripting AI, I suggest you check out my open-source project GenAIScript, a JavaScript-based scripting language to easily leverage AI models in small programs. https://t.co/VCYtGyTj4N
I'm working with the MS AI Red Team and looking for summer intern candidates interested in AI Security. PhD candidates with experience and skills in #ML, #security, #LLMs apply:
https://t.co/cnxcFZUdj9
Looking for summer intern candidates interested in building tools to help author, test, debug, and deploy AI model prompts. PhD candidates with experience and skills in #ML, #RL, #NLP, #LLMs, & SE apply: https://t.co/HTAjPbWVFe #AI #SoftwareEngineering #LLM #PromptEngineering
Thrilled to see my collaborator, Peli de Halleux, describe our project GenAIScript, a programming language for easily leveraging #GenAI in your scripts. #generativeAI #AI #scripting https://t.co/X4VS4KK0qH
Very happy to see my CRA colleagues Michaela Taufer and Holly Yanco also named as @AAAS Fellows!
CCC Council Members named 2023 AAAS Fellows - https://t.co/tdY4nqDA54 via CCC Blog
I'm coediting the SIGPLAN blog with Adrian Sampson.
New post by @benzorn and @emeryberger makes the argument that "AI Software Should be More Like Plain Old Software"
https://t.co/u6KGlUMTSd
I am extremely honored to be named a Fellow of the American Association for the Advancement of Science (AAAS). It is a lifetime dream to be recognized for my contributions by this esteemed organization!
Ben Zorn, Partner Researcher, has been elected as an @aaas Fellow for his distinguished contributions to programming language design and implementation, and volunteer leadership in the profession.
Prompts are the new programming language, #LLMs are the new hardware architecture, and AICI is the new LLM ISA that guides the LLM to generate correct content. #ArtificialIntelligence #LargeLanguageModels
The AI Controller Interface helps researchers and developers to efficiently implement existing strategies for controlling LLMs and invent new ones, enhancing LLM generation through improved accuracy, privacy, and compliance with formatting standards. https://t.co/hbY3arAgw7