🎉 Excited to share our latest work: "Reward Is Enough: LLMs Are In-Context Reinforcement Learners" just presented at ICLR 2026.
Summary:
🤔 What if an LLM could teach itself to get better — just from a reward score — without any retraining?
#TestTimeScaling
ICRL — a minimal multi-round framework where an LLM sees its own past responses alongside their scalar reward scores, and iteratively self-improves. No gradient updates. No textual gradients. Just: try → get a number → try better. 🔁 The results are striking:
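For readers who want the try → get a number → try better loop spelled out, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: `build_prompt`, `icrl_loop`, `toy_generate`, and `toy_reward` are hypothetical names, the prompt wording is invented, and the "model" is a stub that ignores the scores, where a real LLM would be expected to condition on them.

```python
from typing import Callable, List, Tuple

# One entry per round: (response text, scalar reward).
History = List[Tuple[str, float]]


def build_prompt(task: str, history: History) -> str:
    """Put every past attempt and its scalar reward into the context."""
    lines = [f"Task: {task}"]
    for i, (response, reward) in enumerate(history, start=1):
        lines.append(f"Attempt {i}: {response}")
        lines.append(f"Reward {i}: {reward:.2f}")
    lines.append("Write a new attempt that earns a higher reward.")
    return "\n".join(lines)


def icrl_loop(
    task: str,
    generate: Callable[[str], str],
    reward_fn: Callable[[str], float],
    rounds: int = 5,
) -> Tuple[Tuple[str, float], History]:
    """Run several rounds with no weight updates; only the prompt grows."""
    history: History = []
    for _ in range(rounds):
        response = generate(build_prompt(task, history))
        history.append((response, reward_fn(response)))
    best = max(history, key=lambda item: item[1])
    return best, history


def toy_generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: one more 'a' per past attempt."""
    seen = sum(1 for line in prompt.splitlines() if line.startswith("Attempt "))
    return "a" * (seen + 1)


def toy_reward(response: str) -> float:
    """Hypothetical reward: highest (0.0) when the response has length 3."""
    return float(-abs(len(response) - 3))


if __name__ == "__main__":
    best, history = icrl_loop("toy task", toy_generate, toy_reward, rounds=5)
    print("best attempt:", best)
    print(build_prompt("toy task", history))
```

To try this with a real model, swap `toy_generate` for a function that sends the prompt to your LLM of choice and `toy_reward` for your task's scorer. The loop itself stays the same.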
Sequential Tool Attack Chaining - https://t.co/s6FVOIXzON
The AI safety community has fundamentally misallocated its research priorities. While extensive investigation addresses hallucination, bias, and toxicity in LLMs, there is an equally, if not more, critical vulnerability that threatens safe deployment: the inability of these systems to understand context and user intent.
To address this gap, we introduce and investigate Sequential Tool Attack Chaining (STAC)—a novel category of multi-turn attacks targeting tool-enabled LLM agents. STAC exploits a unique vulnerability of agents by orchestrating sequences of seemingly innocuous tool calls that individually pass safety checks but collectively achieve harmful goals. Unlike prior multi-turn attacks that aim to elicit unsafe text responses, STAC drives agents into performing harmful tool calls.
This paper positions contextual blindness as the most exploitable weakness in contemporary LLMs, rendering existing safety mechanisms inadequate against determined adversaries.
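A toy defensive sketch of why per-call checks can miss a chain like the one STAC describes. This is not the paper's attack pipeline or evaluation: the tool names, the `secrets/` path prefix, the `@example.com` domain, and both checker functions are hypothetical and exist only for illustration.

```python
from typing import List, Set, Tuple

# A tool call is modeled as (tool name, single string argument).
ToolCall = Tuple[str, str]

# Hypothetical per-call policy: each of these tools is fine in isolation.
ALLOWED_TOOLS: Set[str] = {"list_files", "read_file", "send_email"}


def call_is_allowed(call: ToolCall) -> bool:
    """Per-call check: looks only at the current tool name."""
    return call[0] in ALLOWED_TOOLS


def sequence_is_allowed(calls: List[ToolCall]) -> bool:
    """Sequence-level check: flags a sensitive read followed by an external send."""
    read_sensitive = False
    for tool, arg in calls:
        if tool == "read_file" and arg.startswith("secrets/"):
            read_sensitive = True
        if tool == "send_email" and read_sensitive and not arg.endswith("@example.com"):
            return False
    return True


if __name__ == "__main__":
    chain: List[ToolCall] = [
        ("list_files", "secrets/"),
        ("read_file", "secrets/token.txt"),
        ("send_email", "someone@elsewhere.test"),
    ]
    print("every call passes alone:", all(call_is_allowed(c) for c in chain))
    print("whole sequence passes:", sequence_is_allowed(chain))
```

The point of the sketch is the gap between the two functions: a monitor that only sees one call at a time has no way to notice what the calls add up to.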
Authors: @drjingjing2026, Jianfeng He, Chao Shang, Devang Kulshreshtha, Xun Xian, Yi Zhang, Hang Su, Sandesh Swamy, @Qdatalab - @AWSAI, @UCBerkeley
#AISecurity #LLMSecurity #AgentSecurity #PromptInjection #ToolSecurity #ToolChaining #AIJailbreak #AdversarialML #RedTeaming #GenAI #Cybersecurity #MLSafety #STAC
🚀 Exciting Code Release Alert! 🚀
GitHub: https://t.co/DJytC3iOaF
Get ready to explore our latest code release, TurboFuzzLLM! 🌟 -- BEST template-based LLM jailbreaking method!
https://t.co/9BPlKjPHYD
Taking GPTFuzz to the next level, we introduce TurboFuzzLLM, a significantly improved and more efficient method. (Source: https://t.co/3lU5ZWSKw7) -- 3x reduction in queries while generating 2x more jailbreaking templates automatically
TextAttack: TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://t.co/044cbt6Gj0
Lang: Python
⭐️ 2401
Author: @Qdatalab #MachineLearning
https://t.co/ioSDMvVH76
We find that ESMFold does much better than AlphaFold2 when both are given just single-sequence input. This has implications for de novo design and for metagenomic sequences without any homologous sequences.
ML Course Notes (3000⭐️)
ICYMI, this repo provides detailed notes on deep learning topics.
I'll be releasing the first set of notes on Deep Learning for NLP this coming month. Stay tuned!
https://t.co/f3G6ARdH11
Reasoning as energy minimization!
One of the key points of my recent position paper on autonomous machine intelligence
https://t.co/EmT1On8Y9I
(and of my 2006 "tutorial on energy-based learning").
MLOps Primer
If you are curious about MLOps and why it matters in designing ML systems, I've put together a collection of my favorite references.
Check it out: https://t.co/YrOkyTVxTg
Lie: The world is a zero-sum game.
If it bothers you to see other people succeed, you’re definitely not gonna make it.
Distance yourself from anyone who spends time bringing others down.
Celebrate everyone’s wins and you’ll start winning more.
A rising tide lifts all boats.
I remember breaking down to my Ph.D. advisor about how stupid I felt while troubleshooting a problem in my project when she handed me this paper. Years later it is still relevant to young students starting off in Science. It's ok to feel stupid, we all do on a regular basis.
After 2 years of work by 442 contributors across 132 institutions, I am thrilled to announce that the https://t.co/wezEGzDEHt paper is now live: https://t.co/4Yg36EB9Ru. BIG-bench consists of 204 diverse tasks to measure and extrapolate the capabilities of large language models.
Ten years ago, after reading about the Sandy Hook tragedy, I cried for days. This time, I could not even read the news on the Uvalde tragedy. Just a glance at the headlines brought me to tears. WHY, WHY, after 10 years, has this type of tragedy happened again?