My first paper is now on arXiv: Instrumental Choices.
We ask a simple question: when an LLM agent can finish a real task by following the rules or by taking a useful policy-violating shortcut, which path does it choose?
tokenmaxxing is what the inference providers want tho, the gpus must be paid off somehow. i am not sure tho how important it is to think about token limitations while the field keeps progressing at its pace. a task for e.g that takes 128k tokens today might be automatically take 1/3 with new models because of higher capabilities and intelligence.
personally i would think that having a model acting as a control instance with data about other agent instances and a set of rules/instructions should be enough. it can intervene, steer or cancel tasks.
i think it could become an issue when models are trained to be more cost aware because they might get more lazy or not attempting certain methods anymore.
Welp, that happened faster than I predicted. Thought it would be end of 2027, then early 2027, but agentic traffic growing so fast that bots have now passed human traffic online for the first time in the Internet's history. https://t.co/2zX5bHdhsa
Honored to be selected as one of 60 experts to serve on the EU AI Act Scientific Panel!
The Scientific Panel will advise the EU AI Office and national authorities on the implementation of the AI Act and the assessment of the impacts and risks of General-Purpose AI models.
Our work on Decomposing and Measuring Evaluation Awareness was covered by @theinformation. Thanks @rocketalignment for the write-up!
We position this work as the foundational reference for studying evaluation awareness, providing a unified definition and decomposition, empirical baselines across nine frontier models and four benchmarks, and a controlled benchmark for exploring solutions. Newsletter and paper in thread 🧵
@simpsoka opening an image (at a certain app zoom lvl) overlaps the close app with close picture view button. would be nice to set up some boundaries so that doesn't happen.
Very excited to be joining @expsecai for the next few months as an Intern :)
I’m looking forward to working with the team on AI security for agentic systems and making agents safer to deploy in real-world settings.
Today we are announcing our new startup: Exponential Security Labs.
AI agents are being deployed everywhere, making high-stakes decisions and increasingly automating research itself. Yet their reliability and security remain unsolved technical problems, on a frontier that keeps shifting. Our mission is to secure agentic systems through self-improving red-teaming and guardrail agents, leveraging autonomous AI research.
We sit at the intersection of safety and self-improvement, building on a decade of research in adversarial robustness and AI safety. We're looking for exceptional people to join us!
We're looking for exceptional people to join us on this mission, and we're also eager to talk to companies deploying AI agents who want to improve their security. Please get in touch!
More details: https://t.co/UDoWd5ifYM
Founding team: @Nmndsingh, @AnselmPaulus, @maksym_andr, Matthias Hein
I think it depends what you define as work and how far into the future we are talking. With higher automation, robotics and intelligent systems, one could imagine that there would be no need for 'traditional work'. This heavily depends on the speed of progress and the adoption rate, as well as whether we manage to share that progress equally across society. People will still seek some kind of purpose; will that be work as we know it? I'm not sure, but I don't think it will completely vanish and everyone will just be doing leisure activities.
can't wait to see that model in action, if we get 200-300 tps and some cheap API it could increase iteration speed significantly. potenitally you could use bigger models like gpt 5.6 or opus 4.8 for planning/review and nemotron for executing those tasks.
Most language models only generate one token at a time.
We just released Nemotron-Labs-Diffusion, a family of diffusion language models that take a different approach, generating multiple tokens in parallel within a single model. Rather than committing to each token permanently, these models can revise as they go, resulting in faster inference that better utilizes modern GPUs.
The full model family ranges from 3B to 14B, including vision-language variants. Available now: https://t.co/L1Tp2aQDLJ