@kalomaze or add a level of indirection: frame it as a task where the chat history is (possibly) disconnected from the context, and the model interacts with that history the way you would with a database
in python, I push my agents to specify everything as explicit state machines with states, transitions, and effects all spelled out.
it's easier for the model to mine the logs after the fact to figure out where the bugs are and what it modeled incorrectly. it can also reason about it visually when you graph it
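A minimal sketch of what "everything spelled out" could look like, not the author's actual setup. The states, events, and effect names here (`FETCHING`, `send_request`, and so on) are hypothetical, made up for illustration. The idea: one table lists every legal transition and its effect, each attempt is appended to a log (including illegal ones), and the same table can be dumped as Graphviz DOT text for graphing.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    FETCHING = auto()
    DONE = auto()
    FAILED = auto()


class Event(Enum):
    START = auto()
    SUCCESS = auto()
    ERROR = auto()
    RETRY = auto()


# Every legal transition, with the name of its effect (hypothetical names).
# Anything not listed here is treated as a bug.
TRANSITIONS = {
    (State.IDLE, Event.START): (State.FETCHING, "send_request"),
    (State.FETCHING, Event.SUCCESS): (State.DONE, "store_result"),
    (State.FETCHING, Event.ERROR): (State.FAILED, "record_error"),
    (State.FAILED, Event.RETRY): (State.FETCHING, "send_request"),
}


@dataclass
class Machine:
    state: State = State.IDLE
    log: list = field(default_factory=list)

    def fire(self, event):
        """Apply an event, log the attempt, and return the effect name."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            # Log the illegal attempt too, so it can be found after the fact.
            self.log.append({"from": self.state.name, "event": event.name,
                             "to": None, "effect": None, "ok": False})
            raise ValueError(f"illegal transition: {self.state.name} on {event.name}")
        new_state, effect = TRANSITIONS[key]
        self.log.append({"from": self.state.name, "event": event.name,
                         "to": new_state.name, "effect": effect, "ok": True})
        self.state = new_state
        return effect


def to_dot(transitions):
    """Render the transition table as Graphviz DOT text."""
    lines = ["digraph fsm {"]
    for (src, event), (dst, effect) in transitions.items():
        lines.append(f'  {src.name} -> {dst.name} [label="{event.name} / {effect}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    m = Machine()
    m.fire(Event.START)
    m.fire(Event.ERROR)
    m.fire(Event.RETRY)
    m.fire(Event.SUCCESS)
    for entry in m.log:
        print(entry)
    print(to_dot(TRANSITIONS))
```

Because illegal transitions are logged before they raise, the log shows exactly which (state, event) pair the model failed to account for.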
Alright, it's time for a paper thread about my own first ever vision paper, which is having a bit of a moment on twitter rn thanks to @PINTO03091 and @yacineMTB.
BiternionNets: continuous head orientation from discrete labels.
Demo video from ~11y ago:
Excited to release Polar, our Agent RL rollout infra for real-world harnesses. Be it Codex, Claude Code, OpenClaw, Hermes, or your self-made ones 🔥 -- Polar takes your harnesses directly as training environments without code changes.
Find a problem, design the harness, and train your own agents! 🧵
A little over 2 years ago, I solved the SolidGoldMagikarp stability problem.
Today, I am releasing the results of that work as a new technique to regularize training.
More details below.
The questions posed by AI are bigger than the AI community. We urgently need the world (religions, civil society, academics, governments) to participate in creating a positive outcome.
I'm glad the Catholic Church is engaging, and honored to speak at the presentation.