1/
At Arc, we’ve been obsessing over a question: What does it actually mean for an AI system to get better over time?
As a result, we began to explore what it might look like for agents to adapt and learn at inference time.
Pre-Print Paper: https://t.co/h2NZ1g1LoK
We propose a shift from model-centric retraining to system-centric inference-time learning.
👉 Atlas SDK v0.1.13 is out! Our new release ships a learning evaluation harness, configuration discovery, and a starter kit for adaptive tool use.
Full release notes: https://t.co/T0HUO9YqZq
Three features that streamline continual learning workflows for agent systems 🧵
Most importantly, try out our new MCP Tool Learning starter kit. We have 25 progressive tasks demonstrating how agents using Atlas dynamically learn to use MCP tools more effectively and efficiently.
Adaptive Tool Use Starter Kit: https://t.co/d2mvkPELsY
the synthesis of ideas/approaches here is impressive. you've got...
--reflective prompt evolution
--student teacher learning
--persistent memory
all in one in-context learning system. oh, and there's an SDK that's live to build with now too??
what are all these other "continual learning" startups even doing?
these guys are shipping, crushing benchmarks, and writing about it. unreal execution out of this team
7/
Blog: https://t.co/SBGx59Fpwi
Pre-print Paper: https://t.co/jEDh5EL270
DMs open if you're working on evaluation beyond static benchmarks. We're actively looking for collaborators who want to push this frontier.
6/
Generalization: we paused memory and removed the Teacher.
Cross-incident transfer (Incident #55):
28/100 to 41/100 correct (+46% relative) with zero retraining.
Output shifts from lengthy exploration to deliberate reasoning.
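For anyone checking the math: the +46% is a relative gain over the 28/100 baseline, not percentage points. A minimal sketch of the arithmetic, using only the two scores quoted above (the function and variable names are illustrative, not part of the Atlas SDK):

```python
def relative_gain(before: int, after: int) -> float:
    """Relative improvement of `after` over `before`, as a percentage."""
    return (after - before) / before * 100


# Scores quoted in the thread: correct answers out of 100 tasks.
before_correct, after_correct, total = 28, 41, 100

# Absolute change in accuracy, in percentage points.
absolute_points = (after_correct - before_correct) / total * 100

# Relative change, measured against the baseline score.
relative_percent = relative_gain(before_correct, after_correct)

print(f"+{absolute_points:.0f} points absolute, +{relative_percent:.0f}% relative")
```

So the same result reads as +13 points of accuracy, or roughly +46% over where the agent started.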
We demoed Arc ATLAS at AI Show & Tell in NYC!
We posed a fundamental question that drives everything we do: What if we built agentic systems that learn like humans do?
What would need to be true for that to happen?
Our keynote breaks down our research journey in pursuit of this question.
Check it out below:
5/5
Stop engineering prompts and start building compounding intelligence.
Try the Atlas SDK and see the J-curve for yourself.
GitHub: https://t.co/BZW3OH0nYv
Docs: https://t.co/zVwgZ8Og9I
1/5
Want to enable real-time, continual learning for your agents?
Our Atlas SDK is a drop-in harness that lets any agent learn from experience.
See the results for yourself in this demo, and learn how to get started.
4/5
The practical outcome is a shift from verbosity to efficiency.
Before Atlas: A long, unstructured report from a frontier model.
After Atlas: A perfectly structured, efficient JSON output.
Faster responses, fewer tokens, better results.
This is continual learning at scale.