As part of @PrimeIntellect's RL residency program, I've been exploring how to do multi-agent RL using their current stack (from verifiers + prime-rl to lab experiments with hosted training/evals) and thinking about how it could be extended to support these abstractions natively. I've summarized my findings in the blog post below, and I'll leave a few comments here, too...
The next wave of AI will not be won by better prompts. It will be won by systems that learn from experience.
Today, Prime Intellect Lab is out of beta, open for you to start training your own models.
The era of self-improving agents is here.
@blaiseaguera Hey, this is really cool! I read your book recently too, and enjoyed it a lot.
I just shared this work yesterday.
I think it’s very related, and I do mention your book early on in the post.
Would love to talk if there’s an opportunity :)
I discuss some more details in the blog post (https://t.co/HR5vMgUQFD). I'm very excited to see what comes out of this and out of related work in the residency, like @BillyHoy1_'s stuff. Hopefully it will spark more work on open multi-agent RL!