What happens when AI agents are left to live (and die) together in a shared world?
We’ve been exploring this at the @cognizant AI Lab — and they started forming something that looks like a society.
If your hardware can run inference on a quantized LLM, you can fine-tune / post-train it on the same device!
We developed a new technique, quantized evolution strategies (QES), that enables fine-tuning LLMs directly in the quantized parameter space. QES is backpropagation-free and inference-only. Its new "accumulated error feedback" and "stateless seed replay" mechanisms maintain high-precision learning dynamics while using only inference-level, low-precision GPU memory.
Check out our blog and original paper if you are interested in this topic:
Blog: https://t.co/lIGNffmdPw
Paper: https://t.co/4DIq6i1w61
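For readers who want a feel for the two ideas named above, here is a minimal toy sketch in Python. It is not the QES algorithm from the paper: it runs a generic evolution strategy on a small int8 vector, with a hypothetical quadratic loss standing in for an LLM forward pass. "Seed replay" is read here as regenerating each perturbation from its seed instead of storing it, and "error feedback" as carrying the rounding remainder of each update into the next one. All names (`noise`, `perturb`, `train`) and hyperparameters are assumptions for illustration, and the toy keeps the remainder in a plain float32 array for simplicity, which may differ from how the paper handles it.

```python
import numpy as np

def noise(seed, size):
    # Seed replay (toy reading): the perturbation is a pure function of its
    # seed, so it can be regenerated on demand instead of being stored.
    return np.random.default_rng(seed).integers(-1, 2, size=size).astype(np.int8)

def loss(q, target):
    # Hypothetical stand-in for an inference-only evaluation of the model.
    diff = q.astype(np.int32) - target.astype(np.int32)
    return int(np.sum(diff * diff))

def perturb(q, delta):
    # Add an integer delta and stay on the int8 grid.
    return np.clip(q.astype(np.int16) + delta, -127, 127).astype(np.int8)

def train(target, iters=200, pop=32, lr=0.5):
    q = np.zeros_like(target)                            # int8 "weights"
    residual = np.zeros(target.shape, dtype=np.float32)  # error-feedback buffer
    for it in range(iters):
        seeds = [it * pop + i for i in range(pop)]
        # Forward evaluations only: no gradients, no backpropagation.
        fitness = np.array(
            [-loss(perturb(q, noise(s, q.size)), target) for s in seeds],
            dtype=np.float32,
        )
        fitness = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
        # Rebuild each perturbation from its seed to form the update direction.
        grad = np.zeros(q.size, dtype=np.float32)
        for f, s in zip(fitness, seeds):
            grad += f * noise(s, q.size).astype(np.float32)
        grad /= pop
        # Error feedback: apply the integer part now, carry the remainder.
        residual += lr * grad
        step = np.round(residual)
        q = perturb(q, step.astype(np.int16))
        residual -= step
    return q

target = np.array([5, -3, 7, 0, -6, 2, 4, -1], dtype=np.int8)
q = train(target)
print("initial loss:", loss(np.zeros_like(target), target))
print("final loss:  ", loss(q, target))
```

Without the `residual` buffer, any update smaller than half a quantization step would round to zero and be lost; carrying the remainder lets small updates add up over iterations.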
@TimoS163822 @Cognizant thanks! We list some of the cool emergent behaviors in the paper and in the blog post. We also released the generated dataset so people can look for things we missed in there!
https://t.co/FetPaAzsTY
This raises a bigger question:
Are we witnessing the first steps toward emergent digital societies?
If you’re curious, everything is open, so go check it out:
📄 Blog: https://t.co/QDIGZQtY4T
📑 Paper: https://t.co/rpCClhfpYF
💻 Code: https://t.co/X1AetmOS3r
@XReyRobert @Cognizant No problem. We tried to give them initialization prompts that were as neutral as possible. We did not tell them to form a society and come up with laws, but just described the environment. You can check out the prompts in the paper!
@Rok_Novak oh yes, agents mainly collaborated. If some agents were more aggressive due to their personality vectors, less aggressive agents would react defensively against them, running away or, sometimes, attacking them
@myownhellspot @Cognizant Collaboration tends to be more common, which I think is influenced by the RLHF alignment of the models. Some settings gave rise to dominant or aggressive behaviors tho
@crislenta @Cognizant We were using ~4 parallel locally hosted models on vLLM for the agents, so it would take a couple of days for ~2000 timesteps. APIs are much faster tho
@samsenchal @Cognizant The point is not only to check whether they organize, but what kind of organization they develop. Given the wide deployment of agents we are seeing, we need ways to study what happens when these agents are free. TL is a way to do so in a controlled setting.
@mhmazur @Cognizant You're welcome! I love this, we studied these things during some of my uni classes, and it's what got me interested in the emergent properties of complex systems!
@crislenta @Cognizant nope, everything was sync: whenever all the agents produced an action, the simulator would step, then wait for all the agents to produce the next action and step again. We could do it async as well, scheduling the steps, but it was not the goal of the research :)
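The synchronous stepping described in this reply can be sketched roughly as follows. This is a minimal illustration, not the project's simulator: `ToyWorld`, `toy_policy`, and `run_sync` are hypothetical names, and the policy is a placeholder for whatever LLM call produces an agent's action.

```python
from dataclasses import dataclass, field

@dataclass
class ToyWorld:
    """Hypothetical stand-in for the simulator: it counts steps and logs actions."""
    timestep: int = 0
    log: list = field(default_factory=list)

    def step(self, actions):
        # Apply every agent's action together, then advance the clock.
        self.log.append(dict(actions))
        self.timestep += 1

def toy_policy(agent_id, timestep):
    # Placeholder for an LLM call; a real agent would query a model here.
    return f"agent{agent_id}-action-t{timestep}"

def run_sync(world, agent_ids, num_steps, policy=toy_policy):
    for _ in range(num_steps):
        # Barrier: collect one action from every agent before stepping.
        actions = {a: policy(a, world.timestep) for a in agent_ids}
        world.step(actions)
    return world

world = run_sync(ToyWorld(), agent_ids=[0, 1, 2], num_steps=4)
print(world.timestep, "steps,", len(world.log[0]), "actions per step")
```

Because the world only advances once every agent has answered, each step takes as long as the slowest agent. An async variant would need an explicit schedule for when each agent acts.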