I am often asked what makes Chai special. They want to hear about our research secrets. About our data strategy. About our funding.
It starts and ends with the people. Always does, always has.
@mihirbafna14 and I are excited to introduce Promera, a co-folding and design model with
• best-in-class binder filtering
• nanobody design with in-silico success rates matching hallucination
• case studies on hantavirus epitope targeting and GPCR agonism (1/8)
Incredibly proud of the Chai team. It's a privilege to wake up every day and go to work on models that are actively used by some of the world’s leading pharma companies, as they go about designing the medicines of the future. There is no higher calling.
Today we are announcing our collaboration with Pfizer to put Chai's frontier AI—including our latest model, Chai-3—directly into the hands of one of the world's leading pharmaceutical teams.
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement). This will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of a project I thought was already fairly well-tuned by hand.
This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I have done daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.
This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism.
https://t.co/WAz8aIztKT
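To make the QKnorm point in the list above concrete, here is a minimal sketch of why a parameterless norm on queries and keys can leave attention diffuse, and how a scalar multiplier sharpens it. This is not nanochat's code: the use of RMS normalization, the `1/sqrt(d)` factor, the `scale` values, and every function name here are assumptions chosen for illustration.

```python
import math
import random

def rms_norm(v, eps=1e-6):
    # Parameterless RMS normalization: no learned gain, so every
    # normalized vector ends up with (almost exactly) the same length.
    rms = math.sqrt(sum(x * x for x in v) / len(v) + eps)
    return [x / rms for x in v]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def attention_row(q, keys, scale=1.0):
    # Attention weights of one query over all keys. Because q and k are
    # normalized, the logits are bounded; `scale` is the (hypothetical)
    # scalar multiplier that lets the softmax become sharper.
    d = len(q)
    qn = rms_norm(q)
    logits = [
        scale * sum(a * b for a, b in zip(qn, rms_norm(k))) / math.sqrt(d)
        for k in keys
    ]
    return softmax(logits)

def entropy(p):
    # Shannon entropy in nats: lower means a more peaked distribution.
    return -sum(x * math.log(x) for x in p if x > 0.0)

random.seed(0)
d, n = 64, 8
q = [random.gauss(0, 1) for _ in range(d)]
keys = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

diffuse = attention_row(q, keys, scale=1.0)  # no multiplier
sharp = attention_row(q, keys, scale=4.0)    # arbitrary illustrative multiplier
print(f"entropy without multiplier: {entropy(diffuse):.3f}")
print(f"entropy with multiplier:    {entropy(sharp):.3f}")
```

For a fixed set of logits, raising the multiplier can only lower the softmax entropy, so the second number is always the smaller one. Which multiplier is actually best is an empirical question that this toy does not answer.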
All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.
And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
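The loop described above (propose a change, evaluate it on a cheap proxy, keep it if the metric improves, then re-check the winners at a larger scale) can be sketched roughly as follows. Everything here is hypothetical: `toy_loss` is a made-up objective standing in for a real training run, `toy_propose` stands in for an agent, and the depth values only echo the numbers in the post. Transfer to the larger scale holds by construction in this toy, which a real system would have to verify.

```python
import random

def autoresearch(evaluate, propose, baseline, steps, seed=0):
    # Greedy search: keep a proposed change only if it lowers the metric.
    rng = random.Random(seed)
    best, best_loss = dict(baseline), evaluate(baseline)
    accepted = []
    for _ in range(steps):
        candidate = propose(best, rng)
        loss = evaluate(candidate)
        if loss < best_loss:
            accepted.append((candidate, loss))
            best, best_loss = candidate, loss
    return best, best_loss, accepted

def toy_loss(cfg, depth):
    # Made-up stand-in for validation loss. Depth shifts the value but
    # not the optimum, so improvements transfer by construction.
    lr_term = 1e3 * (cfg["lr"] - 0.02) ** 2
    wd_term = (cfg["weight_decay"] - 0.1) ** 2
    return lr_term + wd_term + 1.0 / depth

def toy_propose(cfg, rng):
    # Stand-in for an agent: randomly rescale one hyperparameter.
    new = dict(cfg)
    key = rng.choice(sorted(new))
    new[key] *= rng.uniform(0.8, 1.25)
    return new

def small_scale(cfg):
    return toy_loss(cfg, depth=12)  # cheap proxy metric

def large_scale(cfg):
    return toy_loss(cfg, depth=24)  # more expensive confirmation run

baseline = {"lr": 0.05, "weight_decay": 0.3}
best, best_small, accepted = autoresearch(
    small_scale, toy_propose, baseline, steps=200
)

# Promotion step: only trust the result if it also wins at larger scale.
transfers = large_scale(best) < large_scale(baseline)
print(f"accepted {len(accepted)} of 200 proposals, transfers: {transfers}")
```

A swarm version would run many such loops in parallel and merge the accepted changes. The greedy accept rule is the simplest possible choice and is used here only to keep the sketch short.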
Memo: What If We're Right?
We recently wrote a private letter to partners & friends about a common failure mode: the inability to consistently reason through the daisy chain of downstream consequences when non-consensus, low-probability events actually occur
Tonight, we reached an agreement with the Department of War to deploy our models in their classified network.
In all of our interactions, the DoW displayed a deep respect for safety and a desire to partner to achieve the best possible outcome.
AI safety and wide distribution of benefits are the core of our mission. Two of our most important safety principles are prohibitions on domestic mass surveillance and human responsibility for the use of force, including for autonomous weapon systems. The DoW agrees with these principles, reflects them in law and policy, and we put them into our agreement.
We will also build technical safeguards to ensure our models behave as they should, which the DoW also wanted. We will deploy FDEs to help with our models and to ensure their safety, and we will deploy on cloud networks only.
We are asking the DoW to offer these same terms to all AI companies, which in our opinion everyone should be willing to accept. We have expressed our strong desire to see things de-escalate away from legal and governmental actions and towards reasonable agreements.
We remain committed to serving all of humanity as best we can. The world is a complicated, messy, and sometimes dangerous place.
BREAKING: A letter from Alex Pretti’s Final Nursing Student:
“I was Alex Pretti’s final nursing student. He was my friend and my nursing mentor. For the past four months, I stood shoulder to shoulder with him during my capstone preceptorship at the Minneapolis VA Hospital. There he trained me to care for the sickest of the sick as an ICU nurse. He taught me how to care for arterial and central lines, the intricacies of managing multiple IVs filled with lifesaving solutions, and how to watch over every heartbeat, every breath, and every flicker of life, ready to act the moment they wavered. Techniques intended to heal.
Alex carried patience, compassion and calm as a steady light within him. Even at the very end, that light was there. I recognized his familiar stillness and signature calm composure shining through during those unbearable final moments captured on camera.
It does not surprise me that his final words were, “Are you okay?” Caring for people was at the core of who he was. He was incapable of causing harm. He lived a life of healing, and he lived it well.
Alex believed strongly in the Second Amendment and in the rights rooted in our Constitution and its amendments. He spoke out for justice and peace whenever he could, not only out of obligation, but out of a belief that we are more connected than divided, and that communication would bring us together.
I want his family to know his legacy lives on. I am a better nurse because of the wisdom and skills he instilled in me. I carry his light with me into every room, letting it guide and steady my hands as I heal and care for those in need.
Please honor my friend by standing up for peace, preferably with a cup of black coffee in hand and a couple of pieces of candy in your pocket, just as he would. He would remind you that caring for others is hard work, and we must do whatever it takes to get through the long shifts. Step outside with your dog, breathe in the world, hike or bike as he loved to do, and let yourself find peace in the quiet moments within nature. Stand up for justice and speak with those whose views differ from your own. Hold your beliefs with strength, but always extend love outward, even in the face of adversity.
Take one step, no matter how small, to help heal our world. Through these acts, carry his light forward in his name. Let his legacy continue to heal.”
Congress is not powerless. Democrats must unify around an actual agenda.
1. Vote no on DHS funding bill.
2. Repeal the multi-year $75 billion funding for ICE.
3. End qualified immunity for ICE agents.
4. Investigate and prosecute every single ICE agent who broke the law.
5. Impeach Noem and Bondi.
6. End the Kavanaugh stops with racial profiling and end the militarization of ICE.
7. Codify a use of force standard so courts can enforce the law against rogue ICE agents.
8. Tear down and replace ICE with an agency that has oversight.
Trump is engaged in the SYSTEMATIC destruction of the rule of law.
Only if Congress fights with every legal tool at our disposal including lawsuits in the courts, like we are doing with the Epstein files, can we stop this madness.
We owe that to nurse Pretti and the hundreds of thousands on the streets risking their lives to stand up for our freedoms.