Building societal-scale mitigations for risks from AI, especially in the next few years, is one of the most urgent problems to be working on.
The Center for AI Safety has accomplished a lot in the past 4 years, including field-building initiatives and safety research, and is well known for The Statement on AI Risk and evaluations such as Humanity's Last Exam (HLE).
Thanks to @hendrycks for bringing me on to help make AI safety go well. Excited to lead @CAIS into the next chapter!
Big news from @CAIS:
Devin Kim (formerly @xAI, @scale_AI) joins as President.
We're launching the @FrontierSecInst, a DC-based org bridging frontier AI and the National Security Enterprise.
Frontier AI is a national security technology. It's time to act like it. ⬇️
AI freely criticizes Christianity but refuses to criticize Islam.
AI companies have tried making models unbiased, but progress has been limited.
We show how to measure political bias, and we develop a new training method to reduce it.
Should we care about AI happiness? In our new research, we find evidence of functional AI wellbeing across several independent measures.
We find which AI models are happiest, show how to make them happier, and even test the effects of AI drugs. 🧵
@tszzl @Liv_Boeree it’s not intuitive that GDP and human wellbeing would become less correlated over time as humans are playing a lesser role in the economy?