I have decided to focus my PhD work on preventing catastrophic outcomes from the development of AI systems. Over the past few months, I've had a number of conversations in which I described my reasoning for this choice. To share my thinking more broadly, I wrote a blogpost laying it out.
The TL;DR:
- The capabilities of AI systems are developing fast, and it seems at least plausible that we will get systems that match or vastly exceed human capabilities within the next 10-15 years.
- If we get there, the consequences would likely be massive. Such systems could accelerate progress on important problems, such as curing diseases, tackling climate change, and improving education, but they could also cause significant harm to society.
- In particular, many experts take seriously the possibility that our society could end up permanently locked into a very bad state (e.g., extinction, concentration of power in the hands of a few, loss of control to a powerful AI system).
- Unfortunately, I do not think there are nearly enough people working directly on preventing such catastrophic outcomes. Therefore, I've made reducing these risks the focus of my PhD.
You can check out the full blogpost here: https://t.co/ObM3tnH4IL
I'm not claiming today's models are very dangerous. My point is that it's plausible future models will be extremely dangerous, and that we're not prepared for it. I think people should start thinking about this now, rather than once models have already become significantly more dangerous.
Firstly, the argument in my blogpost doesn't require RSI (recursive self-improvement).
On RSI: I am not confident we will definitely get there, but it seems at least plausible (based on extrapolating trendlines, strong qualitative arguments, and the views of many experts). At the same time, I don't see a strong case for why it's extremely unlikely. Given the effects this would have, I think it's important to take it very seriously.
If you want to research interventions against gradual disempowerment or the intelligence curse, @LRudL_ and I are mentoring a @MATSprogram stream this autumn.
Many people have asked me “what’s the plan to make this go well?” Right now, there’s not one. You should help fix that.
🧵
We’re launching @JudgmentLabs today and announcing $32M in funding.
As AI agents take on more of the work that creates economic value, they generate massive amounts of production data: the clearest record of how they behave with users, software, and the real world.
Judgment builds infrastructure for improving AI agents from production data.
Going to be attending ICLR in Brazil this week to present our work on Energy-Weighted Flow Matching at the DeLTa workshop. We reformulate flow matching via importance sampling to train CNFs for Boltzmann sampling from energy evaluations.
Happy to connect! 🇧🇷
Stoked to be heading to ICLR in Rio next week to present 3 papers with my brilliant colleagues 🌊🇧🇷
If you're around, come say hi, always up for a coffee or catching some waves 🤙 #ICLR
Love it! I do something similar, but go a step further and have a single Obsidian vault in which I also save context on current projects, do my writing, and save preferences/context over time: https://t.co/qGGZj3am0e
I actually rarely use the web interfaces of the LLM providers anymore.
Over the last few months, one of the main ways I've used AI is by running Claude Code sessions on top of an Obsidian vault. I find the setup pretty useful, so I wrote a short blogpost about how it works. 🧵
Lastly, I run a continuous session where I add fine-grained permissions to the settings, so that more and more actions run without me having to approve them (this is even more useful for my coding setup).
There is something inherently wrong with this world. How can there be so many things I'd love to dedicate years of my life to learning and exploring, when my days still only have 24 hours and the number of days I have is limited? Wtf