The world is hitting tank‑bottom oil inventories by September.
Oil inventories work like blood in the human body: you can donate a bit, but below a certain level your blood pressure tanks and organs start failing. You don’t die because blood hits zero; you die because circulation collapses.
That’s where we are with oil. We began the year with more than 8 billion barrels in storage, yet only around 10% was actually usable without pushing the system into stress... that safety margin has now been drawn down.
The next stage is dropping to operational floor levels, where pipelines and refineries start failing – that’s the real tank bottom.
@tadfriend Last but not least, a piece that captures the zeitgeist of people growing more anxious about their AI overlords and the people who birthed them
https://t.co/FcLDUGParh
Resubscribed to the New Yorker a while back because I just like the reporting too much, despite some bias. A couple of pieces I found particularly worthwhile to read:
https://t.co/77T0cN7Po6
by @ddk_nyc
Another example of how, if you just report it and describe it, people should really get the craziness (and perhaps the rip-off nature) of it. California edition
https://t.co/AN2APdsgcU
by @tadfriend
I spent 100 hours over the past week researching, writing and editing the piece we just put out.
It’s a scenario, not a prediction like most of our work. But it was rigorously constructed, and dismissing it outright requires the kind of intellectual laziness that tends to get expensive.
And we’ve released it for free. Hopefully you enjoy it.
https://t.co/YK8E11GcDU
NATO: "We are fucked!"
10 Ukrainian soldiers eliminated 2 NATO battalions in half a day of training.
So, the myth of all-powerful NATO that can stop Russia is no more.
WSJ: NATO forces were horrible and wiped out in a 16,000-troop drill in Estonia. 1/
Today will go down as some kind of turning point. Somewhat arbitrarily, but it is OK if journalists and historians have to present things in that manner.
My bio says I work on AGI preparedness, so I want to clarify:
We are not prepared.
Over the last year, dangerous capability evaluations have moved into a state where it's difficult to find any Q&A benchmark that models don't saturate. Work has had to shift toward measures that are either much more finger-to-the-wind (quick surveys of researchers about real-world use) or much more capital- and time-intensive (randomized controlled "uplift studies").
Broadly, it's becoming a stretch to rule out any threat model using Q&A benchmarks as a proxy. Everyone is experimenting with new methods for detecting when meaningful capability thresholds are crossed, but the water might boil before we can get the thermometer in. The situation is similar for agent benchmarks: our ability to measure capability is rapidly falling behind the pace of capability itself (look at the confidence intervals on METR's time-horizon measurements), although these haven't yet saturated.
And what happens if we concede that it's difficult to "rule out" these risks? Does society wait to take action until we can "rule them in" by showing they are end-to-end clearly realizable?
Furthermore, what would "taking action" even mean if we decide the risk is imminent and real? Every American developer faces the problem that if it unilaterally halts development, or even simply implements costly mitigations, it has reason to believe that a less cautious competitor will not take the same actions and will instead benefit. From a private company's perspective, it isn't clear that taking drastic action to mitigate risk unilaterally (like fully halting development of more advanced models) accomplishes anything productive unless there's a decent chance the government steps in or the action is near-universal. And even if the US government helps solve the collective action problem (if indeed it *is* a collective action problem) in the US, what about Chinese companies?
At minimum, I think developers need to keep collecting evidence about risky and destabilizing model properties (chem-bio, cyber, recursive self-improvement, sycophancy) and reporting this information publicly, so the rest of society can see what world we're heading into and can decide how it wants to react. The rest of society, and companies themselves, should also spend more effort thinking creatively about how to use technology to harden society against the risks AI might pose.
This is hard, and I don't know the right answers. My impression is that the companies developing AI don't know the right answers either. While it's possible for an individual, or a species, to not understand how an experience will affect them and yet "be prepared" for the experience in the sense of having built the tools and experience to ensure they'll respond effectively, I'm not sure that's the position we're in. I hope we land on better answers soon.