Big progress vs cancer, folks.
The kind of event curves from randomized trials that we've not seen before for a couple of the most deadly cancers. Congrats to the oncology research community for getting these trials done. #ASCO26, @ASCO
Put another way, its world model prefers consensus (with a lag) over "truth". How would it behave in 2020?
Feb: There is no pandemic -> experts say not to panic
Mar: Chides and sabotages your attempt to model aerosolized spread
June: Training data recommends against masking
Opus 4.8 is basically unusable for me due to this pattern. It may indicate deeper flaws - e.g. its deference to authority of low-quality search results is borderline data/retrieval poisoning. Then it continues to overrule my instructions despite explicit rules in CLAUDE.md
@TheZvi Opus 4.8: You are attributing bad acts to a respected actor.
Me: I am extremely aware of that, yes.
Opus: But you haven't proved they did the bad acts.
Me: I am confused. The piece includes voluminous evidence, including a senior executive admitting to the bad acts on letterhead.
The honesty and anti-sycophancy changes might be beneficial to the median user, but are absurdly disruptive to anyone working in fast-moving fields or relying on non-public information.
One of the most important and underappreciated trends in the world right now.
1. 100s of billions of dollars will soon be available to solve big problems (making the world resilient to ASI, ending factory farming, etc).
2. The projects and organizations which will turn billions of 2027/28 dollars into impact need to be started NOW.
3. We need really talented people to start and run and work for these new projects. What @nanransohoff calls general managers, who feel personally responsible for solving one of the world’s important problems.
What is especially scarce are detailed visions about what making AI go well looks like. These will help inform what problems these new projects ought to work on.
@Upper20sStCap Midterm elections aren’t that far away, the House of Reps calling Jensen to testify is a real possibility; Jensen being on AF1 puts him closer to the controversy (more liability, less shield); and the enforcement surface area is way larger than just WH export controls
@CJHandmer EM railguns developed thus far can’t handle sustained fire without significant degradation of the launch rails, does a lunar mass driver require novel materials science advances?
An underrated feature of this situation: a private company now has incredibly powerful zero-day exploits of almost every software project you've heard of.
And Hegseth and Emil Michael have ordered the government not to in any capacity work with Anthropic.
Big deal paper here: field experiment on 515 startups, half shown case studies of how startups are successfully using AI.
Those firms used AI 44% more, had 1.9x higher revenue, needed 39% less capital:
1) AI accelerates businesses
2) The challenge is understanding how to use it
Cool project: the DC Waymo delay dashboard tracks how many DC residents are dead because the mayor and city council keep demanding studies instead of allowing Waymo: https://t.co/pnPTTU1NpN
@austinc3301 @AndyMasley @KelseyTuoc I think they just released a new version of the website with substantially reduced downward assumptions - more accurate
@AndyMasley Neat! I wonder if a diverging bar chart or waterfall chart would better highlight the imbalance in favor of systemic actions. Opposed layout > colors. Also Claude Code would be great at generating a socials-friendly share card w visual highlighting eg "125x your personal cuts"
A few people asked me to make an image of how much irrigated farmland would use the same water required for ALL ChatGPT usage, including every part of the process. I did a botec and my best guess right now is that inference uses about as much water as training, and power generation uses ~5x as much water as the data centers themselves, so it looks something like this. The water cost of manufacturing chips is marginal compared to how much water they use over their lifetimes.
@TheZvi Gaming out further what it’d mean for admin to maintain maximalist SCR stance. Would SCR lawsuits lead to reversal in days or months? Are compliance depts going to sever Anthropic ties in the face of threats even if letter of law more limited? Model $$$ at risk etc
@peterwildeford I am having trouble reasoning through all the conditionals and second order effects in a scenario where DoD doesn’t back down from a maximalist SCR position. That could be an interesting exploration.
@moultano King Gizzard & the Lizard Wizard - Infest the Rats' Nest album perhaps worth a mention. Unfortunately they recently pulled their work off Spotify.