My team at @asana worked with @AnthropicAI to test Claude 3.5 Sonnet before its release – and we were very impressed! Read my new #AsanaEngineering blog post to find out what we learned, and how Asana QAs new #LLMs. https://t.co/wIF8oESXfb
Scheduled opposite Boris and Jared on the main stage at Code with Claude today. Come to my breakout anyway? I'll be showing what we've built with Asana AI Teammates 🙏 - I promise it will be 🤯
was messing with the OpenAI base URL in Cursor and caught this
accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast
so composer 2 is just Kimi K2.5 with RL
at least rename the model ID
Doing an AMA on r/Asana next Thursday (March 26, 12 – 1pm PT) about @Asana AI Teammates with my product counterpart Nik Greenberg! Bring questions.
https://t.co/n7JEHMF1c1
Congrats to @AnthropicAI on Claude Opus 4.6! We tested it early at @Asana – it's the best model we've seen yet. Already powering our AI Teammates internally. The pace of improvement is wild 🚀
https://t.co/v2bQeoOAQL
Fascinating to see how heavily OpenAI has optimized Sora 2 (either by training or prompting) for meme-ability. Whatever you put in it leans toward making a meme.
Sora 2 is the strongest sign yet that OpenAI has become a product company with a great AI lab attached. The tech is impressive, but the product experience is front-and-center, and seems to be what they expect to drive usage.
Anyway here's a rave raccoon: https://t.co/84wCdEovTg
We've been creating and adapting various context engineering techniques to the Asana work graph since December – and now we're sharing some of what we've learned.
Here's how we power AI Studio and make our agents more effective: https://t.co/qaACGaBTnI
As a neurosurgeon I care a lot about road safety.
By now you’ve probably seen @Waymo’s stunning safety results (like 91% fewer serious crashes). But they didn’t just publish data headlines. They released the raw CSV files and data dictionaries.
I did a much deeper analysis. A fascinating story emerges when you analyze how they’re achieving this.
This isn’t incremental improvement - it’s categorical. We’re looking at the potential elimination of traffic deaths as a leading cause of mortality.
The intersection breakthrough: Waymo has essentially solved intersection crashes, with 95% fewer injury incidents than human drivers in the same locations. That’s transforming the deadliest driving scenario.
The national math: If every US vehicle performed like Waymo, we’d prevent 33,000-39,000 deaths annually and save $0.9-1.25 trillion in societal costs. Even partial adoption at 27% would save ~10,000 lives per year. In terms of magnitude, this would be the equivalent of eliminating every pedestrian death nationally in a year.
The physics signature: what fascinates me most is that 47% of Waymo’s contacts involve less than 1 mph delta-V. They’re not just avoiding crashes; they’re converting unavoidable incidents into gentle bumps. It’s like having physics itself on your side.
We’re not talking about marginal safety gains. The data represents a fundamental shift from harm reduction to harm prevention.
The methodology matters: I used their dynamic geographic benchmarks (comparing like-for-like road conditions) and verified the findings hold across San Francisco, Phoenix, LA, and Austin. The safety advantage actually increases in more complex urban environments.
Link to raw data below.
Notes on my approach:
Analysis based on 96 million miles of Waymo Rider-Only (RO) data through June 2025, utilizing Waymo's dynamic geographic benchmarks to compare Waymo Driver performance against human drivers under similar road conditions and operational design domains.
The projections for national impact (deaths prevented, societal costs) involve several assumptions. Given Waymo's zero reported fatalities, the direct serious injury reductions were mapped to national fatality statistics using established NHTSA-derived ratios that correlate serious injury crash rates with fatality rates. This extrapolation assumes that Waymo's observed serious injury prevention capability would translate proportionally to fatality prevention. Societal cost savings are estimated by applying average per-fatality and per-injury economic costs (e.g., medical, lost productivity, quality of life) as published by NHTSA, scaling these national averages to the projected number of avoided fatalities and injuries based on Waymo's safety performance. These figures represent the potential annual impact if the Waymo Driver's safety profile were widely integrated into the national fleet.
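A rough sketch of the partial-adoption arithmetic, for anyone who wants to check it. It assumes the benefit scales linearly with the share of the fleet performing like Waymo, which is a simplification rather than something the notes above spell out; the function name is hypothetical. The inputs (33,000-39,000 deaths prevented at full adoption, 27% adoption) are the figures quoted in the thread.

```python
def partial_adoption_lives_saved(full_low, full_high, adoption_share):
    """Scale a full-adoption range of deaths prevented by an adoption share.

    Assumption (not confirmed by the source): the benefit is linear in the
    adoption share, ignoring any interaction effects between vehicles.
    """
    return full_low * adoption_share, full_high * adoption_share


# Figures quoted in the thread: 33,000-39,000 deaths prevented per year
# at full adoption, and a partial-adoption scenario of 27%.
low, high = partial_adoption_lives_saved(33_000, 39_000, 0.27)
print(f"{low:,.0f} to {high:,.0f} lives per year")
```

Under that linear assumption the 27% scenario lands at roughly 8,900 to 10,500 lives per year, which is consistent with the ~10,000 figure.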
@ethanteicher
Thinking about reactions to GPT-5 over time. At launch the prevailing sentiment was underwhelm, but now people have kicked the tires and look on it quite favorably. My guess is that as models become more advanced, we'll need longer time horizons to judge their capabilities.
@swyx I was much more concerned with evals a year ago than I am now. The reality is that as frontier models have progressed to be great at most of the tasks we throw at them, the utility of evals has decreased.
@rowancheung Made one two days ago that prevents me from using social media until I meditate. It's working wonders for getting my meditation habit back on track.
I quite like how well @arcprize shows the distribution of GPT-5 variant capabilities, from 1.5% (GPT-5 Nano, Minimal) to 65.7% (GPT-5 High).
Some other things that seem interesting:
- 'Thinking' really matters for GPT-5: 6% for 'Minimal' to 65.7% for 'High'. The difference is massive.
- The 'Minimal' setting is generally quite poor: GPT-5 Nano 'Medium' is much higher than GPT-5 'Minimal'.
- GPT-5 Mini (High) is similar to GPT-5 (Medium), but at about half the price.
- Nano models are quite far behind, but they’re nearly free.
Obviously I am biased and use @asana way more than most knowledge workers, but our AI Workflows product (which will be released more widely soon) has totally changed the way I work.