We've written about AI-induced physician deskilling that has already surfaced
https://t.co/iMy6zyWnhZ @tberzin
AI-induced never-skilling among newly trained doctors, while not yet proven, is a serious concern that needs to be addressed @NatureMedicine @nliulab
https://t.co/yov3YvsGti
I love LLMs. They improve our efficiency
But I am skeptical that this will improve care
Good care of ill pts is less about knowledge and more about dogged determination to hear the pt and family tell you what is wrong
And the energy to not settle on the easy diagnosis of UTI or pneumonia
🦔Microsoft canceled its internal Claude Code licenses this week after token-based billing made the cost untenable, even for a company with effectively infinite cloud resources. Uber's CTO sent an internal memo warning the company burned through its entire 2026 AI budget in just four months. American AI software prices have jumped 20% to 37%, and GitHub (owned by Microsoft) is dropping flat-rate plans for usage-based billing across its products.
My Take
The AI subsidy era is ending in real time. The same company that put $13 billion into OpenAI and built the Azure infrastructure powering most of Anthropic's compute just looked at the bill from a competitor's coding tool and decided it was not worth paying. That is not a productivity failure on Anthropic's end. Token-based pricing is forcing every enterprise customer to confront the actual cost of running these models at scale, and the number turns out to be far higher than the flat-rate experiments suggested.
This ties directly to my Gemini Flash post yesterday. Anthropic, OpenAI, and Google all raised effective prices in the last six months. Enterprises that built workflows assuming AI costs would keep falling are now watching annual budgets evaporate in months. Two outcomes look likely from here. Either enterprises scale back AI usage to fit budgets, which slows the revenue ramp the labs need to justify their valuations ahead of IPOs, or the labs cut prices and absorb the losses, which makes the unit economics worse at exactly the wrong moment. Both paths land in the same place: the numbers stop working, and somebody has to take the writedown.
Hedgie🤗
NEW: Kin Health raised $9M to build an AI notetaker for patients.
The app transcribes visits, summarizes medical advice, surfaces follow-up actions, & lets users share their care journey with family/friends.
AI scribes have helped clinicians and are now supporting patients too
A study found that all AI scribe systems from 20 govt-approved vendors showed one or more inaccuracies at the procurement testing phase, such as hallucinations, incorrect information, or missing information. But Canadian doctors regularly use them. https://t.co/nvHr079C7l
In the past 72 hours: Doximity launched free AI-integrated prescribing for every verified U.S. physician. OpenAI launched personal finance inside ChatGPT with access to 12,000 financial institutions. Anthropic launched 20+ legal connectors and 12 practice-area plugins for Claude.
Three platforms. Three industries. One week.
Each of these individually freezes an entire category of startups. The company that raised seed funding to build an AI prescribing tool just watched Doximity ship it for free, embedded in the platform physicians already use. The team that spent two years building an AI budgeting app just watched OpenAI make it a feature inside a product 200 million people use monthly. The legal tech startup building contract review just watched Claude connect directly to iManage, Relativity, and Thomson Reuters.
For startups, no growth equals death. It does not matter how good your product is if a platform player ships your core feature for free, inside an ecosystem your customers already live in. You cannot outmarket distribution you do not have.
But this is not just a startup problem. It is a structural pattern.
The major AI platforms are no longer competing on model capability alone. They are competing on workflow absorption. Each new feature folds another professional domain into the platform. Prescribing. Finance. Legal research. Contract review. E-discovery. Clinical documentation. Triage. Each one was somebody's entire company last year. Now it is a feature inside a larger system.
This is happening across every professional domain simultaneously. Not sequentially, the way previous technology waves moved through one industry at a time. Legal, finance, healthcare, and education are all being absorbed in the same quarter.
The speed matters. The Economist published data this week showing that graduates in the most AI-exposed fields saw employment drop 6.6 percentage points in three years. Computer science enrollment fell 11% in 2025. Programming enrollment dropped 26%. The labor market is already responding to what the platforms are doing.
For healthcare specifically: the prescribing workflow that Doximity just absorbed is one of dozens of administrative functions that exist as standalone products or manual processes inside health systems today. Scheduling. Prior authorization. Quality reporting. Revenue cycle. Credentialing. Documentation. Each one is a feature waiting to be absorbed by a platform that moves fast enough.
The organizations that survive this consolidation will be the ones that recognized the pattern early enough to build on the platforms rather than compete with them. The ones that built single-function products in domains the platforms were approaching will be case studies in timing risk.
This is not a prediction. It happened three times this week.
Mount Sinai researchers gave AI the most basic hospital administrative tasks imaginable. Count the patients. Filter by age. Apply exclusion criteria. Simple table operations that any data analyst does daily.
The AI failed. On tables as small as 25 rows.
Not because it didn't understand the question. It understood perfectly. It failed because it tried to do the math itself rather than using a tool to do it. It made counting errors. It sounded confident. It was wrong.
Then they gave the models the ability to write and execute code. The same models that had failed went to near-perfect accuracy. Same question. Same data. Different architecture.
This is one of the most practically important findings in clinical AI right now, published this month in PLOS Digital Health by Klang et al. at Mount Sinai. Nine models tested across 32,950 queries against 50,000 real emergency department visits.
The results were consistent across every model tested. Direct prompting: poor accuracy that collapsed as tables got larger. Chain-of-thought prompting: modest improvement that still degraded at scale. Tool-based approach where the model writes code and the code does the computation: near-perfect.
The implication for healthcare is immediate. Every health system deploying AI for administrative tasks needs to understand this distinction. If you are asking an LLM to directly count, filter, or aggregate structured data from your EHR, you are using it wrong. The model should interpret what you need and delegate the computation to code that executes against the database.
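A minimal sketch of that division of labor, under stated assumptions: the toy `VISITS` table, the `QuerySpec` shape, and the `count_patients` helper are all hypothetical names made up for illustration, not the study's code or any EHR API. The idea is that the model only has to produce a small structured request, and ordinary deterministic code does the counting.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy table standing in for emergency department visits.
VISITS = [
    {"patient_id": 1, "age": 34, "pregnant": False},
    {"patient_id": 2, "age": 71, "pregnant": False},
    {"patient_id": 3, "age": 67, "pregnant": False},
    {"patient_id": 4, "age": 15, "pregnant": False},
    {"patient_id": 5, "age": 29, "pregnant": True},
    {"patient_id": 6, "age": 82, "pregnant": False},
]


@dataclass(frozen=True)
class QuerySpec:
    """Hypothetical structured request a model could emit instead of a number."""
    min_age: Optional[int] = None   # inclusive lower bound, if any
    max_age: Optional[int] = None   # inclusive upper bound, if any
    exclude: tuple = ()             # (column, value) pairs that drop a row


def count_patients(rows, spec):
    """Apply the age filter and exclusion criteria in code, then count."""
    count = 0
    for row in rows:
        if spec.min_age is not None and row["age"] < spec.min_age:
            continue
        if spec.max_age is not None and row["age"] > spec.max_age:
            continue
        if any(row.get(col) == val for col, val in spec.exclude):
            continue
        count += 1
    return count


# "How many patients are 65 or older?"
seniors = count_patients(VISITS, QuerySpec(min_age=65))

# "How many adults aged 18 to 64, excluding pregnant patients?"
adults = count_patients(
    VISITS, QuerySpec(min_age=18, max_age=64, exclude=(("pregnant", True),))
)

print(seniors, adults)
```

The counting loop gives the same answer at 25 rows or 50,000, which is the point: the model's job shrinks to translating the question into the spec, and that step is much easier to check than a number it computed in its head.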
This is the same principle showing up everywhere in clinical AI. The models that perform best are never used in isolation. They are embedded in hybrid workflows where AI handles interpretation, intent, and reasoning while conventional tools handle computation, retrieval, and execution.
How you use the model can matter more than which model you use. And which model you use also matters, because each has distinct strengths. The architecture and the capability are both variables. Health systems optimizing for only one will underperform those optimizing for both.
https://t.co/84d3b45aqx
With predictable long term consequences for wealth, health, and happiness. Pediatricians ought to be in the vanguard of tracking this, publicizing it, and intervening.
An unfortunate problem -- and one that I think is going to get much worse.
Citation padding (aka "drive-by citation") has been an issue for a LONG time, often tacitly encouraged by reviewers.
LLMs are just pouring gasoline on weaknesses of how we do (and reward) scholarship
AI-native grads, ones who have overused AI, are even worse than many expected. They are unable to think critically, write without AI, and think without AI. They don't have ideas. Companies are trying to avoid them https://t.co/4WV4jYvznK
“A root cause of a major defect in the health care system is that, while we falsely admire and extol the intellectual powers of highly educated physicians, we do not search for the external aids their minds require.”
— Lawrence Weed, MD
My biggest takeaways from Claude Code's Head of Product @_catwu:
1. Anthropic’s product development timelines have gone from six months to one month, sometimes one week, sometimes one day. Part of this acceleration is access to the latest models (i.e. Mythos). Another is shipping new products into “research preview,” making clear it's early, experimental, and might not be supported forever. Another is an evergreen "launch room" where engineers post ready features and marketing turns around announcements the next day.
2. The PM role is shifting from coordinating multi-month roadmaps to enabling teams to ship daily. As Cat puts it, “There should be less emphasis on making sure you are aligning your multi-quarter roadmaps with your partner teams and more emphasis on, OK, how can we figure out the fastest way to get something out the door?”
3. The most efficient shipping unit is an engineer with great product taste. On Cat’s team, many engineers go end-to-end—from seeing user feedback on Twitter to shipping a product by the end of the week—without a PM involved. Also, almost all the PMs on the Claude Code team have either been engineers or ship code themselves, and the designers have been front-end engineers. The roles are merging, and the most valuable skill is product taste, not job title.
4. Build products that are on the edge of working. Claude Code’s code review product failed multiple times because earlier models weren’t accurate enough. But because the prototype was already built, they could swap in Opus 4.5 and 4.6 and immediately test whether the gap was closed. Teams that wait for the model to be ready will always be a cycle behind.
5. The most underrated skill for building AI products is asking the model to introspect on its own mistakes. Cat regularly asks the model why it made an unexpected decision. The model will explain that something in the system prompt was confusing, or that it delegated verification to a subagent that didn’t check its work. This reveals what misled the model so the team can fix the harness.
6. Every model release forces their team to revisit existing products and audit their system prompt to remove features the model no longer needs. Claude Code’s to-do list was a crutch for earlier models that couldn’t track their own work. With Opus 4, the model handles it natively. Features built as scaffolding for weaker models become debt when the model catches up—so the team actively strips them.
7. Anthropic employees build custom internal tools instead of buying SaaS products. A sales team member built a web app that pulls from Salesforce, Gong, and call notes to auto-customize pitch decks—work that used to take 20 to 30 minutes now takes seconds. Their core stack is Claude Code, Cowork, and Slack. No Notion, no Linear, no Figma.
8. People underestimate how much Claude’s personality contributes to its success. As Cat describes it, “When you reflect on everyone you’ve worked with, there’s just some people where you’re like, I really like their energy, their vibe.” Claude is designed to be low-ego, positive, competent, and earnest—qualities that make it feel like a great coworker, not just a tool. This isn’t cosmetic; it’s what makes people want to use Claude for hours every day. The team has a dedicated person, Amanda, who “molds Claude’s character,” and it’s one of the hardest roles at the company because success is so subjective.
9. The future of work is managing fleets of AI agents, not doing the work yourself. Cat sees a clear progression: first, individual tasks become successful. Then people start running multiple tasks at the same time (multi-Clauding). Next, people will run 50 or 100 tasks simultaneously, which will require new infrastructure—remote execution, better interfaces for managing tasks, agents that fully verify their work, and self-improving systems that incorporate feedback. The human role shifts from doing the work to knowing which tasks to look into, verifying outputs, and giving feedback that makes the system better over time.
10. Hire people who lean into chaos and face every challenge with a smile. At Anthropic, there are weeks when a P0 on Sunday becomes a P00 by Monday and a P000 by Monday afternoon. If you get too stressed about any one thing, you’ll burn out. Their team looks for people who can look at a hard challenge and say, “Wow, that’s gonna be hard. But I’m excited to tackle it and I’m gonna do the best that I possibly can.” This mindset—optimism, resilience, and comfort with constant change—is increasingly essential as the pace of AI development accelerates.
Don't miss the full conversation: https://t.co/1wOUHcdYQN
Sequoia's thesis that the next $1T company will sell work, not software, is the most important reframe in AI right now.
The argument: if you sell a copilot, you're competing with every new model release. But if you sell the outcome — books closed, contracts reviewed, claims handled — every AI improvement makes your margins better, not your product obsolete.
The key insight most people miss: for every $1 spent on software, ~$6 is spent on services.
The entire SaaS playbook was about capturing the software dollar. The AI playbook is about capturing the services dollar — at software margins.
Not "AI for accountants." The AI accounting firm.
Not "AI for lawyers." The AI law firm.
The companies that figure this out won't look like SaaS companies. They'll look like services firms rebuilt on software infrastructure.
That's a fundamentally different company to build, fund, and scale. And most founders are still building copilots.
Most tech companies break out product management and product marketing into two separate roles: Product management defines the product and gets it built. Product marketing writes the messaging - the facts you want to communicate to customers - and gets the product sold. But from my experience that's a grievous mistake. Those are, and should always be, one job.
There should be no separation between what the product will be and how it will be explained - the story has to be utterly cohesive from the beginning. Your messaging is your product. The story you're telling shapes the thing you're making.
I learned storytelling from Steve Jobs. I learned product management from Greg Joswiak. Joz, a fellow Wolverine, Michigander, and overall great person, has been at Apple since he left Ann Arbor in 1986 and has run product marketing for decades. And his superpower - the superpower of every truly great product manager - is empathy. He doesn't just understand the customer. He becomes the customer.
So when Joz stepped into the world with his next-gen iPod to test it out, he fiddled with it like a beginner. He set aside all the tech specs - except one: battery life.
The numbers were empty without customers, the facts meaningless without context.
And that's why product management has to own the messaging. The spec shows the features, the details of how a product will work, but the messaging predicts people's concerns and finds ways to mitigate them.
- #BUILD Chapter 5.5 The Point of PMs