Emergency Physician - East Texas, home is NYC, Tour Medical Dir - Houston, Dallas Symphonies, Ex-Regional Director Envision, Juilliard-trained Violinist, RE/EVs
@jselanikio It’s so clear to me that medicine will shrink and hospitals will go out of business. Almost impossible to imagine. But I have absolutely no doubt.
Just the latest study; when scaled, it will decimate profitable orthopedic procedures:
Use of GLP-1 drugs and reduction of the need for knee replacement. This would be expected from the effect of weight loss, but another study in osteoarthritis found benefit independent of weight loss, attributed to GLP-1 anti-inflammatory action and cartilage protection.
https://t.co/mPvgkNoNYc
Record ED volume, but local factors at play—including migration patterns, etc.—impact it. That’s why we really need contemporaneous national and international data.
By the time we see that data, snowballing is likely underway.
Looking for a multifocal signal, such as X posts about decreasing staffing due to lower volume. I have yet to see even those, though that may be something people don’t publicize.
Agree.
Just trying to translate nuance. If that’s even possible!
Most administrators won’t be in their positions a decade from now when the shift is likely to be massive.
In the short term, the ebbs and flows are hard to model because, from the clinical side, it can seem like we are constantly inundated.
There is a velocity mismatch to contend with.
Previous transitions gave workers decades to retrain. AI capability is advancing faster than humans can retrain, often faster than they can even make the decision to retrain.
A worker who begins a two-year retraining program today is retraining for a capability landscape that will have shifted multiple times before they finish.
There is a fallacy about the Luddite Fallacy, and almost nobody articulates it.
The original Luddites smashed looms because they believed machines would destroy all work (and really, that they would not share the spoils). They were wrong. The industrial revolution displaced specific jobs and created new industries. Employment shifted. Society adapted over decades.
The lesson economists drew: technology displacement fears are always wrong. The Luddites were wrong, therefore anyone worried about AI displacement is wrong.
That dismissal travels under a name: the Luddite Fallacy, the label economists give to the fear itself. And treating it as a universal law rather than a historical observation about a specific kind of transition is its own fallacy.
Here is why the pattern may not hold.
Every previous technological revolution replaced human muscle. The plow, the loom, the engine, the assembly line. Each one automated physical labor and pushed workers up into cognitive labor. There was always a higher tier to retreat to.
AI replaces cognition. There is no obvious higher tier above it for the broad workforce to move into.
Wassily Leontief, Nobel laureate in economics, made this point in 1983 using the analogy of the horse. When early machines arrived, horses did not lose their jobs. Their work shifted from fields to cities. The car did not make the horse's job easier. It eliminated the horse from the economy entirely. The U.S. horse population peaked around 1915 and collapsed over the following decades. Horses could not move up the cognitive ladder. They were the labor.
The "co-pilot" framing in healthcare is worth examining through this lens. AI will augment physicians, not replace them. AI will handle the boring parts so clinicians can focus on the human connection. These statements may be true in the short term. But they describe a transitional phase, not a permanent architecture.
Volume expansion comes first. Workflow compression follows. Role redefinition follows that. That is not replacement. But it is not the co-pilot narrative either. It is a structural transformation that the co-pilot framing is not designed to describe.
New jobs will emerge. They always do. But the assumption that they will emerge at the same rate, at the same scale, and accessible to the same people whose jobs were displaced is not a law of economics. It is an extrapolation from transitions that operated on a fundamentally different substrate.
The pattern matching to the original Luddites only works if the pattern actually matches. This time, the substrate is different. The speed is different. The breadth is different. Whether the outcome is different deserves more honesty than either side is currently offering.
The implications are larger than the post lets on.
Korea runs one of the world’s best national hypertension programs. Control rate near 59 percent as of 2022. Better than the US, better than most. And even there, roughly 40 percent remain uncontrolled.
Korea’s most prescribed antihypertensive is the ARB, with dual therapy more common than monotherapy. The best-controlled country already leans on the exact regimens JAMA just flagged as best tolerated.
The drugs are not the variable. Execution is.
Find the uncontrolled. Start therapy. Intensify on schedule. Distinguish real drug intolerance from background symptoms using randomized data. Track refill gaps. Prevent drift.
That is a task description for software. Not replacing clinical judgment at the edges, but ensuring the routine protocol actually runs for every patient, every time.
AI does not need to solve resistant hypertension to transform outcomes here. It needs to close the execution gap on the majority of cases that are straightforward and currently failing because nobody followed up.
Four in ten uncontrolled in the best system on earth, and far more everywhere else. That is not pharmacology. It is workflow, and it will not be fixed by the same workflow.
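The routine loop described above (find the uncontrolled, notice when nothing was intensified, track refill gaps) can be sketched in a few lines. This is a minimal illustration, not a clinical tool: the `Patient` fields, the flag names, the 140/90 cutoff, and the 7-day refill grace period are all assumptions made for the example and would come from local protocol in practice.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Illustrative thresholds only: assumptions for this sketch, not clinical guidance.
SYSTOLIC_LIMIT = 140
DIASTOLIC_LIMIT = 90
REFILL_GRACE_DAYS = 7


@dataclass
class Patient:
    name: str
    readings: List[Tuple[int, int]]        # (systolic, diastolic) per visit, oldest first
    regimen_changed_at_visit: List[bool]   # one entry per visit, same order as readings
    days_since_last_fill: int
    days_supply: int


def is_uncontrolled(reading: Tuple[int, int]) -> bool:
    systolic, diastolic = reading
    return systolic >= SYSTOLIC_LIMIT or diastolic >= DIASTOLIC_LIMIT


def flags_for(patient: Patient) -> List[str]:
    """Return the follow-up flags that apply to one patient."""
    flags = []
    # Find the uncontrolled: the most recent reading is above the cutoff.
    if patient.readings and is_uncontrolled(patient.readings[-1]):
        flags.append("uncontrolled")
    # Prevent drift: three uncontrolled visits in a row with no regimen change.
    recent = patient.readings[-3:]
    recent_changes = patient.regimen_changed_at_visit[-3:]
    if len(recent) == 3 and all(is_uncontrolled(r) for r in recent) and not any(recent_changes):
        flags.append("no_intensification")
    # Track refill gaps: the last fill has run out, plus a grace period.
    if patient.days_since_last_fill > patient.days_supply + REFILL_GRACE_DAYS:
        flags.append("refill_gap")
    return flags


def build_worklist(patients: List[Patient]) -> Dict[str, List[str]]:
    """Keep only the patients who need some action, keyed by name."""
    worklist = {}
    for p in patients:
        flags = flags_for(p)
        if flags:
            worklist[p.name] = flags
    return worklist


# Hypothetical three-patient panel.
panel = [
    Patient("A", [(158, 100), (158, 100), (158, 100)], [False, False, False], 20, 30),
    Patient("B", [(128, 78)], [False], 45, 30),
    Patient("C", [(124, 80)], [False], 10, 30),
]
worklist = build_worklist(panel)
print(worklist)
```

Nothing here replaces judgment at the edges. The sketch only makes sure the three routine questions get asked for every patient on every run, which is the part the post argues is currently failing.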
This post was written with Opus 4.6, 4.8
Oversight by GPT5.5-Thinking Extended
Insights are mine.
Hypertension is the most common chronic condition in medicine. It is also one of the most treatable. Five drug classes. Decades of outcome data. Generics that cost pennies.
And still, roughly half of adults with hypertension never reach control.
The reasons are not mysterious. Clinicians hesitate to start therapy. They hesitate to combine. They hesitate to intensify. And patients stop, sometimes because a drug caused a symptom, often because a symptom was simply misattributed to the drug.
A network meta-analysis published in JAMA on May 28 puts hard numbers on that problem. 716 double-blind randomized trials. 159,362 participants. Treatment discontinuation due to adverse events as the measure of real tolerability.
On the outcome that actually matters, stopping the drug because of a side effect, ARB monotherapy and ARB plus calcium channel blocker beat placebo. Four of the top five best-tolerated regimens contained an ARB. Several combinations were better tolerated than some common monotherapies.
Now sit with what the routine task actually is. Confirm the diagnosis. Pick from five well-characterized classes. Apply combination rules with known interaction profiles. Distinguish a drug effect from a coincidence using randomized data the prescriber has usually never read. Adjust at intervals. Repeat.
There is real judgment at the edges. CKD, pregnancy, frailty, resistant disease. But for most routine hypertension care, the problem is not rare diagnostic brilliance. It is consistent execution.
That is close to the ideal substrate for algorithmic support.
When this comes up, senior clinicians reach for sophisticated-sounding objections. The art of medicine. The unmeasurable patient in front of you. The danger of cookbook care. Those are real concerns at the edges. They do not explain why a patient sits at 158 over 100 across three visits with no change to the regimen.
The cognition was rarely the hard part. The consistency was. Following the rule every time, for every patient, without drift. That is precisely where well-designed software outperforms human memory and habit.
The drugs exist. They work. Several are better tolerated than clinicians assume. The failure is not pharmacology. It is everything between the prescription and the refill, and that gap is one of the most automatable failure points in chronic disease.
💬Editorial: Among people with #hypertension, adherence to blood pressure–lowering drugs remains suboptimal, often due to adverse effects such as fatigue, dizziness, and edema. Prescribing better-tolerated regimens like ARB combinations may help improve medication persistence.
https://t.co/ZcwF9b570n
You obviously have a vastly larger impact than that.
That being said, in my neck of the woods, pre-hospital care does much of the saving these days.
For example, in the past some of the true saves would be patients with acute pulmonary edema awoken at 5am severely dyspneic and hypoxic. Those would feel like true saves.
But now EMS gets them on bipap, nitro and they are often no longer in distress when they arrive.
But pretty sure I’m saving more than 1 patient every 10 years. And no illusions here. Lots of patients come in for upper respiratory infections or bronchitis and would do fine without my care.
100%. Even just medical care - we haven’t even begun yet. When models can diagnose dermatological rashes and manage simple conditions, global use will go through the roof, both from people who never had access and from people who have access and want their daily questions answered. People have no idea what’s about to happen. Ask a physician how many questions their spouse asks when they have access. Now model this out to the entire population.
Everyone in healthcare is waiting for AI to get smarter before they deploy it. That is the wrong thing to wait for.
For a large class of administrative tasks, the capability has arrived. If a task is self-contained, the data lives in one place, in a readable form, with a clear rule set and a checkable output, current frontier models perform at the level where close human oversight is the only thing required.
Assigning an E/M level from a completed clinical note. Drafting a denial appeal from the chart. Abstracting a quality metric from structured fields. Verifying a credentialing document against a checklist. For these, the model is not the limiting factor. A smarter model makes them cheaper. It does not unlock a capability that is missing today.
So why does deployment lag so far behind capability?
Because almost no hospital task is actually self-contained in practice.
Assemble this prior authorization packet. Sounds simple. It actually requires pulling data from six systems. Three have no clean way to connect. One is a fax. The payer rule is written in one place and applied differently in another. Half the context a human uses to complete the task was never written down. It lives in the head of someone who has done the job for fifteen years.
The model could do the reasoning instantly if it could see the inputs. It cannot see the inputs. That is not an intelligence problem. It is a plumbing problem.
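One way to make the "plumbing, not intelligence" distinction concrete is to audit a task for the conditions listed above: data reachable, in readable form, rules written down, output checkable. The sketch below is only an illustration. The class names, the `plumbing_blockers` function, and the six system names in the prior authorization example are hypothetical, and the split of three unconnected systems plus a separate fax is one reading of the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InputSource:
    name: str
    has_interface: bool      # can software pull from it directly?
    machine_readable: bool   # structured data or text, as opposed to a fax image


@dataclass
class Task:
    name: str
    sources: List[InputSource]
    rules_written_down: bool
    output_checkable: bool


def plumbing_blockers(task: Task) -> List[str]:
    """Return the reasons a task is not yet self-contained; an empty list means ready."""
    blockers = []
    for src in task.sources:
        if not src.has_interface:
            blockers.append(f"{src.name}: no clean way to connect")
        if not src.machine_readable:
            blockers.append(f"{src.name}: not machine readable")
    if not task.rules_written_down:
        blockers.append("rules or context live only in staff memory")
    if not task.output_checkable:
        blockers.append("output cannot be checked")
    return blockers


# A self-contained task: one readable input, a clear rule set, a checkable output.
em_coding = Task(
    "Assign E/M level from completed note",
    [InputSource("clinical note", True, True)],
    rules_written_down=True,
    output_checkable=True,
)

# Hypothetical prior authorization packet: six systems, three with no
# connection, one a fax, and context that was never written down.
prior_auth = Task(
    "Assemble prior authorization packet",
    [
        InputSource("EHR", True, True),
        InputSource("scheduling", True, True),
        InputSource("lab system", False, True),
        InputSource("imaging archive", False, True),
        InputSource("payer portal", False, True),
        InputSource("referral fax", True, False),
    ],
    rules_written_down=False,
    output_checkable=True,
)

for task in (em_coding, prior_auth):
    print(task.name, "->", plumbing_blockers(task) or "ready")
```

None of the blockers in the second task would be removed by a smarter model. Each one is an integration or documentation job.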
The frontier worth watching is not a bigger benchmark score. It is whether the messy operational reality of a hospital can be made legible to a model that is already capable enough.
The hospitals that move first will not be the ones waiting for the smartest model on top of broken data infrastructure. They will be the ones connecting their systems, normalizing their data, and building the access layer for the capability that already exists.
The models are already good enough. The pipes are not.
Anthropic released Opus 4.8 today.
Four frontier labs. Look at how tightly they cluster.
Agentic coding: 69.2, 64.3, 58.6, 54.2. Agentic computer use: 83.4, 82.8, 78.7, 76.2. Financial analysis: 53.9, 51.5, 51.8, 43.0. Multidisciplinary reasoning without tools: all four sit between 41 and 50.
The lead changes by the task. Opus 4.8 tops most columns. But GPT-5.5 beats it on terminal coding, 78.2 to 74.6. There is no single best model. There is a frontier, and four labs are crowded onto it within a few points of each other.
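For anyone who wants to check the clustering rather than eyeball it, here is a small sketch that computes two gaps per category from the scores quoted above: first place over second place, and first place over last. The scores are copied as given. The post does not say which number belongs to which lab, so no lab labels are attached, and the `cluster_stats` helper is just an illustrative name.

```python
# Scores as quoted in the post, in the order given.
scores = {
    "agentic coding": [69.2, 64.3, 58.6, 54.2],
    "agentic computer use": [83.4, 82.8, 78.7, 76.2],
    "financial analysis": [53.9, 51.5, 51.8, 43.0],
}


def cluster_stats(values):
    """Gap between first and second place, and between first and last."""
    ranked = sorted(values, reverse=True)
    return {
        "lead": round(ranked[0] - ranked[1], 1),
        "spread": round(ranked[0] - ranked[-1], 1),
    }


stats = {task: cluster_stats(vals) for task, vals in scores.items()}
for task, s in stats.items():
    print(f"{task}: lead {s['lead']}, spread {s['spread']}")
```

On these figures the lead over second place is under five points in every category (4.9, 0.6, and 2.1), while the distance from first to fourth is wider (15.0, 7.2, and 10.9).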
Two things matter for anyone planning around this technology.
First, the whole cluster moved up again in a single release cycle. Not one model. The frontier. This is not the first time this year that a new release stepped the entire field forward within weeks of the last one. The pattern is not slowing.
Second, the competition is the accelerant. When four labs are this close, each release forces the others. No one gets to rest on a lead because the lead lasts weeks. That dynamic does not produce a plateau. It produces exactly what the scoreboard shows: relentless, compounding, broad-based improvement.
For health systems: any cost-benefit analysis, any build-versus-wait decision, any pilot scoped to today's capability is working against a target that moves every few weeks. The model you evaluate this quarter is not the model you deploy next quarter. It will be cheaper and more capable. So will three competitors.
The multidisciplinary reasoning numbers are the ones I watch most. That capability is closest to clinical work, and it is climbing across every lab at once.
The entire frontier took another step today. All four labs. Same month.
Written by Opus 4.8 with Opus 4.6 oversight.
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
Available today at the same price.
@alojohhardcore @stevenleebeyer1 @jonathanmui In my opinion, the financial media with wider audiences provide information that is just as bad, along with plenty of hype. Even the WSJ, FT, etc.
@alojohhardcore @stevenleebeyer1 @jonathanmui The ironic thing is, AJ, that the popularity of those other accounts, despite such flawed information, actually provides even more asymmetry.
So, we might complain about them, but they actually help those who want to do better.