Holy moly! This is an agentic workflow for medicine!
I was looking up a topic on PubMed and wondering how long it would take me to get through 4000 articles. Usually I get fed up by the 5th or 6th page; around 100 citations burns a hole in my brain
However, I just needed this info desperately and Hermes agent came to my rescue
Set up the whole workflow to extract all 3605 records to a markdown file
I thought it would take a couple of hours
By the time I had bathed, in just 15 minutes, the extracted file was on my desktop and had been sent to Telegram
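For anyone curious what a workflow like this could look like under the hood, here is a minimal sketch. It is not what the agent actually ran (the post doesn't say). It assumes the records come from a saved NCBI E-utilities search, fetched in batches via `efetch` and written out as a markdown list. The helper names (`batch_urls`, `parse_records`, `to_markdown`), the batch size of 200, and the choice to keep only PMID and title are all assumptions for illustration. The network call itself is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def batch_urls(total, web_env, query_key, batch_size=200):
    """Build one efetch URL per batch for a search saved on the history server.

    web_env and query_key are the values an earlier esearch call
    (with usehistory=y) would return; here they are just passed in.
    """
    urls = []
    for start in range(0, total, batch_size):
        params = {
            "db": "pubmed",
            "WebEnv": web_env,
            "query_key": query_key,
            "retstart": start,
            "retmax": batch_size,
            "retmode": "xml",
        }
        urls.append(f"{EUTILS}/efetch.fcgi?{urlencode(params)}")
    return urls


def parse_records(xml_text):
    """Pull PMID and title out of an efetch XML response."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID", default="")
        title_el = article.find("MedlineCitation/Article/ArticleTitle")
        # Titles can contain inline markup such as <i>, so join all text nodes.
        title = "".join(title_el.itertext()).strip() if title_el is not None else ""
        records.append({"pmid": pmid, "title": title})
    return records


def to_markdown(records):
    """Render records as a numbered markdown list."""
    lines = [
        f"{i}. **{rec['title']}** (PMID {rec['pmid']})"
        for i, rec in enumerate(records, start=1)
    ]
    return "\n".join(lines) + "\n"
```

With 3605 records and batches of 200, that works out to 19 requests, which fits with the whole thing finishing in minutes rather than hours.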
Yann LeCun was right the entire time. And generative AI might be a dead end.
For the last three years, the entire industry has been obsessed with building bigger LLMs. Trillions of parameters. Billions in compute.
The theory was simple: if you make the model big enough, it will eventually understand how the world works.
Yann LeCun said that was stupid.
He argued that generative AI is fundamentally inefficient.
When an AI predicts the next word, or generates the next pixel, it wastes massive amounts of compute on surface-level details.
It memorizes patterns instead of learning the actual physics of reality.
He proposed a different path: JEPA (Joint-Embedding Predictive Architecture).
Instead of forcing the AI to paint the world pixel by pixel, JEPA forces it to predict abstract concepts. It predicts what happens next in a compressed "thought space."
But for years, JEPA had a fatal flaw.
It suffered from "representation collapse."
Because the AI was allowed to simplify reality, it would cheat. It would simplify everything so much that a dog, a car, and a human all looked identical.
It learned nothing.
To fix it, engineers had to use insanely complex hacks, frozen encoders, and massive compute overheads.
Until today.
Researchers just dropped a paper called "LeWorldModel" (LeWM).
They completely solved the collapse problem.
They replaced the complex engineering hacks with a single, elegant mathematical regularizer.
It forces the AI's internal "thoughts" into a perfect Gaussian distribution.
The AI can no longer cheat. It is forced to understand the physical structure of reality to make its predictions.
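To make the collapse idea concrete, here is a toy sketch. It is not the regularizer from the paper (the thread doesn't give its form). It is a much simpler stand-in that only pushes a batch of embeddings toward zero mean and identity covariance, which is weaker than forcing a full Gaussian. The function name `gaussian_moment_penalty` and the toy batches are made up for illustration.

```python
import random


def gaussian_moment_penalty(batch):
    """Squared distance of batch statistics from zero mean, identity covariance.

    batch is a list of equal-length embedding vectors (lists of floats).
    """
    n = len(batch)
    d = len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(d)]
    penalty = sum(m * m for m in mean)
    for j in range(d):
        for k in range(d):
            cov = sum((row[j] - mean[j]) * (row[k] - mean[k]) for row in batch) / n
            target = 1.0 if j == k else 0.0
            penalty += (cov - target) ** 2
    return penalty


# Collapsed encoder: every input (dog, car, human) maps to the same vector.
collapsed = [[0.5, 0.5, 0.5] for _ in range(256)]

# Healthy encoder (toy stand-in): embeddings spread like a standard Gaussian.
rng = random.Random(0)
spread = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(256)]

# 3.75: 0.75 from the nonzero mean, 3.0 from the missing variance.
print(gaussian_moment_penalty(collapsed))
print(gaussian_moment_penalty(spread))
```

The collapsed batch gets a large penalty because its variance is zero in every dimension, while the spread-out batch scores much lower. That is the basic mechanism: mapping everything to the same point is no longer a cheap solution.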
The results completely rewrite the economics of AI.
LeWM didn't need a massive, centralized supercomputer.
It has just 15 million parameters.
It trains on a single, standard GPU in a few hours.
Yet it plans 48x faster than massive foundation world models. It intrinsically understands physics. It instantly detects impossible events.
We spent billions trying to force massive server farms to memorize the internet.
Now, a tiny model running locally on a single graphics card is actually learning how the real world works.
New on the Engineering Blog:
Building Managed Agents—our hosted service for long-running agents—meant solving an old problem in computing: how to design a system for “programs as yet unthought of.”
Read more: https://t.co/YYaEub2QGV
AI-powered development is a Rorschach test right now, and I think that comes down to three different effects of the same underlying change in the cost of building software.
Underlying cause: the fixed cost of building software and the cost/complexity curve have flattened.
Rude prompts to LLMs consistently lead to better results than polite ones 🤯
The authors found that very polite and polite tones reduced accuracy, while neutral, rude, and very rude tones improved it.
Statistical tests confirmed that the differences were significant, not random, across repeated runs.
The top score reported was 84.8% for very rude prompts and the lowest was 80.8% for very polite.
They compared their results with earlier studies and noted that older models (like GPT-3.5 and Llama-2) behaved differently, but GPT-4-based models like ChatGPT-4o show this clear reversal where harsh tone works better.
----
Paper – arxiv.org/abs/2510.04950
Paper Title: "Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy (short paper)"
"AI isn't replacing radiologists" good article
Expectation: rapid progress in image recognition AI will delete radiology jobs (e.g. as famously predicted by Geoff Hinton now almost a decade ago). Reality: radiology is doing great and is growing.
There are a lot of imo naive predictions out there on the imminent impact of AI on the job market. E.g. a ~year ago, someone who should know better asked me if I thought there would still be any software engineers today. (Spoiler: I think we're going to make it). This is happening too broadly.
The post goes into detail on why it's not that simple, using the example of radiology:
- the benchmarks are nowhere near broad enough to reflect actual, real scenarios.
- the job is a lot more multifaceted than just image recognition.
- deployment realities: regulatory, insurance and liability, diffusion and institutional inertia.
- Jevons paradox: if radiologists are sped up via AI as a tool, a lot more demand shows up.
I will say that radiology was imo not among the best examples to pick on in 2016 - it's too multi-faceted, too high risk, too regulated. When looking for jobs that will change a lot due to AI on shorter time scales, I'd look in other places - jobs that look like repetition of one rote task, each task being relatively independent, closed (not requiring too much context), short (in time), forgiving (the cost of mistake is low), and of course automatable given current (and digital) capability. Even then, I'd expect to see AI adopted as a tool at first, where jobs change and refactor (e.g. more monitoring or supervising than manual doing, etc). Maybe coming up, we'll find a better and broader set of examples of how this is all playing out across the industry.
About 6 months ago, I was also asked to vote on whether we will have fewer or more software engineers in 5 years. Exercise left for the reader.
Full post (the whole The Works in Progress Newsletter is quite good):
https://t.co/ON3GwlI3mi
I find critical thinking and nuance important, but it kinda sucks that people started wielding it as an aesthetic to cover for equivocation
like, nuance can often lead you to strongly held positions, something you'd know if you *actually* engaged in critical thinking https://t.co/KkxLCZKhoX
Why do C programmers always obfuscate their code? Are they trying to save space? Do they have to pay for each letter? Are they using some trial version of GCC that doesn't allow actual words in variable names?