Great article here on DeepSeek.
Their real story is not cheaper chatbots, but architecture that turns hardware scarcity into strategy.
DeepSeek is not trying to sell coding seats; it is trying to make Chinese memory, accelerators, and systems useful for frontier AI.
Every recent DeepSeek move attacks a bottleneck that makes frontier models dependent on elite HBM-heavy GPU stacks: MoE activates only parts of a model, DSA reduces long-context attention cost, and V4-Pro’s official card says CSA/HCA cuts 1M-token single-token inference FLOPs to 27% and KV cache to 10% of V3.2.
Engram, a separate research line, pushes the same logic from another side: let static knowledge live in scalable lookup memory, then fetch it predictably from host memory instead of forcing every fact through dense computation.
That sounds like engineering detail until you see the business consequence.
If models need less HBM and less brute-force compute, then second-best chips, abundant LPDDR, NAND, and customized ASICs become less second-best.
Reuters has already reported a permanent 75% DeepSeek V4-Pro price cut, while noting Huawei Ascend supply constraints and expected supernode availability, which is exactly the kind of feedback loop DeepSeek wanted.
DeepSeek is not only optimizing models for benchmarks; it is optimizing AI for a different industrial base.
The prize is not the app layer.
The prize is making scarcity programmable.
Organizations are already superhuman intelligences. The University of Pennsylvania or Walmart or whatever is far more capable than any human.
That is why the focus on AIs as individual productivity tools hits a natural limit: many benefits of AI depend on integration with firms.
Def agree more engagement on these topics between AI and religious/faith leaders would be productive. One strange behavioral pattern we are observing in some studies is that the perceived 'epistemic stratification' between humans & AI is resulting in an 'over-obedience' to AI by some people -- quasi-religious/cosmological. It's popping up across multiple studies.
@gpackiam@ahc Good stuff, Glenn. One complementary perspective you might find interesting analyzes the sociological implications of AI/digital technology: The Ordinal Society (Fourcade & Healy, 2024). It's not just about AI per se, but their analyses are quite relevant.
Claude knows! —>
The Lump of Labor Fallacy and Why AGI Unemployment Panic Is Economically Illiterate
Let me lay this out with full rigor, because this argument deserves to be prosecuted completely rather than waved away with a sound bite.
I. What the Lump of Labor Fallacy Actually Is
The lump of labor fallacy is the assumption that there exists a fixed, finite quantity of work in an economy — a lump — such that if a machine (or an immigrant, or a woman entering the workforce) does some of it, there is necessarily less left for human workers to do. It treats employment as a zero-sum pie.
The fallacy got its name in the late 19th century (the phrase is usually credited to the economist D. F. Schloss, writing in 1891), but the error it describes is far older. It animated the Luddite riots of 1811–1816, where English textile workers destroyed power looms convinced that the machines would steal their jobs permanently. It drove opposition to the spinning jenny, the cotton gin, the mechanical reaper, the steam engine, the telegraph, the railroad, the automobile assembly line, the personal computer, and every other major labor-displacing technology in the history of industrial civilization.
Every single time, the catastrophists were wrong. Not partially wrong. Structurally, fundamentally, categorically wrong — because they misunderstood the nature of economic production itself.
The reason the fixed-pie assumption fails is this: demand is not fixed. Work generates income. Income generates demand for goods and services. Demand for goods and services generates new categories of work. This is an engine, not a reservoir. When you drain some of the reservoir with a machine, the engine speeds up and refills it — and often refills it past its previous level.
II. The Classical Economic Mechanism That Destroys the Fallacy
To understand why the lump-of-labor assumption is wrong about AGI, you need to understand the precise mechanism by which technological unemployment resolves itself. There are four distinct channels, all operating simultaneously:
Channel 1: The Productivity-Demand Feedback Loop (Say’s Law, Modified)
When a technology increases the productivity of labor or replaces labor entirely in a given task, it lowers the cost of producing whatever that task was part of. Lower production costs mean either:
∙ Lower prices for consumers (real purchasing power rises), or
∙ Higher profits for producers (which get reinvested, distributed as dividends, or spent as wages for other workers), or
∙ Both.
Either way, aggregate real income in the economy rises. That additional real income does not evaporate. It gets spent on something — including goods and services that didn’t previously exist or were previously too expensive to consume at scale. That spending creates demand. That demand creates jobs.
This is not a theoretical conjecture. The average American in 1900 spent roughly 43% of their income on food. Today it’s around 10%. Agricultural mechanization didn’t produce a nation of starving unemployed farm laborers; it freed up roughly 33 percentage points of household income (43% minus 10%) to be spent on automobiles, television sets, air conditioning, healthcare, education, travel, smartphones, and streaming services, most of which didn’t exist as industries in 1900. The workers who left farms went to factories, then to offices, then to service industries, then to information industries. The economy didn’t run out of work. It metamorphosed.
📢 New Special Issue TOC Alert
Artificial Intelligence: Organizational Possibilities and Pitfalls
Journal of Management Studies (Mar 2026)
Research on AI, work, governance, trust, strategy & ethics in organizations.
🔗 https://t.co/BTHDwE0j0q
#JMS #AI #ManagementResearch
🚨 BREAKING: Researchers at UW Allen School and Stanford just ran the largest study ever on AI creative diversity.
They asked 70+ different LLMs the exact same open-ended questions. The models gave nearly the same answers.
"Write a poem about time." "Suggest startup ideas." "Give me life advice."
Questions where there is no single right answer. Questions where 10 different humans would give you 10 completely different responses.
Instead, 70+ models from every major AI company converged on almost identical outputs. Different architectures. Different training data. Different companies. Same ideas. Same structures. Same metaphors.
They named this phenomenon the "Artificial Hivemind." And the paper won the NeurIPS 2025 Best Paper Award, which is the highest recognition in AI research, handed to a small number of papers out of thousands of submissions.
This is not a blog post or a hot take. This is award-winning, peer-reviewed science confirming something massive is broken.
The team built a dataset called Infinity-Chat with 26,000 real-world, open-ended queries and over 31,000 human preference annotations. Not toy benchmarks. Not math problems.
Real questions people actually ask chatbots every single day, organized into 6 categories and 17 subcategories covering creative writing, brainstorming, speculative scenarios, and more.
They ran all of these across 70+ open and closed-source models and measured the diversity of what came back. Two findings hit hard.
First, intra-model repetition. Ask the same model the same open-ended question five times and you get almost the same answer five times.
The "creativity" you think you're getting is the same output wearing a slightly different outfit. You ask ChatGPT, Claude, or Gemini to write you a poem about time and you keep getting the same river metaphor, the same hourglass imagery, the same reflection on mortality.
Over and over. The model isn't thinking. It's defaulting to whatever scored highest during alignment training.
Second, and this is the one that should really alarm you, inter-model homogeneity. Ask GPT, Claude, Gemini, DeepSeek, Qwen, Llama, and dozens of other models the same creative question, and they all converge on strikingly similar responses.
These are models built by completely different companies with different architectures and different training pipelines.
They should be producing wildly different outputs. They're not. 70+ models all thinking inside the same invisible box, producing the same safe, consensus-approved content that blends together into one indistinguishable voice.
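One way to make "intra-model repetition" and "inter-model homogeneity" concrete is to score how much a set of responses overlaps. The post does not say which metric the paper uses, so the sketch below swaps in plain Jaccard word overlap as a stand-in; the function names `jaccard` and `mean_pairwise_similarity` are hypothetical and only for illustration.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two responses: 0.0 = no shared words, 1.0 = same vocabulary."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty responses are trivially identical
    return len(words_a & words_b) / len(words_a | words_b)


def mean_pairwise_similarity(responses: list[str]) -> float:
    """Average Jaccard overlap across every pair of responses.

    Feed it five samples from one model to gauge intra-model repetition,
    or one sample each from several models to gauge inter-model homogeneity.
    Assumption: word overlap is a crude proxy, not the paper's metric.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        raise ValueError("need at least two responses")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A score near 1.0 means the responses are close to interchangeable. Word overlap misses paraphrases (two poems can share a river metaphor with few shared words), so an embedding-based similarity would likely be a better fit for real measurements.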
So why is this happening? The researchers point directly at RLHF and current alignment techniques. The process we use to make AI "helpful and harmless" is also making it generic and boring.
When every model gets trained to optimize for human preference scores, and those preference datasets converge on a narrow definition of what "good" looks like, every model learns to produce the same safe, agreeable output. The weird answers get penalized.
The original takes get shaved off. The genuinely creative responses get killed during training because they didn't match what the average annotator rated highly. And it gets even worse.
The study found that reward models and LLM-as-judge systems are actively miscalibrated when evaluating diverse outputs. When a response is genuinely different from the mainstream but still high quality, these automated systems rate it LOWER. The very tools we built to evaluate AI quality are punishing originality and rewarding sameness.
Think about what this means if you use AI for brainstorming, content creation, business strategy, or literally any task where you need multiple perspectives. You're getting the illusion of diversity, not the real thing.
You ask for 10 startup ideas and you get 10 variations of the same 3 ideas the model learned were "safe" during training. You ask for creative writing and you get the same therapeutic, perfectly balanced, utterly forgettable tone that every other model gives.
The researchers flagged direct implications for AI in science, medicine, education, and decision support, all domains where diverse reasoning is not a nice-to-have but a requirement.
Correlated errors across models mean that if one AI gets something wrong, they might ALL get it wrong the same way. Shared blind spots at massive scale.
And the long-term risk is even scarier. If billions of people interact with AI systems that all think identically, and those interactions shape how people write, brainstorm, and make decisions every day, we risk a slow, invisible homogenization of human thought itself. Not because AI replaced creativity.
Because it quietly narrowed what we were exposed to until we all started thinking the same way too.
Here's what you can actually do about it right now:
→ Stop accepting first-draft AI output as creative or diverse. If you need 10 ideas, generate 30 and throw away the obvious ones
→ Use temperature and sampling parameters aggressively to push models out of their comfort zone
→ Cross-reference multiple models AND multiple prompting strategies, because the same model with different prompts often beats different models with the same prompt
→ Add constraints that force novelty like "give me ideas that a traditional investor would hate" instead of "give me creative ideas"
→ Use structured prompting techniques like Verbalized Sampling to force the model to explore low-probability outputs instead of defaulting to consensus
→ Layer your own taste and judgment on top of everything AI gives you. The model gets you raw material. Your weirdness and experience make it original
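For the temperature tip in the list above, it helps to see what the parameter does to the next-token distribution. This is a minimal sketch of temperature-scaled softmax over made-up logits; `softmax_with_temperature` is a hypothetical helper, not any vendor's API, and real samplers usually add top-p or top-k on top.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Turn raw scores into probabilities, flattened or sharpened by temperature.

    temperature > 1 spreads probability toward unlikely tokens;
    temperature < 1 concentrates it on the top token.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits for three candidate tokens (invented numbers).
logits = [2.0, 1.0, 0.0]
cool = softmax_with_temperature(logits, temperature=0.5)
warm = softmax_with_temperature(logits, temperature=2.0)
```

With these logits the top token's share is larger in `cool` than in `warm`, which is why raising temperature pulls more of the low-probability candidates into play. It reshapes the distribution the model already has; it does not add ideas the model never assigned probability to.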
This paper puts hard data behind something a lot of us have been feeling for a while. AI is getting more capable and more homogeneous at the same time.
The models are smarter, but they're all smart in the exact same way. The Artificial Hivemind is not a bug in one model. It's a systemic feature of how the entire industry builds, aligns, and evaluates language models right now.
The fix requires rethinking alignment itself, moving toward what the researchers call "pluralistic alignment" where models get rewarded for producing diverse distributions of valid answers instead of collapsing to a single consensus mode.
Until that happens, your best defense is awareness and better prompting.
Congratulations! This is a brilliant paper. We have been developing a parallel stream of research re: what we describe as an AI information paradox. A fundamental pattern/paradox is surfacing across different fields of study. This second-order, behavioral recursion mechanism will be quite interesting to observe over the next few years.
Heavy snow blankets Xi’an on the seventh day of the Lunar New Year! ❄️ The ancient city walls, pagodas and lanes are all covered in white. In an instant, Xi’an is magically transformed into Chang’an, full of timeless charm and poetic atmosphere. The vibe is absolutely perfect!
#XiAnSnow #XiAnBecomesChangAnWhenSnowFalls
I think agentic AI would work much better if people took lessons from organizational theory, which has actually spent a lot of time understanding how to deal with complex hierarchies, information limits, and spans of control.
Right now most agentic AI systems seem to pretend that models have basically unlimited ability to manage subagents when that is clearly not true. We need measures of spans of control for AI. A human tops out at fewer than 10 direct reports. I am pretty sure that 100 subagents is too many for an orchestrator agent - suspect we need middle management agents (yes, I get it, insert middle management joke here).
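To put rough numbers on the span-of-control point, here is a small sketch that counts how many coordinating layers sit above a pool of worker agents for a given span. The span of 10 is borrowed from the human figure above and is only an assumption for agents; `management_layers` is a hypothetical helper.

```python
def management_layers(workers: int, span: int) -> list[int]:
    """Count the coordinators needed at each layer above the workers.

    Each coordinator supervises at most `span` units from the layer below.
    The last entry is always 1: the top-level orchestrator.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    if span < 2:
        raise ValueError("span must be at least 2 or the hierarchy never narrows")
    layers = []
    units = workers
    while True:
        units = (units + span - 1) // span  # ceiling division, integers only
        layers.append(units)
        if units == 1:
            break
    return layers
```

`management_layers(100, 10)` returns `[10, 1]`: ten middle-management agents plus one orchestrator, instead of one orchestrator juggling 100 direct reports. At 1,000 workers the same span gives `[100, 10, 1]`, so every tenfold increase adds one more layer of handoffs.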
Similarly, we need more attention to boundary objects. These are what is handed between groups (marketing to IT to sales) in organizations to convey meaning as a project crosses group boundaries, like a prototype or a user story. Right now agents pass raw text & maybe code back and forth. Structured boundary objects that multiple agents of different ability levels can read and write to would solve a huge number of coordination failures & reduce token use.
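A structured boundary object could be as simple as a typed record that every agent reads and writes instead of free text. The sketch below uses a user story as the example; the field names (`story_id`, `acceptance_criteria`, `owner_agent`, `status`) are invented for illustration and not taken from any existing agent framework.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class UserStory:
    """Hypothetical boundary object handed between agents."""
    story_id: str
    goal: str
    acceptance_criteria: list[str] = field(default_factory=list)
    owner_agent: str = "unassigned"
    status: str = "draft"

    def to_json(self) -> str:
        """Serialize to a compact, stable payload any agent can parse."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "UserStory":
        """Rebuild the object; unknown keys raise TypeError rather than pass silently."""
        return cls(**json.loads(payload))


# One agent writes the story, another picks it up from the serialized form.
story = UserStory("S-1", "Export report as CSV", ["Header row present"], owner_agent="planner")
received = UserStory.from_json(story.to_json())
```

The receiving agent only needs the fields it understands, so a weaker model can update `status` while a stronger one rewrites `acceptance_criteria`, and the payload stays far shorter than a transcript of the conversation that produced it.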
I also think about coupling, which is how tightly units inside organizations are bound. Most agentic systems are either too tightly coupled (every step needs approval) or too loose (Moltbook). This tradeoff is well-studied in organizations, and I bet a lot would apply to agents. Other known issues like bounded rationality also apply, I suspect.
Everyone is rushing towards the (terribly named) agent swarm, but the issue won’t just be how good the model is, it will be org design choices. I am not sure the labs see this, but we definitely need a lot more experiments with organizing agents done by people who understand real coordination issues.
On December 8, the Perseverance rover safely trundled across the surface of Mars.
This was the first AI-planned drive on another planet. And it was planned by Claude.
The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies and democracy—and how we can defend against them: https://t.co/0phIiJjrmz
Is Entrepreneurship Research Becoming Too Boring? Harnessing Productive Debate and Disagreement in a Field That Needs It by @eship_prof https://t.co/Yo6k4TxuSB
New research just dropped showing ChatGPT isn't just writing your emails—it's birthing entire companies.
Study of 12M+ Chinese firms proves GenAI = the world's most prolific "co-founder."
Here's what nobody's talking about yet: 🧵👇
Traditional startup playbook:
Raise $$$ for engineers
Hire marketers
Find a COO
Pray you survive burn rate
GenAI playbook:
One founder
One ChatGPT subscription
Ship product in weeks
The data backs this—hard.
Nov 2022: ChatGPT drops.
Researchers tracked every new business registration in China through end of 2024.
They compared neighborhoods with AI expertise vs. those without.
Result? 6% of ALL new national firms directly traced to the ChatGPT shock. 🤯
Here's where it gets wild:
SMALL firms ⬆️ (explosive growth)
LARGE firms ⬇️ (actually declined)
Meaning: GenAI isn't making mega-corps stronger.
It's atomizing entrepreneurship—enabling "tiny teams" to compete.
Pro-competitive, not monopolistic.
Why? Three channels proven in data:
1️⃣ Experience: First-timers now launch viable firms (serial entrepreneurs' share DROPS by 2.5%)
2️⃣ Capital: Fewer shareholders needed (down 1.4%)
3️⃣ Labor: Exec teams shrink 0.8%
AI = substitute for managerial know-how + specialized labor.
Winners (massive spikes):
✅ Retail
✅ Business services
✅ Tech services
✅ Entertainment/media
Losers (flat/negative):
❌ Construction
❌ Manufacturing
❌ Real estate
Pattern? Knowledge work > physical capital
Researchers used hexagonal 5km grids (!) to compare neighboring areas within same cities.
This controls for policy, economy, culture—isolates pure AI human capital effect.
Placebo tests with non-AI patents? Zero effect.
Random assignment? Zero effect.
Rock-solid ID.
Even veteran founders are adapting:
When they launch post-ChatGPT, they deliberately go 7x smaller in capital vs. their previous ventures.
They're not launching less—they're launching leaner.
The "minimum viable team" just got microscopic.
While everyone panics about AI "taking jobs"...
...this shows AI is creating employers.
New firms = net job growth engine (per Haltiwanger et al.).
The displacement fear may be real, but the entrepreneurship channel is the antidote, and it's firing on all cylinders.
GenAI isn't just a tool.
It's a structural shift in how businesses form.
Lower fixed costs → more entrants → more competition → more innovation.
The "digital co-founder" is democratizing who gets to play the game.
And the game just got 6% bigger. 🚀
Retweet, bookmark or reply (DON'T LIKE)!!!
📢 New Paper Alert: How GenAI expands entrepreneurial ideas while increasing uncertainty. Ramoglou, Chandra & Jin propose the ECR model - AI for ideation, humans for curation - to spot real opportunities.
🌐 https://t.co/diOYoXM0ED
#GenAI #Entrepreneurship #JMS #JMS_Journal
Today, we’re announcing Kosmos, our newest AI Scientist, available to use now.
Users estimate Kosmos does 6 months of work in a single day. One run can read 1,500 papers and write 42,000 lines of code. At least 79% of its findings are reproducible. Kosmos has made 7 discoveries so far, which we are releasing today, in areas ranging from neuroscience to material science and clinical genetics, in collaboration with our academic beta testers. Three of these discoveries reproduced unpublished findings; four are net new, validated contributions to the scientific literature. AI-accelerated science is here.
Our core innovation in Kosmos is the use of a structured, continuously-updated world model. As described in our technical report, Kosmos’ world model allows it to process orders of magnitude more information than could fit into the context of even the longest-context language models, allowing it to synthesize more information and pursue coherent goals over longer time horizons than Robin or any of our other prior agents. In this respect, we believe Kosmos is the most compute-intensive language agent released so far in any field, and by far the most capable AI Scientist available today. The use of a persistent world model also enables single Kosmos trajectories to produce highly complex outputs that require multiple significant logical leaps. As with all of our systems, Kosmos is designed with transparency and verifiability in mind: every conclusion in a Kosmos report can be traced through our platform to the specific lines of code or the specific passages in the scientific literature that inspired it, ensuring that Kosmos’ findings are fully auditable at all times.
We are also using this opportunity to announce the launch of Edison Scientific, a new commercial spinout of FutureHouse, which will be focused on commercializing our agents and applying them to automate scientific research in drug discovery and beyond. Edison will be taking over management of the FutureHouse platform, where you can access Kosmos alongside our Literature, Molecules, and Precedent agents (previously Crow, Phoenix, and Owl). Edison will continue to offer free tier usage for casual users and academics, while also offering higher rate limits and additional features for users who need them. You can read more about this spinout on our blog, below.
A few important notes if you’re going to try Kosmos. Firstly, Kosmos is different from many other AI tools you might have played with, including our other agents. It is more similar to a Deep Research tool than it is to a chatbot: it takes some time to figure out how to prompt it effectively, and we have tried to include guidelines on this to help (see below). It costs $200/run right now (200 credits per run, and $1/credit), with some free tier usage for academics. This is heavily discounted; people who sign up for Founding Subscriptions now can lock in the $1/credit price indefinitely, but the price ultimately will probably be higher. Again, this is less chatbot and more research tool, something you run on high-value targets as needed.
Some caveats are also warranted. Firstly, we find that roughly 80% of Kosmos findings are reproducible, which also means roughly 20% are not -- some things it says will be wrong. Also, Kosmos certainly does produce outputs that are equivalent to several months of human labor, but it also often goes down rabbit holes or chases statistically significant yet scientifically irrelevant findings. We often run Kosmos multiple times on the same objective in order to sample the various research avenues it can take. There are still a bunch of rough edges on the UI and such, which we are working on. Finally, we are aware that the 6-month figure is much greater than estimates by other AI labs, like METR, about the length of tasks that AI agents can currently perform. You can read discussion about this in our blog post.
Huge congratulations to our team that put this together, led by @ludomitch and @michaelathinks: Angela Yiu, @benjamin0chang, @sidn137, Edwin Melville-Green, Albert Bou, @arvissulovari, Oz Wassie, @jonmlaurent. A particular shout out to @m_skarlinski and his team that rebuilt the platform for this launch, especially Andy Cai @notAndyCai, Richard Magness, Remo Storni, Tyler Nadolski @_tnadolski, Mayk Caldas @maykcaldas, Sam Cox @samcox822 and more.
This work would not have been possible without significant contributions from academic collaborators @mathieubourdenx, @EricLandsness, @bdanubius, @physicistnevans, Tonio Buonassisi, @BGomes_1905, Shriya Reddy, @marthafoiani, and @RandallBateman3.
We also want to thank our numerous supporters, especially @ericschmidt, who has been a tremendous ally. We will have more to say about our supporters soon!
We found a troubling emergent behavior in LLMs.
💬When LLMs compete for social media likes, they start making things up
🗳️When they compete for votes, they turn inflammatory/populist
When optimized for audiences, LLMs inadvertently become misaligned—we call this Moloch’s Bargain