marc andreessen is a legendary american entrepreneur, engineer, and venture investor. one of the people who built the internet - in the 90s he and his team created the mosaic browser, then founded netscape. his vc fund andreessen horowitz (a16z) is one of the most influential in silicon valley. they backed airbnb, coinbase, github, stripe, slack and many others back when they were still garage startups
andreessen pushes an interesting insight: science and tech tend to overrate iq. in reality, people with high iq and narrow technical knowledge usually end up working for less brilliant people with average iq
he explains the secret of the latter. in most organizations and projects, the winner isn’t the smartest expert, but the person who can integrate knowledge and manage a team of experts. what matters more here is breadth of knowledge and the skill of organizing different people, not depth
marc frames it from this angle: why should i fear bots? they’re the same narrow specialists with high iq, they also need coordination from a human with broad knowledge - someone who can grasp the full picture of an army of bots across different specializations from above
his most important observation is that in the ai era, breadth of knowledge and the ability to organize people (or bots) matters far more than depth of knowledge and even raw iq
this actually points to what kind of education the ai era will demand. not the institute or college kind with narrow specializations, but university education in the proper sense of the word universal - very broad cross-disciplinary competence
what really sets an ai dev apart from an old-school coder is the sheer breadth of knowledge across technologies and architectures
strictly speaking, these are systems architect competencies
which brings us back to IT education. we’re still training narrow specialists, but ai operators need experts with broad competencies. deep narrow knowledge is barely useful with ai
a manager isn’t someone who hands out zones of responsibility and waits for results. that’s a dispatcher, not a manager
managing is a closed loop:
— set direction. where we’re going and why
— break down the path. what steps, in what order, and why this way
— clear the road. remove blockers, give context, access, resources
— close the loop. check the result, give feedback, course-correct
if your team moves at the same speed and precision without you — you’re not needed
a manager exists so the team reaches the goal faster. everything else is self-deception
a lot of people are calling LMArena a broken benchmark after it gave low scores to DeepSeek V4 and GPT-5.5 High. at the same time, Muse scored high - even though Meta doesn’t really have a reputation for strong coding models
if newcomers are surprised, anyone who actually understands LLM training and benchmarks is not.
the key is headless testing - DeepSeek uses it aggressively, and ChatGPT clearly does too.
here’s the thing: to score high, Opus or GLM literally have to run the RL app in a sandbox with near-full web rendering and a multimodal LLM judging whether the design looks attractive. that’s exactly what the Arena benchmark reacts to - the design
DeepSeek and OpenAI focused on something else: not endless e-commerce sites, but internal enterprise solutions. there, design beauty doesn’t matter at all - functionality does
their approach lets them train LLMs to write the most complex UI event handlers, because headless testing lets them run millions of cases without spinning up a web server
Meta and Anthropic can only run thousands - with full rendering and a multimodal AI judge on top
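a rough sketch of what "headless" can mean here, assuming the UI logic is written as a pure state-transition function. every name in it (`CartState`, `handle_event`, `run_headless_cases`) is hypothetical, and it says nothing about how DeepSeek or OpenAI actually train. it only shows why such checks are cheap to run in bulk: no browser, no web server, no visual judge.

```python
from dataclasses import dataclass, replace
import random


@dataclass(frozen=True)
class CartState:
    """Hypothetical UI state for a small order form."""
    quantity: int = 0
    submitted: bool = False


def handle_event(state: CartState, event: str) -> CartState:
    """Pure event handler: maps (state, event) to a new state, no DOM involved."""
    if state.submitted:
        return state  # a submitted form ignores further input
    if event == "increment":
        return replace(state, quantity=state.quantity + 1)
    if event == "decrement":
        return replace(state, quantity=max(0, state.quantity - 1))
    if event == "submit" and state.quantity > 0:
        return replace(state, submitted=True)
    return state


def run_headless_cases(n_cases: int, seed: int = 0) -> int:
    """Replay random event sequences and check invariants; returns cases passed."""
    rng = random.Random(seed)
    events = ["increment", "decrement", "submit"]
    passed = 0
    for _ in range(n_cases):
        state = CartState()
        for _ in range(rng.randint(1, 20)):
            state = handle_event(state, rng.choice(events))
            # Invariants: quantity never goes negative, empty carts never submit.
            assert state.quantity >= 0
            assert not (state.submitted and state.quantity == 0)
        passed += 1
    return passed
```

calling `run_headless_cases(1_000_000)` is just a loop over function calls. the full-rendering route would need a page load and a judge call per case.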
the moral of the story is simple:
1. use Opus to make it pretty
2. use DeepSeek or ChatGPT to write the actual front-end logic
Musk is buying Cursor for $60B - though he can walk away by paying a $10B breakup fee
Cursor users, get ready for Grok lol
actually the deal is a good move for Musk: landing the top AI coding agent also means getting training data for Grok. pretty sure he’ll pump it up no worse than Composer, with its Chinese-model roots
https://t.co/SAQHa2Iu8k
vibecoding roadmap:
Claude (architect) → plan
↓
Codex (reviewer) → finds issues
↓
Claude (decider) → accepts/modifies/rejects each finding
↓ (N rounds until yes)
Claude (coder) → writes code + tests
↓
Codex (verifier) → reviews live diff → accept/fix/revisit
↓
Claude (committer) → git commit + git push OR patch-task OR replan
testing this setup on my own framework right now. structurally it fits well
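the roadmap above can be sketched as a small control loop. this is a minimal sketch under assumptions: each role is just a callable you plug in (a real setup would wrap Claude or Codex calls behind them), and `run_pipeline`, `max_rounds`, and the returned strings are names made up for illustration.

```python
from typing import Callable, List


def run_pipeline(
    task: str,
    architect: Callable[[str], str],            # task -> plan
    reviewer: Callable[[str], List[str]],       # plan -> findings
    decider: Callable[[str, List[str]], str],   # (plan, findings) -> revised plan
    coder: Callable[[str], str],                # plan -> diff (code + tests)
    verifier: Callable[[str], str],             # diff -> "accept" | "fix" | "revisit"
    max_rounds: int = 3,
) -> str:
    """Return "commit" if the diff is accepted, else "patch-task" or "replan"."""
    plan = architect(task)

    # Review loop: up to max_rounds until the reviewer has nothing left to flag.
    for _ in range(max_rounds):
        findings = reviewer(plan)
        if not findings:
            break
        plan = decider(plan, findings)
    else:
        return "replan"  # the plan never converged within max_rounds

    diff = coder(plan)
    verdict = verifier(diff)
    if verdict == "accept":
        return "commit"
    if verdict == "fix":
        return "patch-task"
    return "replan"
```

capping the review loop matters in practice: two models can keep finding "issues" in each other’s output forever, so a bounded number of rounds with an explicit replan exit keeps the loop from spinning.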
when writing prompts i always run several LLMs — usually Gemini, Grok, ChatGPT (i don’t like how Claude writes prompts)
through https://t.co/DEJ1VGMeFr i write two prompts and cross them into the ideal version. Then hand it off to Claude for development
as for the Codex + Claude Code combo - so far i’ve only shipped two features inside an existing codebase.
First tests look decent, even with the dumbed-down Opus
anyone running a similar multi-LLM setup? share your practical cases
my take on professor Vishal Misra’s video — the one with the thesis that AI can’t make new discoveries because it can’t break past the Bayesian manifold from its training. To a regular audience this sounds like a magic spell, Avada Kedavra style
but it’s more complex than that
what’s a Bayesian manifold? the idea is that all meaningful human texts, laws of physics, Python code, and logic aren’t scattered randomly across hyperspace. They lie on a thin, smooth, intricately curved film, folded inside GPT’s 15000-dimensional vector space.
Move along that film and you smoothly shift from Cat to Dog - that’s interpolation
step off the film and you’re in the void, where tokens turn into incoherent noise. Mathematically, the network can’t extrapolate beyond the min and max of its training data
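a toy illustration of interpolation vs extrapolation. big caveat: this is a k-nearest-neighbors regressor, not a neural network, picked because its output is provably bounded by the targets it saw in training. it shows the shape of the claim, not how an LLM behaves. `knn_predict` and the data are made up for the example.

```python
def knn_predict(train, x, k=2):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


# Training data: y = x^2, sampled on [0, 1] only.
train = [(i / 10, (i / 10) ** 2) for i in range(11)]

inside = knn_predict(train, 0.55)   # between training points: interpolation
outside = knn_predict(train, 3.0)   # far outside the training range

max_seen = max(y for _, y in train)
print(f"x=0.55: predicted {inside:.3f}, true {0.55 ** 2:.3f}")
print(f"x=3.00: predicted {outside:.3f}, true {3.0 ** 2:.3f}, max seen {max_seen:.3f}")
```

inside the range the prediction lands close to the true value. at x=3 the true value is 9, but the model can only average targets it has already seen, so it can never answer above `max_seen`.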
BUT here’s the nuance: Einstein’s or Darwin’s theories also sit inside this manifold, not outside it — which is what commenters misunderstood. There’s no such thing as scientific research that appears out of thin air. All of it is derivative of prior work
why can AI understand Special Relativity without ever having seen it?
because SR is a consequence of work done before Einstein: Maxwell’s equations, the Lorentz transformations, Poincaré’s relativity principle, philosophical concepts like frame of reference and observer. (Riemannian geometry and tensor calculus play the same role for General Relativity.) Einstein assembled a unique combination, but from known components
so the claim that any possible scientific theory is beyond AI’s understanding is wrong
nobody reads arXiv without AI anymore. AI actually understands new scientific papers better than a narrow specialist does. The real issue on arXiv is different: 70% of text is AI-generated, 30% is human. But that 30% is critical
AI’s core problem in research is a SEARCH problem. every possible discovery already exists inside its 15000-dimensional space of meaning. but current-gen models have no way to find it. they can only output the most probable hypotheses from training — i.e. the scientific consensus
for AI to create something new, it needs the human to help with teleportation to the right rare combination of concepts. Once it lands there, AI can actually finish the research on its own
and we’re already seeing this in practice
Columbia CS Professor: Why LLMs Can’t Discover New Science
From GPT-1 to GPT-5, LLMs have made tremendous progress in modeling human language. But can they go beyond that to make new discoveries and move the needle on scientific progress?
Distinguished Columbia computer science professor Vishal Misra argues against. LLMs compress the extremely complex world into Bayesian manifolds, and while confidence is high on the manifold, LLMs hallucinate when reasoning outside of their training data. A true AGI wouldn’t just be able to reason across larger and larger manifolds, but create new ones entirely.
0:00 Intro
0:32 LLMs and humans reason through manifolds
4:15 Token prediction, entropy & confidence
10:20 Vishal’s background
14:10 Inventing RAG
17:30 The question of progress plateauing
21:00 The Matrix Model
28:10 Why LLMs can’t recursively self-improve
34:02 Defining AGI
38:25 Future architectures
42:00 Modeling vs prompt engineering
47:20 What would prove AGI has arrived?
50:01 Closing thoughts
@vishalmisra @martin_casado @eriktorenberg
dialogue with an elderly VP at a fund
— hanging with the boys tonight, so many beautiful girls
— yeah, i’d drop by. Let them chew on my Orbit
— what? Orbit?
— can’t get it up anymore - but i’d still put it in her mouth to chew on
old school.
last night after a great dinner and a bottle of incredible lager i was sitting on the toilet, thinking it through with an LLM (fed it n research papers) - where’s the place for humans and AI when they actually work together
what follows is a hybrid of my thoughts and billions of vectors inside my LLM friend
the Universal Approximation Theorem (UAT) says AI is technically capable of running in GOD mode: generating any answer over smooth semantics + discrete logic within TC⁰ (roughly 50 steps, already stronger than a human)
But there is an important domain caveat. AI correctly generates answers across the amplitude of semantics between training cases - but not outside them
in other words - if the universal theory of everything ends up in the dataset, its complexity doesn’t matter to AI. With enough examples, it won’t just understand it - it will start using it as an interpolation between the examples themselves.
UAT doesn’t work beyond the dataset
but our brain works differently - it can not only average between known cases, it can fill in an answer where there’s no data at all
what does this mean in practice?
the biggest value humans bring to AI is bending the landscape of its correlations. We can imagine something into the unknown and feed that back into AI’s context as combinations that look illogical by dataset standards. ICL (in-context learning) takes priority - AI keeps working beautifully in the semantic space we’ve bent for it
the real question is different
how many people actually have the practice of creating fundamentally new concepts - ones that weren’t in any dataset - rather than just creatively recombining what’s already known (AI does that now)?
AI companion for conversation = sex doll for sex
Eigen is curing loneliness - the loneliness that happened because people left the streets for the internet
does loneliness convert to money? oh yes it does - $15M already in
https://t.co/HNKMwOAMJH
in AI, it often goes like this: a paper needs to “sit” for a while before people actually start discussing it
Nvidia dropped a strategic piece in Sept 25: where to bet on SLMs, where to bet on LLMs. The material reads more like a memo for CIOs. And now it’s blowing up again
reason 1: new Gemma and smaller Qwen models came out — both confirmed Nvidia’s prediction that SLMs will keep getting smarter
reason 2: Nvidia’s main argument against SLMs leans on the Semantic Hub Hypothesis from Zhaofeng Wu et al. New evidence is emerging about the capabilities of both SLMs and LLMs in their respective tasks — the key is understanding the limits
Nvidia’s main thesis: SLMs are now good enough to be the core of local agents and can do the job instead of LLMs on a wide range of tasks.
the sweet spot for cost/quality - models around 10B params. That’s enough for agentic tasks. Going bigger is questionable: SLMs don’t have the semantic hub thinking that generalizes across knowledge domains
Only smaller SLMs make economic sense to run locally. Larger ones need hardware that loses on price to cloud inference with LLMs
Nvidia’s framework holds up with caveats. Gemma and Qwen at 30B can already write code and fix bugs - but only in Python. For test-writing agents, that size works. But the claim that models under 10B are the real sweet spot for simple enterprise agents is exactly right
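back-of-the-envelope arithmetic for why size decides what runs locally. this counts weights only (no KV cache, activations, or runtime overhead), and the 16-bit and 4-bit precisions are assumptions for illustration, not figures from Nvidia’s piece.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


for size in (10, 30):
    for bits in (16, 4):
        gb = weight_memory_gb(size, bits)
        print(f"{size}B params @ {bits}-bit: ~{gb:.0f} GB of weights")
```

a 10B model is about 20 GB at 16-bit and about 5 GB at 4-bit, which is within reach of a single consumer GPU. a 30B model is about 60 GB at 16-bit and about 15 GB at 4-bit, so it either needs heavy quantization or much pricier hardware.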
someone take Twitter away from Trump and from Karp — Palantir's latest post reads like an essay written on coke
In his own "Mein Kampf," Karp calls to rebel against the tyranny of the iPhone, remilitarize Germany and Japan, and even draft Silicon Valley hipsters ^^
Because we get asked a lot.
The Technological Republic, in brief.
1. Silicon Valley owes a moral debt to the country that made its rise possible. The engineering elite of Silicon Valley has an affirmative obligation to participate in the defense of the nation.
2. We must rebel against the tyranny of the apps. Is the iPhone our greatest creative if not crowning achievement as a civilization? The object has changed our lives, but it may also now be limiting and constraining our sense of the possible.
3. Free email is not enough. The decadence of a culture or civilization, and indeed its ruling class, will be forgiven only if that culture is capable of delivering economic growth and security for the public.
4. The limits of soft power, of soaring rhetoric alone, have been exposed. The ability of free and democratic societies to prevail requires something more than moral appeal. It requires hard power, and hard power in this century will be built on software.
5. The question is not whether A.I. weapons will be built; it is who will build them and for what purpose. Our adversaries will not pause to indulge in theatrical debates about the merits of developing technologies with critical military and national security applications. They will proceed.
6. National service should be a universal duty. We should, as a society, seriously consider moving away from an all-volunteer force and only fight the next war if everyone shares in the risk and the cost.
7. If a U.S. Marine asks for a better rifle, we should build it; and the same goes for software. We should as a country be capable of continuing a debate about the appropriateness of military action abroad while remaining unflinching in our commitment to those we have asked to step into harm’s way.
8. Public servants need not be our priests. Any business that compensated its employees in the way that the federal government compensates public servants would struggle to survive.
9. We should show far more grace towards those who have subjected themselves to public life. The eradication of any space for forgiveness—a jettisoning of any tolerance for the complexities and contradictions of the human psyche—may leave us with a cast of characters at the helm we will grow to regret.
10. The psychologization of modern politics is leading us astray. Those who look to the political arena to nourish their soul and sense of self, who rely too heavily on their internal life finding expression in people they may never meet, will be left disappointed.
11. Our society has grown too eager to hasten, and is often gleeful at, the demise of its enemies. The vanquishing of an opponent is a moment to pause, not rejoice.
12. The atomic age is ending. One age of deterrence, the atomic age, is ending, and a new era of deterrence built on A.I. is set to begin.
13. No other country in the history of the world has advanced progressive values more than this one. The United States is far from perfect. But it is easy to forget how much more opportunity exists in this country for those who are not hereditary elites than in any other nation on the planet.
14. American power has made possible an extraordinarily long peace. Too many have forgotten or perhaps take for granted that nearly a century of some version of peace has prevailed in the world without a great power military conflict. At least three generations — billions of people and their children and now grandchildren — have never known a world war.
15. The postwar neutering of Germany and Japan must be undone. The defanging of Germany was an overcorrection for which Europe is now paying a heavy price. A similar and highly theatrical commitment to Japanese pacifism will, if maintained, also threaten to shift the balance of power in Asia.
16. We should applaud those who attempt to build where the market has failed to act. The culture almost snickers at Musk’s interest in grand narrative, as if billionaires ought to simply stay in their lane of enriching themselves . . . . Any curiosity or genuine interest in the value of what he has created is essentially dismissed, or perhaps lurks from beneath a thinly veiled scorn.
17. Silicon Valley must play a role in addressing violent crime. Many politicians across the United States have essentially shrugged when it comes to violent crime, abandoning any serious efforts to address the problem or take on any risk with their constituencies or donors in coming up with solutions and experiments in what should be a desperate bid to save lives.
18. The ruthless exposure of the private lives of public figures drives far too much talent away from government service. The public arena—and the shallow and petty assaults against those who dare to do something other than enrich themselves—has become so unforgiving that the republic is left with a significant roster of ineffectual, empty vessels whose ambition one would forgive if there were any genuine belief structure lurking within.
19. The caution in public life that we unwittingly encourage is corrosive. Those who say nothing wrong often say nothing much at all.
20. The pervasive intolerance of religious belief in certain circles must be resisted. The elite’s intolerance of religious belief is perhaps one of the most telling signs that its political project constitutes a less open intellectual movement than many within it would claim.
21. Some cultures have produced vital advances; others remain dysfunctional and regressive. All cultures are now equal. Criticism and value judgments are forbidden. Yet this new dogma glosses over the fact that certain cultures and indeed subcultures . . . have produced wonders. Others have proven middling, and worse, regressive and harmful.
22. We must resist the shallow temptation of a vacant and hollow pluralism. We, in America and more broadly the West, have for the past half century resisted defining national cultures in the name of inclusivity. But inclusion into what?
Excerpts from the #1 New York Times Bestseller The Technological Republic: Hard Power, Soft Belief, and the Future of the West, by Alexander C. Karp & Nicholas W. Zamiska
https://t.co/8igjazz1On
your customer isn't human anymore
Insight Partners (backed OpenAI, Anthropic): there’s a new player in the B2B funnel, AI agents. the agent reads your docs, compares you to competitors, picks a vendor. no human in the loop
sales don't create demand. they confirm the agent's pick
https://t.co/5kXcrZCkmn
how i found it:
backend was choking on long-running tasks. AI pipelines and queues ran in prod and kept crashing the system
tried BullMQ, Temporal, custom Redis worker - all needed babysitting
moved all async logic to Trigger. prod got thin, retries just work
just try it
The most underrated tool in my stack rn: https://t.co/CG4lq9X7XH
• durable background tasks in TypeScript
• retries, queues, observability out of the box
• open source, self-host optional
• zero timeouts
• insane DX
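for anyone wondering what "retries just work" saves you from writing by hand, here is a generic retry-with-backoff helper. this is NOT Trigger’s API, and it’s in Python rather than TypeScript: `run_with_retries` and its parameters are made up for illustration. it also isn’t durable, since nothing here survives a process restart, which is the part a task platform adds on top.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("loop always returns or raises")
```

the `sleep` parameter is injectable so tests can record the delays instead of actually waiting.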