Fireside chat at Sequoia Ascent 2026 from about a week ago. Some highlights:
The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:
1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed. The input is an image, the output is an image, and an LLM can natively do the whole thing.
2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM"? The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.
I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3).
The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How can it be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car? I previously wrote about the source of this as having to do with the verifiability of a domain; here I expand on this as also having to do with economics, because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying, or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...
Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.
The entire RAG industry is about to get cooked.
Researchers have built a new RAG approach that:
- does not need a vector DB.
- does not embed data.
- involves no chunking.
- performs no similarity search.
It's called PageIndex. Instead of chunking your docs and stuffing them into Pinecone, it builds a tree index and lets the LLM reason through it like a human reading a book.
It hit 98.7% on FinanceBench, beating every vector RAG on the leaderboard.
No embeddings. No chunking. No vector DB.
100% open source.
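To make the "tree index instead of similarity search" idea concrete, here is a minimal sketch. It is not PageIndex's API or code: the `Node` class, the `retrieve` function, and the toy report are all hypothetical, and `keyword_chooser` is a crude word-overlap stand-in for the LLM call that would normally decide which section to open next.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One entry in a table-of-contents style tree (hypothetical structure)."""
    title: str
    summary: str
    text: str = ""  # in this sketch only leaf nodes carry document text
    children: List["Node"] = field(default_factory=list)


def keyword_chooser(question: str, options: List[Node]) -> int:
    """Stand-in for the LLM: pick the option sharing the most words with the question."""
    q_words = set(question.lower().split())
    scores = [
        len(q_words & set((n.title + " " + n.summary).lower().split()))
        for n in options
    ]
    return scores.index(max(scores))


def retrieve(
    root: Node,
    question: str,
    choose: Callable[[str, List[Node]], int] = keyword_chooser,
) -> Node:
    """Walk from the root to a leaf, asking `choose` which child to open at each level."""
    node = root
    while node.children:
        node = node.children[choose(question, node.children)]
    return node


# Toy document tree; every title, summary, and figure is made up for illustration.
root = Node(
    title="Annual Report",
    summary="full report",
    children=[
        Node("Risk Factors", "market and regulatory risks",
             text="Made-up risk discussion."),
        Node("Financial Statements", "income statement and balance sheet",
             children=[
                 Node("Income Statement", "revenue costs and net income",
                      text="Made-up figure: total revenue was 120."),
                 Node("Balance Sheet", "assets and liabilities",
                      text="Made-up figure: total assets were 300."),
             ]),
    ],
)

if __name__ == "__main__":
    leaf = retrieve(root, "what was total revenue in the income statement")
    print(leaf.title, "->", leaf.text)
```

No embedding, chunking, or vector store appears anywhere in the walk; all the retrieval quality comes from the `choose` step, which is why swapping the keyword stub for an actual model call is the part that matters.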
vibecamp ai horizon #2 · 2026-05-06
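• US CAISI signs pre-release review agreements for frontier models with Google, Microsoft, and xAI: government evaluation before launch
• South Korea's National Growth Fund to invest 5.7 trillion won in AI; Samsung and SK hynix declare 'AI sovereignty' #AIhorizon #vibecamp #AI https://t.co/UbKSUuMacd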
Excellent essay from @danbjork on the implications of AI for low and middle income countries—a critical topic that deserves way more attention.
https://t.co/cyclCQwsAU
April was a pretty strong month for LLM releases:
- Gemma 4
- GLM-5.1
- Qwen3.6
- Kimi K2.6
- DeepSeek V4
All are now added to the LLM Architecture Gallery.
More details once I am fully back in May!
Qwen3.6-27B on RTX 5090, 4 power limits tested:
🔴 400W → 66.58 t/s · baseline (but fluctuates a lot — frequent dips)
🟢 450W → 69.79 t/s · +4.8% speed for +12.5% power (much more stable)
🟡 500W → 71.48 t/s · +2.4% speed for +11.1% power
🟣 575W → 72.64 t/s · +1.6% speed for +15.0% power
➡️ 400W → 575W: +44% power, +9% speed.
Conclusion: 450W is the real sweet spot. 400W looks great on average, but the t/s curve is jittery; 450W trades 50W for consistent throughput. Above that, you're just heating your room.
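The step-by-step percentages above can be reproduced with a few lines of arithmetic. This is a small sketch using only the four measurements from the post; the function names are mine, and the tokens-per-joule figure assumes the card draws exactly its power limit, which the post does not state.

```python
# Measurements copied from the post: (power limit in W, throughput in tokens/s).
runs = [(400, 66.58), (450, 69.79), (500, 71.48), (575, 72.64)]


def pct_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def step_gains(data):
    """For each consecutive pair, return (new power limit, power change %, speed change %)."""
    gains = []
    for (p0, s0), (p1, s1) in zip(data, data[1:]):
        gains.append(
            (p1, round(pct_change(p1, p0), 1), round(pct_change(s1, s0), 1))
        )
    return gains


def tokens_per_joule(data):
    """Tokens per joule, ASSUMING actual draw equals the configured power limit."""
    return {p: s / p for p, s in data}


if __name__ == "__main__":
    for watts, d_power, d_speed in step_gains(runs):
        print(f"{watts}W: +{d_speed}% speed for +{d_power}% power")
    total_power = pct_change(runs[-1][0], runs[0][0])
    total_speed = pct_change(runs[-1][1], runs[0][1])
    print(f"400W -> 575W: +{total_power:.0f}% power, +{total_speed:.0f}% speed")
    for watts, tpj in tokens_per_joule(runs).items():
        print(f"{watts}W: {tpj:.3f} tokens/J")
```

Under that assumption, tokens per joule falls at every step up from 400W, so the case for 450W rests on the stability observation, not on efficiency.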
At this point, this is just irresponsible.
Yes, coding agents are leading to an increase in software production, but we are not seeing a similar push or increase in software quality.
If Anthropic focuses on safety and it believes software engineering is going away, then it needs to be doing much more to improve how we design, build, test, and maintain software (aka software engineering). Increasing the production of unreliable, poorly designed, and unverified software directly undermines safety.
Claude Code is claimed to be "fully written by AI". In the last two months, it took three separate postmortem-worthy failures and user complaints to surface what their own testing missed. Yesterday users were being overbilled by hundreds of dollars. Software engineering isn't ready to go away and there is not enough progress to argue that case.
I am certain Anthropic would argue that AI progress in other domains is strongly dependent on having proper safeguards in place. I can't wrap my head around the cognitive dissonance when it comes to software.
PS: Mythos (may) improve software security, but that is only a subset of safety.