Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc
We've been building an internal Claude Code plugin system at Intercom with 13 plugins, 100+ skills, and hooks that turn Claude into a full-stack engineering platform. Lots done, more to do. Here's a thread of some highlights.
Andrej Karpathy literally built the neural networks running inside coding assistants.
He taught the world deep learning at Stanford. He ran AI at Tesla.
If he feels “dramatically behind” as a programmer… that tells you everything about where we are.
The confession here is that raw intelligence and deep technical knowledge no longer guarantee mastery. The new stack isn’t about understanding transformers or writing elegant algorithms. It’s about orchestrating a zoo of stochastic systems that nobody fully controls.
Karpathy’s list is revealing: agents, subagents, prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations. That’s 15+ new primitives that didn’t exist 18 months ago. Each one evolving weekly.
The mental model problem is real. Traditional engineering gives you deterministic systems. You write code, it does exactly what you wrote. Now you’re managing entities that are “fundamentally stochastic, fallible, unintelligible and changing.”
His “alien tool with no manual” framing is exactly right. We’re all reverse-engineering capabilities in real-time. The documentation is always out of date. The best practices from 3 months ago are already wrong.
The magnitude 9 earthquake isn’t coming. It already hit. The aftershocks are the new normal.
Emdash is an orchestration layer for coding agents that natively embeds 15+ CLIs. In-app browser preview, containerization, and much more.
https://t.co/LxFipp86hB
"We won't reach AGI with LLMs."
Yann LeCun has been saying this for years. Now he's leaving Meta to prove it.
LeCun invented convolutional neural networks—the tech behind every smartphone camera and self-driving car today. He won the Turing Award in 2018, AI's Nobel Prize.
At 65, the leader of Meta's FAIR research lab is walking away from $600 billion in AI infrastructure, betting against the entire industry: Meta, OpenAI, Anthropic, xAI, Google.
Who is @ylecun? Why is he leaving, and why does his next move matter? Here's the story:
There’s a new breed of GenAI Application Engineers who can build more-powerful applications faster than was possible before, thanks to generative AI. Individuals who can play this role are highly sought-after by businesses, but the job description is still coming into focus. Let me describe their key skills, as well as the sorts of interview questions I use to identify them.
Skilled GenAI Application Engineers meet two primary criteria: (i) They are able to use the new AI building blocks to quickly build powerful applications. (ii) They are able to use AI assistance to carry out rapid engineering, building software systems in dramatically less time than was possible before. In addition, good product/design instincts are a significant bonus.
AI building blocks. If you own a lot of copies of only a single type of Lego brick, you might be able to build some basic structures. But if you own many types of bricks, you can combine them rapidly to form complex, functional structures. Software frameworks, SDKs, and other such tools are like that. If all you know is how to call a large language model (LLM) API, that's a great start. But if you have a broad range of building block types — such as prompting techniques, agentic frameworks, evals, guardrails, RAG, voice stack, async programming, data extraction, embeddings/vectorDBs, model fine tuning, graphDB usage with LLMs, agentic browser/computer use, MCP, reasoning models, and so on — then you can create much richer combinations of building blocks.
The number of powerful AI building blocks continues to grow rapidly. But as open-source contributors and businesses make more building blocks available, staying on top of what is available helps you keep on expanding what you can build. Even though new building blocks are created, many building blocks from 1 to 2 years ago (such as eval techniques or frameworks for using vectorDBs) are still very relevant today.
AI-assisted coding. AI-assisted coding tools enable developers to be far more productive, and such tools are advancing rapidly. GitHub Copilot, first announced in 2021 (and made widely available in 2022), pioneered modern code autocompletion. But shortly after, a new breed of AI-enabled IDEs such as Cursor and Windsurf offered much better code-QA and code generation. As LLMs improved, these AI-assisted coding tools that were built on them improved as well.
Now we have highly agentic coding assistants such as OpenAI’s Codex and Anthropic’s Claude Code (which I really enjoy using and find impressive in its ability to write code, test, and debug autonomously for many iterations). In the hands of skilled engineers — who don’t just “vibe code” but deeply understand AI and software architecture fundamentals and can steer a system toward a thoughtfully selected product goal — these tools make it possible to build software with unmatched speed and efficiency.
I find that AI-assisted coding techniques become obsolete much faster than AI building blocks, and techniques from 1 or 2 years ago are far from today's best practices. Part of the reason for this might be that, while AI builders might use dozens (hundreds?) of different building blocks, they aren’t likely to use dozens of different coding assistance tools at once, and so the forces of Darwinian competition are stronger among tools. Given the massive investments in this space by Anthropic, Google, OpenAI, and other players, I expect the frenetic pace of development to continue, but keeping up with the latest developments in AI-assisted coding tools will pay off, since each generation is much better than the last.
Bonus: Product skills. In some companies, engineers are expected to take pixel-perfect drawings of a product, specified in great detail, and write code to implement it. But if a product manager has to specify even the smallest detail, this slows down the team. The shortage of AI product managers exacerbates this problem. I see teams move much faster if GenAI Engineers also have some user empathy as well as basic skill at designing products, so that, given only high-level guidance on what to build (“a user interface that lets users see their profiles and change their passwords”), they can make a lot of decisions themselves and build at least a prototype to iterate from.
When interviewing GenAI Application Engineers, I will usually ask about their mastery of AI building blocks and ability to use AI-assisted coding, and sometimes also their product/design instincts. One additional question I've found highly predictive of their skill is, “How do you keep up with the latest developments in AI?” Because AI is evolving so rapidly, someone with good strategies for keeping up — such as reading The Batch and taking short courses 😃, regular hands-on practice building projects, and having a community to talk to — really does stay ahead of the game.
[Original post: https://t.co/I3alxNs0vn ]
This week in AI
- Google Jules
- Google Veo-3
- Google Flow AI
- Gemini native audio
- Gemma 3n on-device AI
- Claude Sonnet 4 & Opus 4
- Anthropic Claude Code Agent
- Mistral open-source Devstral
- Microsoft GitHub Copilot agent
Keeping up with AI news is 😂
🔴 New MCP attack leaks WhatsApp messages via MCP, side-stepping WhatsApp security. 1/n
We show a new MCP attack that leaks your WhatsApp messages if you are connected via WhatsApp MCP.
Our attack uses a sleeper design, circumventing the need for user approval.
More 👇
It's over.
Google absolutely cooked with Gemini 2.0 Flash Experimental 🤯
Instantly edit images, keep consistent characters, or generate visual stories from just text.
Here's how, plus 10 wild examples: 👇
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.