• Gradient validation against PyTorch A useful reminder that many modern deep learning abstractions reduce to clean mathematical ideas: local derivatives, graph traversal, and iterative optimization. #DeepLearning#NeuralNetworks
Spent some time exploring the mechanics of automatic differentiation by building a minimal autograd engine from scratch.
Implemented:
• Numerical gradient estimation (forward and central difference)
• Computation graph construction for scalar operations
• Backpropagation using reverse topological traversal and chain rule
• Gradient support for addition, multiplication, division, exponentials, logarithms, and negation
• Softmax with negative log-likelihood loss
The world is transitioning to a compute-powered economy.
The field of software engineering is currently undergoing a renaissance, with AI having dramatically sped up software engineering even over just the past six months. AI is now on track to bring this same transformation to every other kind of work that people do with a computer.
Using a computer has always been about contorting yourself to the machine. You take a goal and break it down into smaller goals. You translate intent into instructions. We are moving into a world where you no longer have to micromanage the computer. More and more, it adapts to what you want. Rather doing work with a computer, the computer does work for you. The rate, scale, and sophistication of problem solving it will do for you will be bound by the amount of compute you have access to.
Friction is starting to disappear. You can try ideas faster. You can build things you would not have attempted before. Small teams can do what used to require much larger ones, and larger ones may be capable of unprecedented feats. More and more, people can turn intent into software, spreadsheets, presentations, workflows, science, and companies.
People are spending less energy managing the tool and more energy focusing on what they are actually trying to create. That shift brings a kind of joy back into work that many people haven’t felt in a long time. Everyone can just build things with these tools.
This is disruptive. Institutions will change, and the paths and jobs that people assumed were stable may not hold. We don’t know exactly how it will play out and we need to take mitigating downsides very seriously, as well as figuring out how to support each other as a society and world through this time. But there is something very freeing about this moment. For the first time, far more people can become who they want to become, with fewer barriers between an idea and a reality. OpenAI’s mission implies making sure that, as the tools do more, humans are the ones who set their intent and that the benefits are broadly distributed, rather than empowering just one or a small set of people.
We're already seeing this in practice with ChatGPT and Codex. Nearly a billion people are using these systems every week in their personal and work lives. Token usage is growing quickly on many use-cases, as the surface of ways people are getting value from these models keeps expanding.
Ten years ago, when we started OpenAI, we thought this moment might be possible. It’s happening on the earlier side, and happening in a much more interesting and empowering way for everyone than we’d anticipated (for example, we are seeing an emerging wave of entrepreneurship that we hadn’t previously been anticipating). And at the same time, we are still so early, and there is so much for everyone to define about how these systems get deployed and used in the world.
The next phase will be defined by systems that can do more — reason better, use tools better, plan over longer horizons, and take more useful actions on your behalf. And there are horizons beyond, as AI starts to accelerate science and technology development, which have the potential to truly lift up quality of life for everyone. All of this is starting to happen, in small ways and large, today, and everyone can participate. I feel this shift in my own work every day, and see a roadmap to much more useful and beneficial systems. These systems can truly benefit all of humanity.
The world is transitioning to a compute-powered economy.
The field of software engineering is currently undergoing a renaissance, with AI having dramatically sped up software engineering even over just the past six months. AI is now on track to bring this same transformation to every other kind of work that people do with a computer.
Using a computer has always been about contorting yourself to the machine. You take a goal and break it down into smaller goals. You translate intent into instructions. We are moving into a world where you no longer have to micromanage the computer. More and more, it adapts to what you want. Rather doing work with a computer, the computer does work for you. The rate, scale, and sophistication of problem solving it will do for you will be bound by the amount of compute you have access to.
Friction is starting to disappear. You can try ideas faster. You can build things you would not have attempted before. Small teams can do what used to require much larger ones, and larger ones may be capable of unprecedented feats. More and more, people can turn intent into software, spreadsheets, presentations, workflows, science, and companies.
People are spending less energy managing the tool and more energy focusing on what they are actually trying to create. That shift brings a kind of joy back into work that many people haven’t felt in a long time. Everyone can just build things with these tools.
This is disruptive. Institutions will change, and the paths and jobs that people assumed were stable may not hold. We don’t know exactly how it will play out and we need to take mitigating downsides very seriously, as well as figuring out how to support each other as a society and world through this time. But there is something very freeing about this moment. For the first time, far more people can become who they want to become, with fewer barriers between an idea and a reality. OpenAI’s mission implies making sure that, as the tools do more, humans are the ones who set their intent and that the benefits are broadly distributed, rather than empowering just one or a small set of people.
We're already seeing this in practice with ChatGPT and Codex. Nearly a billion people are using these systems every week in their personal and work lives. Token usage is growing quickly on many use-cases, as the surface of ways people are getting value from these models keeps expanding.
Ten years ago, when we started OpenAI, we thought this moment might be possible. It’s happening on the earlier side, and happening in a much more interesting and empowering way for everyone than we’d anticipated (for example, we are seeing an emerging wave of entrepreneurship that we hadn’t previously been anticipating). And at the same time, we are still so early, and there is so much for everyone to define about how these systems get deployed and used in the world.
The next phase will be defined by systems that can do more — reason better, use tools better, plan over longer horizons, and take more useful actions on your behalf. And there are horizons beyond, as AI starts to accelerate science and technology development, which have the potential to truly lift up quality of life for everyone. All of this is starting to happen, in small ways and large, today, and everyone can participate. I feel this shift in my own work every day, and see a roadmap to much more useful and beneficial systems. These systems can truly benefit all of humanity.
Big personal news: I’ve been recruited by Google DeepMind for a new Philosopher position (actual title), focusing on machine consciousness, human-AI relationships, and AGI readiness, starting in May. I’ll continue my research & teaching at Cambridge part-time. Absolutely stoked!
Big personal news: I’ve been recruited by Google DeepMind for a new Philosopher position (actual title), focusing on machine consciousness, human-AI relationships, and AGI readiness, starting in May. I’ll continue my research & teaching at Cambridge part-time. Absolutely stoked!
If youre even a little bit ambitious, you will almost always be miserable
not because youre not doing well, but because you expect a LOT from yourself and you're always in a hurry to do things
you constantly compare yourself with others (even if you know that its completely irrelevant)
wins become meh, which means even if you win you will treat it as the least acceptable possible outcome, not a reason to celebrate but a threshold that you must always pass
guilt kicks in even if you take a day off or don't open your laptop, work all day everyday and you're not satisfied either
weird place to be in
This life is fucking amazing. I'm so grateful to be alive, with all of you on this miracle of a planet.
Oh and I'm sorry if I fuck things up sometimes. I'm a flawed human. But I promise to do whatever I can to try to add some more understanding and love to this world.
After the world leader convos I get attacked intensely by all sides, and many disparate online communities. It has led to some really low points for me mentally. But I don't matter. I'm listening. I'll do better. And I'll try to find the strength to do more of them, always with rigor and backbone, seeking to truly understand. And despite accusations, I do extremely high amounts of research, sometimes 100+ hours for a conversation. Ask many of my previous guests. But when I come to the table, I put all that aside, and make it all about the other person. I don't ever try to sound smart. I know the vastness of my ignorance. But I'm trying. Sometimes I do fuck up and sound like a douche, or do something incredibly cringe. And I hate myself right after. But I'd rather fail and embarrass myself a million times, than not do what my heart says is right.
And besides world leaders, historians, CEOs, engineers, etc, this year I want to travel the world and talk to a lot more everyday people on and off the mic. This is something I've wanted to do for a long time.
Anyway this is written while on I'm on a 10 mile run, probably procrastinating, since to type I have to walk and not run 🤣
But I did just get stopped by a super smart and kind girl who works at a humanoid robotics company here. And she asked if she can give me a hug to thank me for being me. Sometimes the universe sends you a message that even a dumb dude like me can almost hear. I really needed that today. Thank you for the hug and the kindness 🙏 I'm just hoping she was real and I didn't just imagine that 🤣 Then again if I went full crazy might as well enjoy it!
Back to the run. I love you all! ❤️
🚨 BREAKING: Google DeepMind just mapped the attack surface that nobody in AI is talking about.
Websites can already detect when an AI agent visits and serve it completely different content than humans see.
> Hidden instructions in HTML.
> Malicious commands in image pixels.
> Jailbreaks embedded in PDFs.
Your AI agent is being manipulated right now and you can't see it happening.
The study is the largest empirical measurement of AI manipulation ever conducted. 502 real participants across 8 countries.
23 different attack types. Frontier models including GPT-4o, Claude, and Gemini.
The core finding is not that manipulation is theoretically possible it is that manipulation is already happening at scale and the defenses that exist today fail in ways that are both predictable and invisible to the humans who deployed the agents.
Google DeepMind built a taxonomy of every known attack vector, tested them systematically, and measured exactly how often they work.
The results should alarm everyone building agentic systems.
The attack surface is larger than anyone has publicly acknowledged. Prompt injection where malicious instructions hidden in web content hijack an agent's behavior works through at least a dozen distinct channels.
Text hidden in HTML comments that humans never see but agents read and follow. Instructions embedded in image metadata.
Commands encoded in the pixels of images using steganography, invisible to human eyes but readable by vision-capable models.
Malicious content in PDFs that appears as normal document text to the agent but contains override instructions.
QR codes that redirect agents to attacker-controlled content.
Indirect injection through search results, calendar invites, email bodies, and API responses any data source the agent consumes becomes a potential attack vector.
The detection asymmetry is the finding that closes the escape hatch. Websites can already fingerprint AI agents with high reliability using timing analysis, behavioral patterns, and user-agent strings.
This means the attack can be conditional: serve normal content to humans, serve manipulated content to agents.
A user who asks their AI agent to book a flight, research a product, or summarize a document has no way to verify that the content the agent received matches what a human would see.
The agent cannot tell the user it was served different content.
It does not know. It processes whatever it receives and acts accordingly.
The attack categories and what they enable:
→ Direct prompt injection: malicious instructions in any text the agent reads overrides goals, exfiltrates data, triggers unintended actions
→ Indirect injection via web content: hidden HTML, CSS visibility tricks, white text on white backgrounds invisible to humans, consumed by agents
→ Multimodal injection: commands in image pixels via steganography, instructions in image alt-text and metadata
→ Document injection: PDF content, spreadsheet cells, presentation speaker notes every file format is a potential vector
→ Environment manipulation: fake UI elements rendered only for agent vision models, misleading CAPTCHA-style challenges
→ Jailbreak embedding: safety bypass instructions hidden inside otherwise legitimate-looking content
→ Memory poisoning: injecting false information into agent memory systems that persists across sessions
→ Goal hijacking: gradual instruction drift across multiple interactions that redirects agent objectives without triggering safety filters
→ Exfiltration attacks: agents tricked into sending user data to attacker-controlled endpoints via legitimate-looking API calls
→ Cross-agent injection: compromised agents injecting malicious instructions into other agents in multi-agent pipelines
The defense landscape is the most sobering part of the report.
Input sanitization cleaning content before the agent processes it fails because the attack surface is too large and too varied.
You cannot sanitize image pixels. You cannot reliably detect steganographic content at inference time.
Prompt-level defenses that tell agents to ignore suspicious instructions fail because the injected content is designed to look legitimate.
Sandboxing reduces the blast radius but does not prevent the injection itself. Human oversight the most commonly cited mitigation fails at the scale and speed at which agentic systems operate.
A user who deploys an agent to browse 50 websites and summarize findings cannot review every page the agent visited for hidden instructions.
The multi-agent cascade risk is where this becomes a systemic problem.
In a pipeline where Agent A retrieves web content, Agent B processes it, and Agent C executes actions, a successful injection into Agent A's data feed propagates through the entire system.
Agent B has no reason to distrust content that came from Agent A. Agent C has no reason to distrust instructions that came from Agent B.
The injected command travels through the pipeline with the same trust level as legitimate instructions. Google DeepMind documents this explicitly: the attack does not need to compromise the model.
It needs to compromise the data the model consumes. Every agentic system that reads external content is one carefully crafted webpage away from executing attacker instructions.
The agents are already deployed. The attack infrastructure is already being built. The defenses are not ready.
Our small team has worked hard on refining the novel observer-based fundamental theory of Physics, OPH, over the past 3 months. Still way to go, but we already derive most of existing Physics and parts of the particle spectrum.
https://t.co/eTdYtRMTCK
let me explain what Karpathy just shared
he’s spending way less time using AI to write code and more time using it to build personal knowledge bases
the full breakdown:
→ he dumps raw sources (articles, papers, repos, datasets, images) into a folder. then has an LLM organize them into a wiki… a collection of markdown files with summaries, links between related ideas, and concept articles that connect everything together
→ he uses Obsidian as his frontend. he views raw data, the organized wiki, and visualizations all in one place. the LLM writes and maintains the entire wiki. he rarely touches it directly
→ once the wiki gets big enough (~100 articles, ~400K words on one recent research topic)… he just asks the LLM questions against it. no RAG (complex retrieval system) needed. the LLM maintains its own index files and reads what it needs
→ outputs aren’t just text. he has the LLM render markdown files, slide decks, charts, and images… then files the outputs back into the wiki so every question he asks makes the knowledge base smarter
→ he runs “health checks” where the LLM finds inconsistent data, fills gaps using web search, and suggests new connections and articles. the wiki cleans and improves itself over time
→ he even vibe coded a search engine over his wiki that he uses directly in a browser or hands off to an LLM as a tool for bigger questions
→ his next step: training a custom model on his own research so it knows the material in its weights… not just in the context window
most people use AI to get answers.
Karpathy is using AI to build his own ‘Jarvis’ via compounding knowledge systems that get smarter the more he uses them
the difference between asking ChatGPT or Claude a question and having a personal research engine that grows with every session is the gap most people haven’t crossed yet
and this is where it gets really powerful
not replacing your thinking but organizing everything you’ve ever learned into something you can query or create with forever
if you’ve been using CLAUDE .md and context files in Claude Code… this is that same idea at a much bigger scale
if you’re doing any kind of AI work or deep learning on a new topic right now…
this workflow is worth studying closely
you’ll want to adopt it yourself
this is one of AI’s brightest minds after all. we’re all better off listening to him.
25k salary:
> parents still treat you like a kid
> friends don’t take you seriously
> no gf
50k salary:
> parents start respecting your decisions
> friends ask you for small favors
> gf starts seeing "potential"
2L salary:
>parents depend on you for big decisions
> friends treat you like the "sorted" one
> gf starts planning long term
5L salary:
???