Andrej Karpathy just explained the future of software engineering without directly saying it.
The best AI engineers are no longer “prompting.”
They’re building systems around the agents.
Karpathy’s biggest insight wasn’t:
“Claude can code.”
It was:
LLMs become dramatically better when you force them into disciplined workflows.
That’s why "CLAUDE.md" files are suddenly everywhere.
Not because they’re prompts.
Because they behave like an operating system for the agent.
Karpathy called out the exact problems with AI coding:
- models assume instead of asking
- they overengineer simple tasks
- they hide confusion
- they rewrite unrelated code
- they optimize for completion, not correctness
So developers started encoding rules directly into the workflow:
→ Think before coding
→ Simplicity first
→ Surgical edits only
→ Goal-driven execution
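A rough sketch of what encoding those rules can look like: the snippet below renders the four rules into the text of a CLAUDE.md file. The rule descriptions and the `render_claude_md` helper are illustrative assumptions, not Karpathy's wording or an official template.

```python
# Hypothetical rule descriptions: paraphrases for illustration only.
RULES = {
    "Think before coding": "State assumptions and ask when requirements are unclear.",
    "Simplicity first": "Prefer the smallest change that solves the task.",
    "Surgical edits only": "Do not touch code unrelated to the request.",
    "Goal-driven execution": "Work toward stated success criteria and verify them.",
}


def render_claude_md(rules: dict[str, str]) -> str:
    """Render the rules as markdown text that could be saved as CLAUDE.md."""
    lines = ["# Project rules", ""]
    for title, detail in rules.items():
        lines.append(f"- **{title}**: {detail}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_claude_md(RULES))
```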
And the results are wild.
People are now running multiple Claude Code agents in parallel like engineering teams:
• one agent researching
• one debugging
• one writing tests
• one optimizing code
• one validating outputs
Not “AI assistance.”
Actual orchestration.
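The role split above can be sketched as a small fan-out. This is a minimal illustration, assuming a hypothetical `run_agent` function that stands in for a real agent call; here it only returns a placeholder string.

```python
from concurrent.futures import ThreadPoolExecutor

# Roles taken from the list above.
ROLES = ["research", "debug", "write tests", "optimize", "validate"]


def run_agent(role: str, goal: str) -> str:
    """Placeholder for a real agent invocation (hypothetical)."""
    return f"[{role}] report for: {goal}"


def orchestrate(goal: str, roles: list[str]) -> dict[str, str]:
    """Run one agent per role concurrently and collect reports keyed by role."""
    with ThreadPoolExecutor(max_workers=len(roles)) as pool:
        # pool.map preserves input order, so zip pairs each role with its report.
        reports = pool.map(lambda role: run_agent(role, goal), roles)
        return dict(zip(roles, reports))


if __name__ == "__main__":
    for role, report in orchestrate("fix login bug", ROLES).items():
        print(role, "->", report)
```

Threads are enough here because a real agent call would mostly wait on I/O; a process pool or task queue might fit better for heavier local work.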
And this part from Karpathy changes everything:
“Don’t tell the model what to do. Give it success criteria and let it loop.”
That is the shift.
From:
“write this function”
To:
“here’s the goal, constraints, tests, and verification system — now iterate until correct.”
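One way to picture that loop, as a minimal sketch: a `propose` callable stands in for the model, a `verify` callable stands in for the tests, and failures are fed back until the check passes. Both toy functions and the `iterate_until_correct` name are assumptions made for illustration.

```python
from typing import Callable, Optional


def iterate_until_correct(
    propose: Callable[[Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Request a candidate, verify it, and feed failures back until it passes.

    `verify` returns None on success or an error message on failure.
    """
    feedback = None
    for round_number in range(1, max_rounds + 1):
        candidate = propose(feedback)
        feedback = verify(candidate)
        if feedback is None:
            return candidate, round_number
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds: {feedback}")


# Toy stand-in for a model: it corrects its answer once it sees feedback.
def toy_propose(feedback: Optional[str]) -> str:
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


# Toy stand-in for a test suite: the success criterion is add(2, 3) == 5.
def toy_verify(source: str) -> Optional[str]:
    namespace: dict = {}
    exec(source, namespace)  # acceptable here only because the input is our own toy string
    return None if namespace["add"](2, 3) == 5 else "add(2, 3) should equal 5"


if __name__ == "__main__":
    code, rounds = iterate_until_correct(toy_propose, toy_verify)
    print(f"passed after {rounds} round(s): {code}")
```

The `max_rounds` cap matters: without it, a criterion the model cannot satisfy would loop forever.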
The craziest part?
This already feels like a phase shift in engineering.
A lot of developers quietly went from:
80% manual coding → 80% agent-driven coding in just months.
Not because AI became perfect.
Because the leverage became impossible to ignore.
We’re entering an era where the highest-leverage engineers won’t necessarily be the best coders.
They’ll be the people who build the best systems around AI agents.
Andrej Karpathy just revealed why AI coding feels 10x better than it did a year ago.
The real unlock in AI coding isn't a smarter model.
It's the process wrapped around it.
The best engineers aren't spending their time crafting better prompts.
They're creating systems with rules, checkpoints, and validation loops that push AI to reason, test, and improve its own work.
We're moving from "AI-assisted coding" to managing fleets of AI agents working toward a goal.
The next generation of great software engineers won't just write code.
They'll know how to direct, coordinate, and scale AI like an engineering organization.
Google just dropped a 12B model that has no business being this fast.
I ran Gemma 4 12B locally on an RTX 4060 and got 21 tok/s.
No API. No cloud. No subscription.
Just 6.6GB, 256K context, and benchmarks that look straight-up unfair.
→ 77.5% AIME
→ 78.8% GPQA Diamond
→ 72% LiveCodeBench
→ 1659 Codeforces Elo
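A quick sanity check on the size and speed figures: if the 6.6GB download holds roughly 12B parameters, the weights average about 4.4 bits each, which would be consistent with a roughly 4-bit quantized build. That is an inference from the post's numbers, not something it states, and it assumes 1 GB means 10^9 bytes and that the file is almost entirely weights.

```python
def bits_per_weight(file_size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, treating 1 GB as 10**9 bytes."""
    return file_size_gb * 1e9 * 8 / (params_billion * 1e9)


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to emit `tokens` at a steady decode rate."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    print(f"{bits_per_weight(6.6, 12):.1f} bits per weight")
    print(f"{seconds_to_generate(1000, 21):.0f} s for a 1000-token reply")
```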
The craziest part?
Most multimodal models bolt together separate vision/audio encoders with an LLM.
Gemma 4 doesn't.
Images and audio are projected directly into the same decoder-only transformer.
No encoder tax. No extra latency. No memory bloat.
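To make the idea concrete, here is a toy sketch of encoder-free input: raw image patches go through a single linear projection into the model width and are placed in the same sequence as the text embeddings. It illustrates the general technique only. The sizes, the random weights, and the function names are assumptions, and nothing here is taken from Gemma's actual implementation.

```python
import random


def linear_project(features, weights):
    """Map each feature vector (length d_in) to model width via a d_in x d_model matrix."""
    return [
        [sum(x * w for x, w in zip(vec, column)) for column in zip(*weights)]
        for vec in features
    ]


def build_decoder_input(text_embeddings, image_patches, weights):
    """Put projected image patches and text embeddings into one token sequence."""
    return linear_project(image_patches, weights) + text_embeddings


random.seed(0)
D_PATCH, D_MODEL = 6, 4  # toy sizes chosen only for illustration
weights = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(D_PATCH)]
image_patches = [[random.random() for _ in range(D_PATCH)] for _ in range(3)]
text_embeddings = [[random.random() for _ in range(D_MODEL)] for _ in range(5)]

# One sequence of 3 image tokens followed by 5 text tokens, all of width D_MODEL.
sequence = build_decoder_input(text_embeddings, image_patches, weights)

if __name__ == "__main__":
    print(len(sequence), "tokens of width", len(sequence[0]))
```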
Google basically turned a multimodal model into something that runs like a normal local LLM.
Open source AI is moving way faster than most people realize.
Noah got drunk.
Jonah ran away.
Moses stuttered.
Abraham was old.
Lazarus was dead.
Paul was a murderer.
Sarah was impatient.
Elijah was depressed.
Thomas was a doubter.
Yet, they were chosen by God.
Dear son, God doesn’t call the qualified; He qualifies the called.
Dear brother,
At 23 years old, you got so much time
At 28 years old, you got so much time
At 35 years old, you got so much time
At 40 years old, you got so much time
At 48 years old, you got so much time
At 55 years old, you got so much time
At 63 years old, you got so much time
At 80 years old, you got so much time
NO MATTER YOUR AGE, YOU HAVE TIME.
I challenge you to:
- Delete the porn apps
- No alcohol for 30 days
- Eat real food, start with eggs
- Drink water like it's your job
- Be in bed before the clock hits midnight
- Walk until your thoughts get quiet
- Drop and do 100 push-ups every single day
- Skip breakfast, your body thanks you later
- Write something, anything, 100 words minimum
- Find 3 reasons you have no right to complain
Most men will save this and do nothing.
Prove you are not like most men.
THE ENTIRE OFFSEC CURRICULUM JUST GOT REPACKAGED AS CLAUDE SKILLS.
OSCP costs $1,649. OSEP costs around $2,500. OSED costs another $2,500. SANS courses run $8,000 each. A Burp Suite Pro license is $475 a year. A senior pentester clears $180k.
A guy on the internet named Kai Aizen just put the methodology behind all of it into 58 SKILL.md files and pushed them to GitHub for free.
The pack is called claude-red. It primes Claude with expert-level offensive methodology across 13 categories the certification industry charges five figures to teach:
- Web app exploitation → 16 skills (the whole OWASP Top 10 and then some)
- Active Directory → Kerberoasting, AS-REP Roasting, ADCS ESC1 through ESC15, delegation abuse, NTLM relay, hybrid AAD pivots
- Wireless → WPA2/3 cracking, evil twin RADIUS, Dragonblood, KRACK, BLE, Zigbee, Z-Wave, LoRaWAN
- Cloud → AWS/Azure/GCP privesc, IMDS abuse, cross-account persistence
- Exploit dev → modern kernel mitigations, ROP, CFG/CET/PAC bypass theory
- EDR evasion → unhooking, indirect syscalls, PPID spoofing
- AI security → prompt injection, jailbreaks, RAG poisoning
Here's the part the cert industry doesn't want you to think about:
The actual methodology behind every OSCP-style course is publicly documented in OWASP guides, PortSwigger Academy, HackTricks, BloodHound docs, ADSecurity, the Shellcoder's Handbook, and a hundred Black Hat talks.
claude-red just organizes it into context-aware skills that load on demand inside Claude.
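"Load on demand" can be pictured roughly like this: short descriptions stay visible all the time, and a skill's full body is pulled into context only when the task matches. The keyword matching below is a deliberately naive, hypothetical stand-in, and the skill entries are placeholders; it is not how Claude or claude-red actually select skills.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # short summary that is always visible
    body: str         # full instructions, added to context only when selected


def select_skills(task: str, skills: list[Skill]) -> list[Skill]:
    """Naive keyword overlap between the task and each skill description."""
    task_words = set(task.lower().split())
    return [s for s in skills if task_words & set(s.description.lower().split())]


def build_context(task: str, skills: list[Skill]) -> str:
    """Join the bodies of only the selected skills."""
    return "\n\n".join(s.body for s in select_skills(task, skills))


# Placeholder entries (assumptions), not real claude-red content.
SKILLS = [
    Skill("web", "web application testing checklist", "(full web methodology text)"),
    Skill("cloud", "cloud identity and metadata notes", "(full cloud methodology text)"),
]

if __name__ == "__main__":
    print(build_context("review web application scope", SKILLS))
```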
1,931 stars in three months. 314 forks. MIT license. 22 commits because the author dropped the whole library at once.
One honest note: this is for authorized red team work, bug bounty programs you're scoped for, and CTF prep. Hitting things you don't have permission to hit is a felony in most jurisdictions and no skill file will save you from that.
The five-figure certification industry just got a peer it didn't ask for.