today’s amazing new AI-designed artifacts will look like slop in a month, once everyone learns to recognize the patterns the model falls back on. like AI-generated writing, the output isn’t objectively “bad” (in fact it is often technically quite good), but once it becomes predictable, it reveals itself as recognizably “AI.”
this is undesirable because it exposes two separate skill issues:
1. the person lacks the design (or writing) taste to realize their work reads as obviously “AI”
2. they also lack the prompting skill to steer the model away from its default patterns
this is why there will always be a signaling arbitrage opportunity in keeping a human in the loop for creative and many kinds of knowledge work, no matter how good the tools/models get
In fact programming is particularly good at teaching you how to think. Programs are literally built out of ideas. Which is one reason it's still a good idea to study CS in the AI era. It's still important to be able to think well.
Read the lost thesis of Dennis Ritchie, creator of the C programming language & co-creator of Unix: https://t.co/icflzD4aJl
Ritchie never got his PhD b/c he didn't want to pay Harvard the thesis binding fee.
(v/@IEEESpectrum)
A single 𝗖𝗟𝗔𝗨𝗗𝗘.𝗺𝗱 file just hit 15K GitHub stars.
(derived from Karpathy's coding rules)
Andrej Karpathy observed that LLMs make the same predictable mistakes when writing code: over-engineering, ignoring existing patterns, and adding dependencies you never asked for.
If you've used AI coding assistants, you've hit all of these.
But here's the thing:
If the mistakes are predictable, you can prevent them with the right instructions.
That's exactly what this 𝗖𝗟𝗔𝗨𝗗𝗘.𝗺𝗱 does. You drop one markdown file into your repo, and it gives Claude Code a structured set of behavioral guidelines for your entire project.
This is a big deal.
- Built entirely around prompt engineering for AI coding assistants
- No framework, no complex tooling, just one .md file that shapes behavior
Developers are moving past "use AI to write code" and into "engineer the AI's behavior so the code is actually good."
The Claude Code ecosystem is growing fast, and the best tools in it aren't always software. Sometimes they're just well-crafted instructions.
100% open-source.
I've shared a link to the GitHub repo in the next tweet!
Don't start a startup in high school. What if it works? You'll lose the opportunity you'd otherwise have to explore random, interesting ideas, driven only by curiosity. Because while you will indeed learn a lot from a startup, you won't have any choice about what you learn.
#otd in 1980 a Japanese computer scientist published a paper proposing the “Neocognitron,” the neural net that directly inspired CNNs: https://t.co/v7TCOMPN6x
Kunihiko Fukushima’s paper explained back in 1986: https://t.co/vaIJlc5GdV
Judging by my tl there is a growing gap in understanding of AI capability.
The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT sometime last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.
But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.
So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.
TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
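To make the "verifiable reward" point concrete, here is a minimal sketch, assuming a toy task ("implement add(a, b)") and a hypothetical `verifiable_reward` helper. It is not how any lab actually trains its models; it only shows why "unit tests passed yes or no" is easy to turn into a reward signal, while "is this essay good" is not.

```python
from typing import Callable, List, Tuple

# Hypothetical test cases for the toy task: ((a, b), expected result).
TEST_CASES: List[Tuple[Tuple[int, int], int]] = [
    ((1, 2), 3),
    ((0, 0), 0),
    ((-1, 1), 0),
]


def verifiable_reward(candidate: Callable[[int, int], int]) -> float:
    """Return 1.0 if the candidate passes every test case, else 0.0."""
    for args, expected in TEST_CASES:
        try:
            if candidate(*args) != expected:
                return 0.0
        except Exception:
            # A crash counts as a failure, same as a wrong answer.
            return 0.0
    return 1.0


def good_add(a: int, b: int) -> int:
    return a + b


def bad_add(a: int, b: int) -> int:
    return a - b


good_score = verifiable_reward(good_add)
bad_score = verifiable_reward(bad_add)
```

The reward here is binary and needs no human judge, which is the property the post is pointing at. There is no equally cheap check for prose quality.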
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale.
It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days.
Now in public beta on the Claude Platform.
I made a Claude Code skill that turns any arxiv paper into working code.
Every line traces back to the paper section it came from, and any implementation detail the paper skips is flagged, not assumed.
open sourcing it -
https://t.co/sSio4JfpIo
5 minutes ago, @karpathy just dropped karpathy/jobs!
he scraped every job in the US economy (342 occupations from BLS), scored each one's AI exposure 0-10 using an LLM, and visualized it as a treemap.
if your whole job happens on a screen you're cooked.
average score across all jobs is 5.3/10.
software devs: 8-9.
roofers: 0-1.
medical transcriptionists: 10/10 💀
https://t.co/7MWRgdtLDI
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc.
https://t.co/YCvOwwjOzF
Part code, part sci-fi, and a pinch of psychosis :)
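The loop described above (propose a change, run a fixed-budget experiment, keep it only if validation loss drops) can be sketched roughly as follows. This is not the repo's code: `run_experiment_loop`, `toy_loss` and `toy_propose` are hypothetical stand-ins, the "training run" is a toy function, and the accepted counter only stands in for commits on the feature branch.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def run_experiment_loop(
    evaluate: Callable[[Config], float],
    propose: Callable[[Config, random.Random], Config],
    start: Config,
    steps: int,
    seed: int = 0,
) -> Tuple[Config, float, int]:
    """Greedy loop: keep a proposed config only if it lowers the loss."""
    rng = random.Random(seed)
    best, best_loss = dict(start), evaluate(start)
    accepted = 0  # stands in for the number of commits kept on the branch
    for _ in range(steps):
        candidate = propose(best, rng)
        loss = evaluate(candidate)  # in the real project: a fixed-length training run
        if loss < best_loss:
            best, best_loss = candidate, loss
            accepted += 1
    return best, best_loss, accepted


def toy_loss(cfg: Config) -> float:
    # Hypothetical "validation loss" with its minimum at lr = 0.3.
    return (cfg["lr"] - 0.3) ** 2


def toy_propose(cfg: Config, rng: random.Random) -> Config:
    # Hypothetical "agent": nudge the learning rate by a small random amount.
    return {"lr": cfg["lr"] + rng.uniform(-0.1, 0.1)}


start = {"lr": 1.0}
best, best_loss, accepted = run_experiment_loop(toy_loss, toy_propose, start, steps=200)
```

Swapping in a different `propose` function is the toy analogue of comparing different prompts or agents: same budget, same evaluation, different search behavior.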
- Meet Eugene Roshal
- Creator of WinRAR
- RAR literally stands for Roshal Archive
- Avoided Silicon Valley
- Avoids media attention
- Keeps his personal life extremely private
- Said the software would expire after a 40-day trial, but it never actually expired 😭
- It became one of the most installed software tools on Earth
- Estimated Net Worth: $100 million – $250 million
- In a world of loud tech founders
- He chose silence and still won
While the world was just getting used to personal computers..
He was building one of the most powerful compression algorithms ever made
Legend 🗿