LLMs learn by predicting tokens. World models (JEPA, data2vec) learn by predicting their own abstractions. Which needs more data? For data with hidden hierarchy, we prove the gap is exponential. https://t.co/r2uuX0lBCu
Introducing DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
https://t.co/c9AvsRKybj
What if we didn’t have to hold an entire neural network in memory to train it?
Standard neural net training optimizes all parameters jointly. As a result, the memory required during training grows linearly with the depth of the network.
In our #ICLR2026 paper, we propose DiffusionBlocks, a principled framework to train networks one block at a time, drastically reducing memory requirements while matching end-to-end performance.
With DiffusionBlocks, we split the network into blocks and train them one at a time, so you only need memory for a single block.
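A back-of-the-envelope sketch of that memory claim, assuming every layer stores the same amount of activations and that only the block currently being trained keeps activations for backprop. The function name and the numbers (24 layers, 500 MB per layer, 4 blocks) are made up for illustration, and parameter and optimizer memory is ignored.

```python
def training_activation_memory(num_layers, per_layer_activation_mb, num_blocks=1):
    """Rough peak activation memory (MB) when backprop spans one block at a time.

    Assumption: activations cost the same for every layer, and only the
    block being trained has to keep them around for the backward pass.
    """
    layers_per_block = -(-num_layers // num_blocks)  # ceiling division
    return layers_per_block * per_layer_activation_mb


# Hypothetical 24-layer network, 500 MB of activations per layer.
end_to_end_mb = training_activation_memory(24, 500)               # one block = whole network
blockwise_mb = training_activation_memory(24, 500, num_blocks=4)  # 6 layers per block

print(end_to_end_mb, blockwise_mb)
```

Under these assumptions the peak scales with the layers in one block rather than with total depth.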
How? We explicitly assign each block a role: to move the representation a little closer to the target than the block before it did. That role turns out to be precisely what a diffusion model does, step by step. Each block only needs to optimize its own objective and can be trained independently.
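A toy sketch of the idea, not the paper's implementation. It assumes each block owns one range of noise levels and is a scalar linear denoiser `D(z) = w * z`, fit with plain gradient descent on its own mean squared error. The names `train_block` and `noise_ranges`, the ranges themselves, and the ±1 targets are all hypothetical. What it shows is that no gradient ever crosses a block boundary.

```python
import random


def train_block(sigma_lo, sigma_hi, n_samples=2000, steps=200, lr=0.05, seed=0):
    """Fit one block's scalar denoiser D(z) = w * z on its own noise range only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        y = rng.choice([-1.0, 1.0])              # clean target
        sigma = rng.uniform(sigma_lo, sigma_hi)  # noise level owned by this block
        z = y + sigma * rng.gauss(0.0, 1.0)      # noisy input the block sees
        data.append((z, y))

    w = 0.0
    for _ in range(steps):
        # Gradient of the block-local mean squared error; no other block is involved.
        grad = sum(2.0 * (w * z - y) * z for z, y in data) / n_samples
        w -= lr * grad
    return w


# Assumed split: the first block handles the noisiest inputs, the last the cleanest.
noise_ranges = [(2.0, 3.0), (1.0, 2.0), (0.0, 1.0)]

# Each call is independent, so the blocks could be trained in any order,
# or on different machines, with only one block in memory at a time.
weights = [train_block(lo, hi, seed=i) for i, (lo, hi) in enumerate(noise_ranges)]
print(weights)
```

In this toy setup the blocks assigned less noise end up trusting their input more (larger `w`), which loosely mirrors later blocks finishing the job that earlier blocks started.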
We validated this across five different architectures:
• ViT
• DiT
• Masked diffusion
• Autoregressive transformers
• Recurrent-depth transformers
In each case, performance is competitive with end-to-end training while using a fraction of the memory.
This perspective also extends naturally to recurrent-depth (looped) transformers, which apply the same network iteratively and normally require expensive backpropagation through time (BPTT). Viewed through DiffusionBlocks, those multiple iterations can be replaced with a single forward pass during training.
Read our paper and code to learn more.
Paper: https://t.co/CRj96VGYQn
GitHub: https://t.co/eNW0K9Xh8E
🐟
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
there’s a dude on youtube making a video with a burst blood vessel in his eye. the title is “software developer driven to insanity by 2026 job market” and he’s pointing at his eye as evidence of the stress he’s going through.
garrett’s a rather normal dude and there were no obvious tells as to why this guy has not been able to find a job for 9 months since he was laid off in july.
but there was a hint.
at one point he describes how he went really far in an interview process, and he was absolutely sure he was gonna get the job. they hire someone else and the dude is incredulous. he asks them wtf!? they show him the winning candidate’s entry.
garrett used 50% ai and 50% manual coding. he was sitting there being thoughtful and deliberate and properly naming functions and variables and designing modular architectures. meanwhile the winning application was a dude who used 100% ai and his app had a shit ton more features. and he delivered it faster.
so, that’s the kind of dude being hired now. garrett didn’t get the job. but someone else did. so the job is there. the job was open. just not for you.
the thing getting you hired in 2019, like obsessing over code quality and maintainability and best practices, will get you fired in 2026.
the market has spoken. if you want to land a dev job, make more slop.
they want more features done faster at a lower cost.
your job is not to argue. it is to provide the thing they are seeking.
pick 3 companies you really wanna work for, then build their entire app in a weekend as your cover letter. applying to doordash? build their 3-sided marketplace in 2 hours. applying to google? build your own search engine that does a 2-hour web crawl.
reading code is a dead thing. no one reads their own code anymore, let alone yours. they will run your app. and they’ll judge you off that.
build more slop. build a slop empire. imagine more slop. be the slop. slop slop slop.
get the job. make money. give your boss the slop he wants. don’t argue. slop is peace. slop is strength. inhale the slop. exhale it.
oh you wanna be an artist? cute. go code on a city street like a bucket drummer. you code by hand for tips now. otherwise, off to the slop factory.
enjoy it! enjoy the sloppagaden! make a little money!
Researchers asked every major AI model (GPT-5, Claude, Gemini, Grok) for business strategy advice. 30,000 times.
Every model gave the same answer, every time, regardless of context.
Differentiate. Collaborate. Think long-term. Augment.
They changed the prompts and changed the industries. They even tried bribing the models with rewards. The bias barely moved.
They called it "trendslop", AI's tendency to recommend whatever sounds good on LinkedIn instead of what actually works for your situation.
Why? These models are trained on the entire internet. Every Reddit post, every TED Talk, every Medium article from a guy with 11 subscribers. They don't reason. They regurgitate the most popular opinions in the most convincing voice possible.
We thought AI was going to make everyone a genius. Instead it's pulling smart people toward the middle.
https://t.co/FfMRbI5dbN
Hello, Moon. It’s great to be back.
Here’s a taste of what the Artemis II astronauts photographed during their flight around the Moon. Check out more photos from the mission: https://t.co/rzM1P0QbOl
Ok last one: the rarest solar eclipse of all time. Only 4 people have seen this with their naked eyes. The sun is fully behind the moon. The only faint light hitting the near side is reflecting off of earth, 250,000 miles away. And the stars and galaxies in the background, sheesh
Nikon Z9
f/2.0
2 second exposure
ISO 1600
@NASA: https://t.co/twBqbUEDs2
New AI paper from us this week. When my student first showed me his initial findings, I really didn’t know what to make of them. I felt that this was an interesting but curious loophole phenomenon that would shortly be closed. I was very wrong.
https://t.co/H3YIyl01FR
Almost every AI power user I know is MORE stressed and busier after using AI, not less
What people thought AI would do: 10x productivity so that we can finish work earlier & relax more
What it’s actually doing: 10x productivity so that we end up with 20x more things to do cos of the sheer possibilities
This indie dev is making a game where you can literally play as Ancient Egyptian wall art.
- Switch between a 3D archaeologist and living 2D art
- Survive puzzles that fight back
- Progress by mastering both worlds
It’s called Fresco. This mechanic is genius.
Here’s what’s gonna happen:
- you replace your code review with feedback loops (sentry, datadog, support tickets, etc)
- you stop reading the code
- software factory fixes everything
- one day something breaks at 3am, agent can’t fix it
- nobody’s read the code in 3 months
- you have 3 weeks of downtime trying to re-onboard and fix it
- you lose significant % of your contracts and users
- your company is now dead