ComicTL has finally reached a stable release. You can now run it fully locally or use Gemini in cloud mode. I also made a website for it; you can read more at https://t.co/CRAXTFijfx.
Introducing DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
https://t.co/c9AvsRKybj
What if we didn’t have to hold an entire neural network in memory to train it?
Standard neural net training optimizes all parameters jointly. As a result, the memory required during training grows linearly with the depth of the network.
In our #ICLR2026 paper, we propose DiffusionBlocks, a principled framework to train networks one block at a time, drastically reducing memory requirements while matching end-to-end performance.
With DiffusionBlocks, we split the network into blocks and train them one at a time, so you only need memory for a single block.
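To put rough numbers on the memory argument, here is a back-of-the-envelope sketch. It counts only the activations stored for the backward pass and ignores parameters, optimizer state, and activation checkpointing. The function names and the per-layer figure are made up for illustration.

```python
def end_to_end_peak(layer_activation_mb):
    """Backprop through the whole network keeps every layer's activations."""
    return sum(layer_activation_mb)


def blockwise_peak(layer_activation_mb, layers_per_block):
    """Training one block at a time keeps only that block's activations."""
    blocks = [
        layer_activation_mb[i:i + layers_per_block]
        for i in range(0, len(layer_activation_mb), layers_per_block)
    ]
    return max(sum(block) for block in blocks)


layers = [100] * 24  # hypothetical: 24 layers, 100 MB of activations each
print(end_to_end_peak(layers))    # 2400
print(blockwise_peak(layers, 6))  # 600
```

Under these assumptions the peak scales with the largest block rather than with the full depth.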
How? We explicitly assign each block a role: to move the representation a little closer to the target than the block before it did. That role turns out to be precisely what a diffusion model does, step by step. Each block only needs to optimize its own objective and can be trained independently.
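A toy sketch of the idea, not the paper's algorithm: each block is fitted alone as a denoiser for its own range of noise levels, and at inference the blocks run in order. The noise schedule `SIGMA_EDGES`, the spread `TAU`, the 1-D regression task, and the helpers `train_block` and `run_blocks` are all assumptions made for illustration. The blocks here are linear and fitted by closed-form least squares instead of gradient descent, so this shows the independence of the per-block objectives, not the memory savings.

```python
import random

# Assumed noise schedule (not from the paper): block b handles noise levels
# sigma in [SIGMA_EDGES[b + 1], SIGMA_EDGES[b]], from noisiest to cleanest.
SIGMA_EDGES = [4.0, 1.0, 0.25, 0.0]
TAU = 0.5  # spread of the toy targets around 2 * x


def train_block(b, n=2000, seed=0):
    """Fit block b on its own: a least-squares denoiser y_hat = a * z + c * x.

    Nothing here reads or updates any other block.
    """
    rng = random.Random(seed + b)  # private RNG, so fitting order is irrelevant
    hi, lo = SIGMA_EDGES[b], SIGMA_EDGES[b + 1]
    szz = szx = sxx = szy = sxy = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + TAU * rng.gauss(0.0, 1.0)  # toy target
        sigma = rng.uniform(lo, hi)
        z = y + sigma * rng.gauss(0.0, 1.0)  # target noised to this block's level
        szz += z * z
        szx += z * x
        sxx += x * x
        szy += z * y
        sxy += x * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = szz * sxx - szx * szx
    a = (szy * sxx - szx * sxy) / det
    c = (szz * sxy - szx * szy) / det
    return a, c


def run_blocks(blocks, x, z):
    """Apply the blocks in order; each one pulls z toward its own prediction."""
    trace = [z]
    for b, (a, c) in enumerate(blocks):
        y_hat = a * z + c * x
        ratio = SIGMA_EDGES[b + 1] / SIGMA_EDGES[b]
        z = y_hat + ratio * (z - y_hat)  # keep only the next level's share of noise
        trace.append(z)
    return trace


blocks = [train_block(b) for b in range(3)]
trace = run_blocks(blocks, x=0.5, z=4.0)  # targets for x = 0.5 centre on 1.0
print([round(a, 3) for a, _ in blocks])
print([round(v, 3) for v in trace])
```

In this toy, the weight `a` on the noisy input should grow from the first block to the last: noisy blocks lean on `x`, cleaner blocks mostly refine `z`.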
We validated this across five different architectures:
• ViT
• DiT
• Masked diffusion
• Autoregressive transformers
• Recurrent-depth transformers
In each case, performance is competitive with end-to-end training while using a fraction of the memory.
This perspective also extends naturally to recurrent-depth (looped) transformers, which apply the same network iteratively and normally require expensive backpropagation through time (BPTT). Viewed through DiffusionBlocks, we can replace those multiple iterations with a single forward pass during training.
Read our paper and code to learn more.
Paper: https://t.co/CRj96VGYQn
GitHub: https://t.co/eNW0K9Xh8E
🐟
Ah, CVE-2026-31431. For the first time, I might be rethinking Linux and AI. Damn, 732 bytes of Python can take you from a regular user to root. Fk insane finding.
Finally, the long-running project that I had almost forgotten about has reached beta release. You can try it by downloading the artifact from GitHub: https://t.co/w9IefZCXYe. Currently, it only supports Gemini translation. The stable release will try to include a local LLM as a translator.
LinkedIn is a positively toxic echo chamber. Everyone licks each other's boots in the comments, saying "nice insight" because they are too afraid to correct wrong math. Politeness is just a mask for incompetence. Stop trusting vibes and start reading the docs. (12/12)
This is why I hate LinkedIn. Too many people with glorified titles confidently post trash because they can’t even read properly. An "AI Engineer" is promoting a custom training loop for class imbalance that is actually a tutorial on how to sabotage your own model. (1/12)
But because Focal Loss doesn't look as "complex" as a 40-line custom filtering pipeline, people think it is less professional. Complexity is not a feature; it is a bug. A senior engineer deletes code; they don't add unreadable dataset mutilation. (11/12)
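For scale, here is roughly how small binary Focal Loss is. This is a minimal sketch of the standard formula, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), with the commonly used defaults gamma = 2 and alpha = 0.25. The function name is made up, and the thread itself does not show any code.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one example; p is the predicted probability of class 1."""
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident and correct: strongly down-weighted
hard = binary_focal_loss(0.1, 1)  # confident and wrong: keeps most of its loss
print(easy, hard)
```

With gamma = 0 this reduces to alpha-weighted cross-entropy, so the dataset stays untouched and only the loss changes.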