For intelligence, compression is not the goal. It is a means to an end. The true goal is to gain information that helps reduce uncertainty, which is entirely measurable. For any particular data of interest, to obtain a most informative representation of its distribution (also called memory or knowledge), an intelligent system tries to learn the most effective and efficient compressing operations (say layers of a network). This is what I have been saying: We learn to compress, and we compress to learn! This is precisely the main theme of our new open book.
Topoi are mainly seen in pure mathematics and logic, but there is also an approach to physics that uses them!
So, if you're confident in your understanding of basic category theory, check out this 100-page primer titled 'An Introduction to Topos Physics' by Tsatsos.
This is a very gentle introduction, well written and pedagogically sound.
If you need more material on topoi alongside this primer, check out Goldblatt's 'Topoi: The Categorial Analysis of Logic' and 'Sheaves in Geometry and Logic' by Mac Lane and Moerdijk.
🔗👇
For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall.
We found a new way to break the network into blocks and train them independently. The trick? Treating the network’s forward pass like a diffusion model denoising a signal.
This reinterpretation slashes the memory needed to train deep models. In our #ICLR2026 paper (https://t.co/PK5h0mqQSo), we matched end-to-end performance across ViTs, DiTs, and LLMs. We did this while training just one isolated block at a time.
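A toy sketch of the idea of training blocks independently on a denoising objective. This is not the paper's algorithm: the one-parameter "blocks", the Gaussian toy data, the noise levels, and every function name here are assumptions made purely for illustration. Each block is fitted only on (noisy input, clean target) pairs at its own noise level, so no other block needs to be in memory while it trains.

```python
import random


def make_pairs(sigma, n, rng):
    """Sample (noisy input, clean target) pairs at one noise level."""
    pairs = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)              # clean target (toy data, unit variance)
        z = y + sigma * rng.gauss(0.0, 1.0)  # corrupted input this block sees
        pairs.append((z, y))
    return pairs


def fit_block(pairs):
    """Least-squares fit of a one-parameter block w minimising sum((w*z - y)^2)."""
    num = sum(z * y for z, y in pairs)
    den = sum(z * z for z, _ in pairs)
    return num / den


def train_blocks_independently(sigmas, n=20000, seed=0):
    """Fit one block per noise level; no block depends on any other block."""
    rng = random.Random(seed)
    weights = []
    for sigma in sigmas:
        pairs = make_pairs(sigma, n, rng)  # data for this block only
        weights.append(fit_block(pairs))   # local objective, no end-to-end backprop
    return weights


# Hypothetical schedule: early blocks handle heavy noise, later blocks light noise.
sigmas = [2.0, 1.0, 0.5]
weights = train_blocks_independently(sigmas)
```

For unit-variance Gaussian targets the ideal linear denoiser at noise level sigma is 1 / (1 + sigma^2), so the fitted weights should land near that value for each block. A real network would replace each scalar with a stack of layers, but the memory argument is the same: only one block is trained at a time.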
I would recommend anything written by the great Yuri Manin, so today I suggest you read his fantastic 29-page paper titled 'The Notion of Dimension in Geometry and Algebra'.
In this one, Manin marries mathematics, physics, philosophy, history and much more. It is a refreshing read for those used to dry formalism.
Yuri's view of mathematics deeply influenced mine, so give it a go. I bet you'll like it too.
In our new paper, we reinterpret tokenisation as a problem in high-dimensional geometry (100M dims to be precise!), which we can solve efficiently to get a globally near-optimal tokeniser! Our method consistently improves language models over BPE. See 🧵for details.
This is genuinely hilarious.
Some anonymous person on 4chan, responding to an anime watch order question, posted a proof that later turned out to be mathematically correct and significant.
- It was posted less than an hour after the question.
- The poster basically said, “please check for loopholes.”
- It sat mostly unnoticed for seven years.
- Later, actual mathematicians checked it and were like: yeah, this is legit.
- The formal paper literally lists the author as “Anonymous 4chan Poster.”
🌍Today we release Mosaic, a probabilistic weather model that shifts the Pareto frontier of ML weather forecasting.
It matches the skill of state-of-the-art models while generating a 24-member, 10-day global forecast in under 12 s on a single H100.
Thread!
it’s in gemini, just create it in ai studio. oh, that’s for your personal google one account. for workspace you need gemini business. no, not gemini advanced, that’s ai pro now. unless you need ai ultra. oh agents? you do that in spark actually. no, not gemini api managed agents, that’s different. for coding use jules. unless you mean the agentic ide, that’s antigravity. no, that’s the old antigravity, download the new one. actually gemini cli is being deprecated, use antigravity cli. no the flash model is smarter than the pro model. unless you need pro. if it’s video, use flow. no, flow uses veo. no, nano banana is images. actually that’s in gemini now. unless you’re in search, then it’s ai mode. no, research is notebooklm. anyway it’s all very simple.
AI has now solved a major open problem: the unit distance problem, one of the best-known Erdős problems, one of Erdős's favourite questions and one that many mathematicians had tried.
https://t.co/SD1vVPkrHR
For the last few months I've been working on a from-scratch implementation of AlphaGo, a 2016 AI breakthrough that inspired me to get into deep learning. My casual understanding of AlphaGo was "search-augmented deep neural networks trained with self-play", but I wanted to go deeper and understand it by creating it.
Frontier deep learning research has always been expensive, but any given capability gets cheaper very quickly. In 2026, you no longer need DeepMind's resources to train a strong Go AI - you can vibe code all of it yourself for just a few thousand dollars of rented compute.
It was a huge honor to be invited to teach this with @dwarkesh_sp on @dwarkeshpodcast
I am an AlphaGo & Go apprentice, not a master, so all factual errors in the podcast are mine.
Web version of tutorial: https://t.co/Xkf9VsgtuT
Code: https://t.co/rWKOwclPDg
Play the go bot here: https://t.co/aVglJXldVX
Want to read a masterpiece?
Try 'Partial Differential Equations of Physics' by Robert Geroch, a wonderful 52-page paper which is of course publicly available on arXiv.
🔗👇
The famously intimidating field of metamathematics analyzes math proofs. For example: Why are some problems hard to solve, while others are straightforward? A recent proof shows that three distinct theorems are logically equivalent. https://t.co/dxgKeNNJPq
If you've been struggling to learn category theory, you might want to check out Paolo Perrone's 'Notes on Category Theory with examples from basic mathematics', publicly available on arXiv.
These notes were produced during a class given to a diverse set of scientists (including chemists and physicists), with linear algebra the only assumed background!
🔗👇
This works really well btw, at the end of your query ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.
More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. About a third of our brains is a massively parallel processor dedicated to vision; it is the 10-lane superhighway of information into the brain. As AI improves, I think we'll see a progression that takes advantage:
1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default
...4,5,6,...
n) interactive neural videos/simulations
Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. Many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral https://t.co/z21CP5iQfu
Improvements are also necessary and pending on the input side. Neither audio, text, nor video alone is enough, e.g. I feel a need to point/gesture to things on the screen, much as you would with a person physically next to you at your computer screen.
TLDR: the input/output mind meld between humans and AIs is ongoing, and there is a lot of work to do and significant progress to be made well before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip: try asking for HTML.
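The HTML tip above can be sketched in a few lines of Python. This is a minimal sketch, not a recommended tool: the helper names are made up for illustration, and the call to the model is left as a hypothetical `call_llm` because the post does not name any particular API. It appends the request to the query, strips a Markdown fence if the model wrapped its answer in one, saves the result, and opens it in the browser.

```python
import tempfile
import webbrowser
from pathlib import Path

HTML_REQUEST = "Structure your response as HTML."
FENCE = "`" * 3  # a Markdown code fence, built this way to avoid a literal one here


def with_html_request(query: str) -> str:
    """Append the HTML request to the end of the query, as the tip suggests."""
    return f"{query.rstrip()}\n\n{HTML_REQUEST}"


def extract_html(response: str) -> str:
    """Return the HTML, dropping a surrounding Markdown fence if one is present."""
    lines = response.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
    return "\n".join(lines)


def save_html(response: str, directory: Path) -> Path:
    """Write the model's HTML to response.html inside the given directory."""
    path = directory / "response.html"
    path.write_text(extract_html(response), encoding="utf-8")
    return path


def view_in_browser(path: Path) -> None:
    """Open the saved file in the default browser."""
    webbrowser.open(path.resolve().as_uri())


# Usage (call_llm is hypothetical; substitute whatever client you use):
# prompt = with_html_request("Explain how quicksort works")
# response = call_llm(prompt)
# view_in_browser(save_html(response, Path(tempfile.mkdtemp())))
```

Saving to a temporary directory keeps generated pages out of your working tree. Since the file is model-generated HTML that may contain scripts, open only output you are comfortable running in your browser.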