New ICML 2026 paper challenges the Platonic Representation Hypothesis: model width and depth mechanically inflate similarity scores, creating a misleading global convergence trend that disappears once this bias is corrected.
What survives is local neighborhood alignment across image, text, and video models (similar things stay near each other even across very different architectures). They call it the Aristotelian Representation Hypothesis :)
So while two models may not share the same representation space in any strong global sense, they can still agree on local neighborhoods (what is similar to what). That is probably the part we actually use in retrieval, transfer, and multimodal systems, and the part that could be transferred between learned approximations of different models.
https://t.co/sNU3eRtmHd
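One way to make "agreeing on local neighborhoods" concrete is mutual k-nearest-neighbor overlap: embed the same items with two models and ask how many of each item's nearest neighbors coincide. The sketch below is an illustration under assumptions, not the paper's metric or its bias correction; the function names, the cosine similarity, and the synthetic data are all hypothetical.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors (by cosine similarity) for each row of X."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)  # a point is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]

def mutual_knn_overlap(A, B, k=10):
    """Mean fraction of shared k-nearest neighbors between two embeddings of the same items."""
    na, nb = knn_indices(A, k), knn_indices(B, k)
    shared = [len(set(a) & set(b)) for a, b in zip(na, nb)]
    return float(np.mean(shared)) / k

# Synthetic stand-ins for two models' embeddings of the same 200 items (assumption).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 32))
Q, _ = np.linalg.qr(rng.standard_normal((32, 32)))  # random orthogonal matrix
B = A @ Q                                  # rotated copy: new coordinates, same neighborhoods
C = rng.standard_normal((200, 48))         # unrelated embeddings with a different width

same = mutual_knn_overlap(A, B)
unrelated = mutual_knn_overlap(A, C)
print(f"rotated copy: {same:.2f}, unrelated: {unrelated:.2f}")
```

Because the measure only compares neighbor lists, the two embeddings can have different dimensionalities and need not share a coordinate system.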
"Transformers" by Daniel Jurafsky and James H. Martin is one of the clearest and most mathematically grounded introductions to the Transformer architecture I have ever read.
Chapter 8 introduces the Transformer as the standard architecture behind modern large language models. What makes this chapter particularly interesting is its step-by-step presentation of the underlying mechanisms: contextual embeddings, self-attention, query, key and value vectors, scaled dot-product attention, multi-head attention, residual streams, feedforward layers, layer normalization, masking, and the parallel matrix formulation of attention.
In particular, the treatment of attention as a weighted sum of contextual representations is especially valuable. The chapter first develops an intuitive, simplified view of attention and then gradually derives the full formulation using the Q, K, and V matrices. This approach makes it easier to understand what is actually happening inside the architecture from an algebraic and matrix-based perspective, rather than simply viewing the usual block diagrams.
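To make the matrix view concrete, here is a minimal NumPy sketch of single-head causal self-attention with the scaled dot-product form softmax(QK^T / sqrt(d_k)) V. The variable names, sizes, and random weights are illustrative assumptions, not the chapter's code or notation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention with a causal mask."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v        # project inputs to queries, keys, values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)             # (N, N) query-key similarities, scaled
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)  # hide positions to the right of each token
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ V, weights                 # output is a weighted sum of value vectors

# Toy sizes (assumptions): 4 tokens, model width 8, head width 4.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_q, W_k, W_v = (rng.standard_normal((8, 4)) for _ in range(3))
out, weights = causal_self_attention(X, W_q, W_k, W_v)
print(np.round(weights, 2))
```

The printed weight matrix is lower triangular: the first token can only attend to itself, so its output is exactly its own value vector.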
I think it is an excellent resource for anyone interested in understanding how Transformers work from linguistic, mathematical, and computational perspectives.
https://t.co/3fitdPy6Fv
We have been thinking about whether we can operationalize qualia as irreducible units of experience. I wonder whether this analogy also works for representations in AI, and whether it could help identify qualia in AI.
What are the Indecomposable Phenomenal Units of Qualia Structures? https://t.co/BTdkUSq4uN
You can better model brain data if you assume quantum-like entanglement.
New work from our centre indicates that the brain expresses the efficiency of quantum computation through classical mechanisms. The brain is a magnificent specimen because it operates on 20 W (a banana and some water) and yet generates a coherent, stable, adaptive, and conscious inner universe that can build rockets, computers, fall in love, and construct empires and religions.
And it does so against the backdrop of slow, wet, porous, and inexpensive bioelectric activity. Compare this to contemporary AIs, which are energy guzzlers and require massive data centres. The difference in energy use is likely 10,000x or more. Instead of looking to outer space for data centres, we should really be looking to the brain.
First, you model the brain as a network of coupled oscillators (commonly used for whole-brain models). If you wire these coupled oscillators up like the brain’s connectome, you get very interesting, very surprising, brain-like dynamics, such as criticality, metastability (via turbulence), etc. These stochastic dynamics are crucial for rapid information sharing and maintaining local and global integration. And when these dynamics are included in the model, it fits the brain like a glove.
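As a rough illustration of "coupled oscillators wired by a connectivity matrix", here is a noisy Kuramoto model integrated with a simple Euler-Maruyama step. This is a generic sketch, not the model from the paper: the choice of Kuramoto oscillators, the random coupling matrix standing in for a connectome, and every parameter value are assumptions.

```python
import numpy as np

def simulate_kuramoto(C, omega, coupling=1.0, noise=0.1, dt=0.01, steps=2000, seed=0):
    """Noisy Kuramoto phases on coupling matrix C; returns the order parameter R(t)."""
    rng = np.random.default_rng(seed)
    n = len(omega)
    theta = rng.uniform(0, 2 * np.pi, n)
    R = np.empty(steps)
    for t in range(steps):
        diff = theta[None, :] - theta[:, None]             # diff[i, j] = theta_j - theta_i
        drive = coupling * (C * np.sin(diff)).sum(axis=1)  # pull from connected oscillators
        kick = np.sqrt(dt) * noise * rng.standard_normal(n)
        theta = theta + dt * (omega + drive) + kick
        R[t] = np.abs(np.exp(1j * theta).mean())           # 0 = incoherent, 1 = fully synchronized
    return R

# Hypothetical stand-in for a connectome: random symmetric weights, no self-connections.
rng = np.random.default_rng(1)
n = 30
W = rng.uniform(size=(n, n))
C = (W + W.T) / (2 * n)
np.fill_diagonal(C, 0.0)
omega = rng.normal(1.0, 0.1, n)                            # natural frequencies (rad per time unit)

R = simulate_kuramoto(C, omega)
print(f"mean synchrony {R.mean():.2f}, std of synchrony {R.std():.2f}")
```

The standard deviation of R over time is a commonly used proxy for metastability; with a real structural connectivity matrix in place of C, the same loop applies unchanged.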
Interestingly, when you then include long-range exceptions to the exponential distance rule (common in mammalian brains), you get a spectral gap that separates the dominant modes from the noisy bulk. These dominant modes behave like coherent state-vectors and their interactions produce interference effects, i.e., quantum-like entanglement.
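The ingredients in this step can be sketched numerically: build a coupling matrix from an exponential distance rule, add a handful of strong long-range links, and inspect the leading eigenvalues of each matrix. This toy does not attempt to reproduce the paper's result; the node layout, decay rate, link count, and link weight are all made-up assumptions.

```python
import numpy as np

def edr_coupling(positions, decay=5.0):
    """Exponential distance rule: weight falls off as exp(-decay * distance)."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    C = np.exp(-decay * d)
    np.fill_diagonal(C, 0.0)
    return C, d

def add_long_range(C, d, n_links=20, weight=0.5, min_dist=0.5, seed=0):
    """Overwrite a few distant pairs with strong links (hypothetical exceptions to the rule)."""
    rng = np.random.default_rng(seed)
    C = C.copy()
    i, j = np.where(np.triu(d > min_dist, k=1))   # candidate pairs that are far apart
    pick = rng.choice(len(i), size=min(n_links, len(i)), replace=False)
    C[i[pick], j[pick]] = weight
    C[j[pick], i[pick]] = weight                  # keep the matrix symmetric
    return C

def sorted_spectrum(C):
    """Eigenvalues of a symmetric coupling matrix, largest first."""
    return np.linalg.eigvalsh(C)[::-1]

rng = np.random.default_rng(1)
pos = rng.uniform(size=(100, 2))                  # 100 nodes scattered in a unit square
C_edr, dist = edr_coupling(pos)
C_lr = add_long_range(C_edr, dist)

ev_edr, ev_lr = sorted_spectrum(C_edr), sorted_spectrum(C_lr)
print("EDR only, leading eigenvalues:  ", np.round(ev_edr[:5], 2))
print("with long-range links:          ", np.round(ev_lr[:5], 2))
print("gaps between leading eigenvalues:", np.round(-np.diff(ev_lr[:5]), 2))
```

Comparing the two printed spectra shows how the added links reshape the top of the spectrum in this toy setting; whether a clear gap separates a few dominant modes from the bulk depends on the parameters chosen.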
These interference effects may be one of the secrets to how the brain rapidly binds distributed information into unified, context-sensitive states. The paper also demonstrates that quantum-like (QL) entanglement provides the brain with a richer dynamical repertoire at lower energetic cost. Keep in mind that this “quantum-like” entanglement arises from the interference of coupled oscillators, but the functional end state is analogous in that you get the same mathematical advantages.
It’s super exciting and we have a lot more to share in coming months.
"a survey of views about infant consciousness": by @cpassosf and me, just published in @NeuroConsc. we surveyed participants (phil, neuro, psych, etc) at ASSC 28 and the NYU conference on infant consciousness.
notable results:
1. a majority (75%) favor the view that newborn infants are conscious.
2. a plurality (44%) favor late pre-natal (after 24 weeks gestation) emergence of consciousness.
3. a plurality (49%) favor late post-natal emergence (after six months old) of self-consciousness.
4. majorities (87%) favor newborns having sensory and affective consciousness, but not (~30%) cognitive or agentive consciousness.
plus questions on 5. the neural basis of infant consciousness, 6. methods, 7. animals and AI systems, 8. theories of consciousness.
https://t.co/PVbsSyOQX0
"Falsification and consciousness" by @JohannesKleiner and me is now out in @NeuroConsc! Within a formal framework we show that all major theories of consciousness (IIT, Global Workspace, etc.) suffer from deep problems of falsifiability. Thread below: https://t.co/weuyBExs6J
I also want to talk about intrinsic computational functionalism and canonical functionalism. But probably won't have enough time.
https://t.co/WwfrtVq0Tg
I'm going to give a talk at MC0001 today on our preprint on the stream of computation for AI consciousness. I have never given a talk on this, so I'm curious what people think of it.
https://t.co/5OlHK6yQsR
If you’re into predictive processing and meditation, this paper pushes the Overton window. From the quantum formulation of the free-energy principle, we show that an agent cannot define its own boundary from within. The realization of this irreducible indeterminacy is a principled definition of awakening. Ultimately, this extends to the separability of any object in experience, formalizing emptiness and engendering a “post-dual agent”.
Any persisting agent must minimize surprise by gathering evidence for its generative model. But all evidence available to the agent arrives through its boundary with the world. To prove that this boundary really separates “self” from “world”, the agent would need to step outside the boundary and measure the whole self-world relation. A finite agent cannot do this, just as a pair of scissors can't cut itself.
So the self-world boundary can be useful, predictive, and necessary for action, but it can never be known as an ontological fact from within. Meditation, on this view, progressively reveals the self-world split as a modelling prior rather than a structural feature of reality. This naturally shifts the weighting of self (inside boundary) and other (outside boundary), since both are seen to be inferences rather than grounded realities by virtue of an indefinable boundary. A more even-handed and compassionate orientation can arise.
A highly principled finger pointing at the moon!
I would recommend anything written by the great Yuri Manin, so today I suggest you read this fantastic paper (29 pages) titled 'The Notion of Dimension in Geometry and Algebra'.
In this one, Manin marries mathematics, physics, philosophy, history, and much more. It is a refreshing read for those used to dry formalism.
Yuri's view on mathematics deeply influenced mine, so give it a go. I bet you'll like it too.
Integrating Cognitive Load and Embodied Cognition Theories Through Representations as Multi-Scale Attractors
David C. Gibson, Mary Elizabeth Azukas, Meryem Yilmaz Soylu
https://t.co/GUSSnjsyeK [q-bio.NC]
Analog computation in the brain? Yes, says
Brass-for-Brains: Analog Computation for Cognition
https://t.co/iHP424BfzO
And we agree:
Analog Cognition and Consciousness
https://t.co/0q0dnWSxks
#neuroscience
Platonic Representations in the Human Brain: Unsupervised Recovery of Universal Geometry
Pablo Marcos-Manchón, Rishi Jha, Lluís Fuentemilla
https://t.co/aJmfqNFBSM [q-bio.NC cs.CV]
💬Code: https://t.co/aZeWr847sD
Artificial ≠ Intelligence.
Henceforth, AI = Apparent Intelligence.
Three Fallacies in AI Consciousness Research: https://t.co/JkxzU8POV0
Short form: https://t.co/ogus5jaOgN
What are best practices for running Claude Code at scale?
New blog post on what we've learned from teams running it across multi-million-line monorepos, decades-old legacy systems, and distributed microservices:
https://t.co/rJUYlIUiTT