Ion Vonica

@IonVonica

Interested in ML, ML Interpretability, LLMs, Active Inference, Robotics, Artificial Consciousness, Theories of Mind, Physics, Cosmology and AGI

Bucharest, Romania

Joined May 2019

1.3K Following

107 Followers

428 Posts

IonVonica retweeted

Sasha Malysheva

@aimalysheva

1 day ago

New ICML 2026 paper challenges the Platonic Representation Hypothesis: model width and depth mechanically inflate similarity scores, creating a misleading global convergence trend that disappears once that bias is corrected.

What survives is local neighborhood alignment across image, text, and video models (similar things stay near each other even across very different architectures). They call it the Aristotelian Representation Hypothesis :)

So while two models may not share the same representation space in any strong global sense, they can still agree on local neighborhoods (what is similar to what). That is probably the part we actually use in retrieval, transfer, and multimodal systems, and which could be transferred between learned approximations of different models.

https://t.co/sNU3eRtmHd
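To make "agreeing on local neighborhoods" concrete, here is a minimal sketch of one way such agreement could be measured: the average overlap between each item's k nearest neighbors in two embedding spaces. The function names, the choice of cosine similarity, and the toy vectors are all assumptions for illustration; the post does not say which metric or correction the paper actually uses.

```python
import math

def knn_indices(vectors, k):
    """For each vector, return the set of indices of its k nearest
    neighbors under cosine similarity (the vector itself excluded)."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    neighbors = []
    for i, v in enumerate(vectors):
        scored = [(cosine(v, w), j) for j, w in enumerate(vectors) if j != i]
        # Highest similarity first; ties broken by index for determinism.
        scored.sort(key=lambda t: (-t[0], t[1]))
        neighbors.append({j for _, j in scored[:k]})
    return neighbors

def mutual_knn_overlap(emb_a, emb_b, k):
    """Average fraction of shared k-nearest neighbors between two
    embeddings of the same items (row i describes item i in both)."""
    nn_a = knn_indices(emb_a, k)
    nn_b = knn_indices(emb_b, k)
    shared = sum(len(a & b) for a, b in zip(nn_a, nn_b))
    return shared / (k * len(emb_a))

# Hypothetical toy data: six items embedded by "model A".
emb_a = [(1, 0), (1, 0.1), (0, 1), (0.1, 1), (-1, 0), (-1, -0.1)]
# "Model B": the same geometry, rotated by 90 degrees and rescaled.
emb_b = [(-3 * y, 3 * x) for x, y in emb_a]
# "Model C": the same points assigned to different items.
emb_c = [(1, 0), (0, 1), (1, 0.1), (0.1, 1), (-1, 0), (-1, -0.1)]

print(mutual_knn_overlap(emb_a, emb_b, k=1))
print(mutual_knn_overlap(emb_a, emb_c, k=1))
```

In this toy setup the rotated and rescaled copy scores 1.0 even though its coordinates differ from the original, while the reassigned copy scores lower because only some items keep their neighbors.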

731

112

605

59K

IonVonica retweeted

Antonio Lupetti

@antoniolupetti

2 days ago

"Transformers" by Daniel Jurafsky and James H. Martin is one of the clearest and most mathematically grounded introductions to the Transformer architecture I have ever read.

Chapter 8 introduces the Transformer as the standard architecture behind modern large language models. What makes this chapter particularly interesting is its step-by-step presentation of the underlying mechanisms: contextual embeddings, self-attention, query, key and value vectors, scaled dot-product attention, multi-head attention, residual streams, feedforward layers, layer normalization, masking, and the parallel matrix formulation of attention.

In particular, the treatment of attention as a weighted sum of contextual representations is especially valuable. The chapter first develops an intuitive, simplified view of attention and then gradually derives the full formulation using the Q, K, and V matrices. This approach makes it easier to understand what is actually happening inside the architecture from an algebraic and matrix-based perspective, rather than simply viewing the usual block diagrams.

I think it is an excellent resource for anyone interested in understanding how Transformers work from linguistic, mathematical, and computational perspectives.

https://t.co/3fitdPy6Fv

348

220K
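For readers who want to see the "attention as a weighted sum" idea from the Jurafsky and Martin post in code, the following is a minimal single-head sketch of scaled dot-product attention with an optional causal mask. It is not taken from the chapter: the function names, the plain-list representation, and the toy inputs are assumptions, and real implementations use batched matrix operations and learned projections to produce Q, K, and V.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V, causal=True):
    """Single-head scaled dot-product attention on lists of row vectors.

    Each output row is a weighted sum of the rows of V, with weights
    softmax(q . k / sqrt(d_k)). With causal=True, position i attends
    only to positions 0..i (this assumes Q and K have the same length).
    """
    d_k = len(K[0])
    outputs = []
    for i, q in enumerate(Q):
        visible = i + 1 if causal else len(K)
        scores = [
            sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d_k)
            for j in range(visible)
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * V[j][c] for j, w in enumerate(weights))
            for c in range(len(V[0]))
        ])
    return outputs

# Hypothetical toy inputs: three positions, vectors of dimension two.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

for row in attention(Q, K, V):
    print(row)
```

With the causal mask on, the first position can only see itself, so its output is exactly the first row of V; later positions blend the value rows they are allowed to see.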

IonVonica retweeted

Anil Seth @anilkseth

4 days ago

Timescapes of Animal Experience: new paper in @TrendsCognSci, led by Ishan Singhal @ishan276 w/ @birchlse & me. Open access link: https://t.co/whg0sW3boP .@SussexUni .@SussexCentre

179

17K

IonVonica retweeted

Ryota Kanai

@kanair

4 days ago

https://t.co/mvUWSvgPcO

12K

IonVonica retweeted

Michael Levin

@drmichaellevin

14 days ago

@jessie_thinker Yep, that's what Chris Fields and I said in https://t.co/toq8YMM5iO

266

131

IonVonica retweeted

Ryota Kanai

@kanair

16 days ago

We have been thinking about whether we can operationalize qualia as irreducible units of experience. I wonder if this analogy also works for representations in AI and could help identify qualia in AI.

What are the Indecomposable Phenomenal Units of Qualia Structures? https://t.co/BTdkUSq4uN

IonVonica retweeted

Ruben Laukkonen

@RubenLaukkonen

18 days ago

You can better model brain data if you assume quantum-like entanglement.

New work from our centre indicates that the brain expresses the efficiency of quantum computation through classical mechanisms. The brain is a magnificent specimen because it operates on 20 W (or a banana and some water) and yet generates a coherent, stable, adaptive, and conscious inner universe that can build rockets, computers, fall in love, and construct empires and religions.

And it does so against the backdrop of slow, wet, porous, and inexpensive bioelectric activity. Compare this to contemporary AIs, which are energy guzzlers and require massive data centres. The difference is likely 10,000x or more. Instead of looking interstellar for data centres, we should really be looking to the brain.

First, you model the brain as a network of coupled oscillators (commonly used for whole-brain models). If you wire these coupled oscillators up like the brain’s connectome you get very interesting, very surprising, brain-like dynamics, such as criticality, metastability (via turbulence), etc. These stochastic dynamics are crucial for rapid information sharing and maintaining local and global integration. And when these dynamics are included in the model, it fits the brain like a glove.

Interestingly, when you then include long-range exceptions to the exponential distance rule (common in mammalian brains), you get a spectral gap that separates the dominant modes from the noisy bulk. These dominant modes behave like coherent state-vectors and their interactions produce interference effects, i.e., quantum-like entanglement.

These interference effects may be one of the secrets to how the brain rapidly binds distributed information into unified, context-sensitive states. The paper also demonstrates that QL entanglement provides the brain with a richer dynamical repertoire at lower energetic cost. Keep in mind that this “quantum-like” entanglement arises from the interference of coupled oscillators, but the functional end state is analogous in that you get the same mathematical advantages.

It’s super exciting and we have a lot more to share in coming months.
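As a rough illustration of what "a network of coupled oscillators" wired by a connectivity matrix means, here is a minimal sketch using the Kuramoto phase model. This is a stand-in chosen for brevity: the post does not say which oscillator model the paper uses, and the 4-node coupling matrix, frequencies, and step size below are hypothetical. The sketch shows only basic synchronization, not the spectral gap or the quantum-like effects described above.

```python
import math

def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]; r near 1 means the
    oscillators are phase-synchronized."""
    n = len(phases)
    re = sum(math.cos(p) for p in phases) / n
    im = sum(math.sin(p) for p in phases) / n
    return math.hypot(re, im)

def simulate_kuramoto(phases, freqs, coupling, k=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of
    dtheta_i/dt = omega_i + k * sum_j C[i][j] * sin(theta_j - theta_i),
    where C plays the role of a (toy) connectome."""
    phases = list(phases)
    n = len(phases)
    for _ in range(steps):
        derivs = [
            freqs[i] + k * sum(
                coupling[i][j] * math.sin(phases[j] - phases[i])
                for j in range(n)
            )
            for i in range(n)
        ]
        phases = [p + dt * d for p, d in zip(phases, derivs)]
    return phases

# Hypothetical toy "connectome": four nodes, uniform all-to-all weights.
coupling = [[0.0 if i == j else 0.25 for j in range(4)] for i in range(4)]
start = [0.0, 0.5, 1.0, 1.5]      # initial phases in radians
freqs = [1.0, 1.0, 1.0, 1.0]      # identical natural frequencies

end = simulate_kuramoto(start, freqs, coupling)
print(order_parameter(start), order_parameter(end))
```

Swapping the uniform matrix for weights derived from structural connectivity data is what "wiring the oscillators up like the connectome" would amount to in a sketch like this one.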

358

304

21K

IonVonica retweeted

David Chalmers

@davidchalmers42

19 days ago

"a survey of views about infant consciousness": by @cpassosf and me, just published in @NeuroConsc. we surveyed participants (phil, neuro, psych, etc) at ASSC 28 and the NYU conference on infant consciousness. notable results:

1. a majority (75%) favor the view that newborn infants are conscious.

2. a plurality (44%) favor late pre-natal (after 24 weeks gestation) emergence of consciousness.

3. a plurality (49%) favor late post-natal emergence (after six months old) of self-consciousness.

4. majorities (87%) favor newborns having sensory and affective consciousness, but not (~30%) cognitive or agentive consciousness.

plus questions on 5. the neural basis of infant consciousness, 6. methods, 7. animals and AI systems, 8. theories of consciousness.

https://t.co/PVbsSyOQX0

120

12K

IonVonica retweeted

Erik Hoel

@erikphoel

about 5 years ago

"Falsification and consciousness" by @JohannesKleiner and myself is now out in @NeuroConsc! Within a formal framework we show that all major theories of consciousness suffer from deep problems of falsifiability (IIT, Global Workspace, etc). Thread below: https://t.co/weuyBExs6J

149

IonVonica retweeted

Valerio Capraro

@ValerioCapraro

22 days ago

Paper here: https://t.co/OgxeZ6ZIFu

IonVonica retweeted

Ryota Kanai

@kanair

22 days ago

I also want to talk about intrinsic computational functionalism and canonical functionalism. But probably won't have enough time. https://t.co/WwfrtVq0Tg

734

IonVonica retweeted

Ryota Kanai

@kanair

22 days ago

I'm going to give a talk at MC0001 today on our preprint of the stream of computation for AI consciousness. I have never given a talk on this, so I'm curious what people think of it. https://t.co/5OlHK6yQsR

IonVonica retweeted

Ruben Laukkonen

@RubenLaukkonen

25 days ago

If you’re into predictive processing and meditation, this paper pushes the Overton window. From the quantum formulation of the free-energy principle, we show that an agent cannot define its own boundary from within. The realization of this irreducible indeterminacy is a principled definition of awakening. Ultimately, this extends to the separability of any object in experience, formalizing emptiness and engendering a “post-dual agent”.

Any persisting agent must minimize surprise by gathering evidence for its generative model. But all evidence available to the agent arrives through its boundary with the world. To prove that this boundary really separates “self” from “world”, the agent would need to step outside the boundary and measure the whole self-world relation. A finite agent cannot do this, as a scissor can't cut itself.

So the self-world boundary can be useful, predictive, and necessary for action, but it can never be known as an ontological fact from within. Meditation, on this view, progressively reveals the self-world split as a modelling prior rather than a structural feature of reality. This naturally shifts the weighting of self (inside boundary) and other (outside boundary), since both are seen to be inferences rather than grounded realities by virtue of an indefinable boundary. A more even-handed and compassionate orientation can arise.

A highly principled finger pointing at the moon!

134

89K

IonVonica retweeted

Didier 'Dirac's ghost' Gaulin

@DiracGhost

27 days ago

I would recommend anything written by the great Yuri Manin, so today I suggest you read this fantastic paper (29 pages) titled 'The Notion of Dimension in Geometry and Algebra'.

In this one, Manin marries mathematics, physics, philosophy, history and much more: a refreshing read for those used to dry formalism.

Yuri's view on mathematics deeply influenced mine, so give it a go; I bet you'll like it too.

586

568

28K

IonVonica retweeted

Quantitative Biology (Neurons and Cognition) @BioPapers

26 days ago

Integrating Cognitive Load and Embodied Cognition Theories Through Representations as Multi-Scale Attractors

David C. Gibson, Mary Elizabeth Azukas, Meryem Yilmaz Soylu

https://t.co/GUSSnjsyeK [q-bio.NC]

962

IonVonica retweeted

Earl K. Miller @MillerLabMIT

30 days ago

Analog computation in the brain? Yes, says Brass-for-Brains: Analog Computation for Cognition https://t.co/iHP424BfzO And we agree: Analog Cognition and Consciousness https://t.co/0q0dnWSxks #neuroscience

IonVonica retweeted

Xzistor LAB @xzistor

29 days ago

Important: Here is a short explanation of why the Xzistor brain model can provide an algorithm for consciousness. https://t.co/QgIeKfvoG9

773

IonVonica retweeted

Quantitative Biology (Neurons and Cognition) @BioPapers

about 1 month ago

Platonic Representations in the Human Brain: Unsupervised Recovery of Universal Geometry

Pablo Marcos-Manchón, Rishi Jha, Lluís Fuentemilla

https://t.co/aJmfqNFBSM [q-bio.NC cs.CV]

💬 Code: https://t.co/aZeWr847sD

IonVonica retweeted

Paul Barnes

@paulwbarnes

about 1 month ago

Artificial ≠ Intelligence. Henceforth, AI = Apparent Intelligence. Three Fallacies in AI Consciousness Research: https://t.co/JkxzU8POV0 Short form: https://t.co/ogus5jaOgN

IonVonica retweeted

ClaudeDevs

@ClaudeDevs

about 1 month ago

What are best practices for running Claude Code at scale? New blog post on what we've learned from teams running it across multi-million-line monorepos, decades-old legacy systems, and distributed microservices: https://t.co/rJUYlIUiTT

129

457

736K
