I thoroughly enjoyed reading this recent paper by @yasamanbb et al (https://t.co/nU3X6KW3pT) that derives analytically why certain latent variables must lead to geometry in word embeddings. (getting Fourier modes even with open boundary but exponential kernel is neat!) I think it would be great to compare this to some of @prfsanjeevarora et al's work on this (eg https://t.co/UpK9DTEC03)
More broadly, I have been thinking about the right data generating process for language. For vision, we have latent spaces with great manifold structure (eg the SO(3) pose of an object) and nonlinear mixing functions. But for language? Are there really any continuous latent variables? What is the "dSprites" of language? Is it all just co-occurrence stats or is there something more in LLM word embeddings?
@KrzakalaF ah, lovely! I quite like this "posit a simpler dynamics and study it" sort of approach. very physicsy! I like Misha Belkin's NFA/RFM stuff (which this reminds me of) for the same reason.
possible path towards an answer to our Open Dir 1??
https://t.co/ELfes2pYFG
did you know that with a few modifications, you can get the Ising model to simulate cells fighting to the death? one of my favorite side projects of all time:
https://t.co/EkuplsbV8o
Mechanistic interpretability aspires to be the biology of deep learning. @KuninDaniel and @learning_mech say that an emerging theory of deep learning they and their team call 🛠️ learning mechanics 🛠️ will be the physics.
for the bright-eyed and bushy-tailed: there's a Learning Mechanics discord! young academics who want to do research in this area should especially consider joining + starting convos.
https://t.co/kAuVlMQzrd
Excited to share that our paper “Sequential Group Composition: A Window into the Mechanics of Deep Learning” was accepted to ICML 2026 in Seoul!
Co-led with @giovannimarchet
and @AdeleMyersPhD @hopfbifurcator @ninamiolane
Paper: https://t.co/8HsLrKWtlf
ditto. props to @justanotherlaw for taking a second look :) (though I also found merit in the criticisms in the first version.) hopeful we can eventually (hopefully soon enough...) make contact w/ AI alignment + governance, whose noble causes we would v much like to aid.
This is a great post and I especially respect the author for updating his view when presented with new information. I strongly encourage young researchers interested in interpretability, science of DL, and safety to look at it.
https://t.co/dbJ5K49GTH
It seems we're at a stage where deep learning is evolving from alchemy into an engineering discipline; this exciting paper lays out the case that a scientific theory of deep learning is emerging.
Paper: https://t.co/hf8QRIgw3P
Tweet: https://t.co/9v6cgCyEeX
yeah, totally!
I once messaged everyone on facebook with my first and last name. I eventually made a big group chat! v ethnically + geographically diverse. probs the closest I've gotten as an adult to meeting a truly random slice of the US.
Aren’t diffusion models explicitly derived from a correspondence with physics, and entirely consistent with how physics says you should model systems over a range of scales (ie Mori-Zwanzig theory: Langevin dynamics with a fitted vector field)? What more do you want?