We introduce epiplexity, a new measure of information that provides a foundation for how to select, generate, or transform data for learning systems. We have been working on this for almost 2 years, and I cannot contain my excitement! 1/7
📝 New paper! Two strategies have emerged for controlling LLM behavior at inference time: in-context learning (ICL; i.e. prompting) and activation steering. We propose that both can be understood as altering model beliefs, formally in the sense of Bayesian belief updating.
1/9
1/12 How do animals build an internal map of the world? In our new paper, we tracked thousands of neurons in mouse CA1 over days/weeks as the mice learned a VR navigation task. @nspruston @HHMIJanelia, w/ co-1st author @JohanWinn
Video summary: https://t.co/mMvsT8TVc9
Paper: https://t.co/35k5tWdvOe
🗣️Language is widely distributed throughout the brain🧠
In a recent correspondence in @NatRevNeurosci, we suggest that there is no 'language network' in the brain. What appears as such is an inevitable illusion created in part by the methods we use.
https://t.co/TCyEQCKcLJ
@gershbrain George Ellis has some of the best writing on this, but I'm still left pretty skeptical about the reality of "strong" emergence (though quasiparticles are a cool example). On the other hand, "weak" emergence is relatively common, but only speaks to computational limits of modeling
@StuartHameroff would be good to replicate/extend this. measuring this in vivo, particularly during learning, would be a game changer https://t.co/2oFNjixtA3
Happy to share a new preprint - the culmination of the past three years of postdoc with @tyrell_turing and @apeyrache, and my first real foray into #NeuroAI as a tool to study the sleeping brain:
https://t.co/GOWTqj6Cif
(1/🧵) Here we go!
@quiltydunn @ElliotMurphy91 it’s not unreasonable, but this doesn’t offer a concrete reason why they supposedly won’t (while they are so flexible they in fact could). Transformers learn all sorts of arbitrary structures/rules. why are these structures not (in principle) derivable from language statistics?
@ElliotMurphy91 This allows tracking polysemy and syntactic structure in a way that systems like GloVe or word2vec could not. They don't learn a single static similarity space for individual words, but adapt representations based on use of words in context, factoring in structural features 2/2
@ElliotMurphy91 I'm not sure the critique that "there is much more to meaning (e.g., complex polysemy) than what can be read off from matrix multiplications and co-occurrence statistics" is a fair characterization of the context-dependent processing present in Transformer-based LLMs. 1/2
We developed a biologically plausible model of neuron-astrocyte networks, which in a certain limit describes a Dense Associative Memory with large storage capacity. This correspondence raises the intriguing possibility that memories might be stored, at least partially, in the network of an astrocyte’s processes (inside the astrocyte), as opposed to the more conventional idea of storing memories just in synapses. #cosyne24 #cosyne2024
Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
How does human memory search work? The hippocampus and cerebellum are key for cognitive foraging. New paper w/ @NancyBLundin, @swazi65, more https://t.co/WgI3IvvWge
In our new paper at Cerebral Cortex, we find that the brain may connect related experiences by reactivating semantic knowledge in medial prefrontal cortex and integrating knowledge about both experiences in hippocampus. 🧵 https://t.co/W1Ceesfc6S @NealWMorton
Our new paper at Cerebral Cortex finds that representations of space in hippocampus and ventromedial prefrontal cortex (vmPFC) bias how we make decisions in time, demonstrating our ability to generalize knowledge across cognitive domains. 1/6
https://t.co/wOfvhk1ng7