Saranraj Nambusubramaniyan @memoryencoder - Twitter Profile

about 1 month ago

Your drifting model is secretly a fixed point for the Wasserstein gradient flow on...
...the KL?
...an approximation to the Sinkhorn?
...Is it even a Wasserstein gradient flow at all?

https://t.co/QJLh86Hi0d

@liwenliang @agalashov @JamesTThorn @ValentinDeBort1 @ArnaudDoucet1

[Photo from ArthurGretton's tweet: https://t.co/6O1HAYXXKY]

2

442

82

358

64K

memoryencoder retweeted

Marco Fumero @marco_fumero

about 1 year ago

Inductive biases in the training process cause the formation of attractors in the latent vector field, characterizing memorization and generalization regimes of the network.

[Photo: https://t.co/2ko94QLhyn]

1

15

4

3

3K

memoryencoder retweeted

Stat.ML Papers @StatMLPapers

2 months ago

Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models https://t.co/S0QaAgh5tT

0

368

52

351

24K

memoryencoder retweeted

CLaE

@leafs_s

2 months ago

The Brain May Have Been Critical Before It Became Predictive: Self-Organized Criticality and the Physical Basis of Action-Readiness https://t.co/8PP8WguoKS

0

94

24

65

9K

memoryencoder retweeted

SIAM Activity Group on Dynamical Systems @DynamicsSIAM

3 months ago

Tutorial article: "Energy-Based Dynamical Models for Neurocomputation, Learning, and Optimization" (by Arthur N. Montanari, Francesco Bullo, Dmitry Krotov, Adilson E. Motter): https://t.co/C8sieU71Gp

2

74

17

53

4K

memoryencoder retweeted

Christos Constantinidis @c_constan

3 months ago

New preprint from the lab! Prefrontal Mechanisms of Rule Learning https://t.co/rfPJZ17sm8 (1/10)

1

39

14

20

4K

memoryencoder retweeted

Ido Amos @AmosaurusRex

4 months ago

Can LLMs reason internally while processing their inputs, similar to how humans can think ahead as we process information? Our latest work introduces Thinking States, a novel architectural adaptation that transforms reasoning into an internal recurrent process. By training models to maintain a dynamic thinking state, we achieve significant inference speedups over Chain-of-Thought while substantially outperforming existing latent reasoning methods. Paper: https://t.co/nXJ9szfbrT

5

132

27

105

13K

memoryencoder retweeted

Carlos E. Perez

@IntuitMachine

3 months ago

With the emergence of AI, we need to change the question from "what is intelligence?" to "what is agency?" Here is my model of agency:

5

35

10

40

3K

memoryencoder retweeted

Simone Foti @simo_foti

3 months ago

It's time to bring 3D meshes into modern machine learning properly! 🛸 Our work solves the non-differentiability of the Exp map on meshes, enabling gradients to flow directly through geodesics. It’s differentiable, GPU-fast, and fully parallelised. https://t.co/2G5Q7zrck3

7

684

77

609

69K

memoryencoder retweeted

Anna Ciaunica PhD @annaciaunica.bsky.social

@AnnaCiaunica

about 1 year ago

Cooked a new paper 😎 The No Body Problem - Intelligence and Selfhood in Biological and Artificial Systems https://t.co/LU8XBVjOXD

[Photo: https://t.co/bTNC6ad2Ya]

15

167

35

94

13K

memoryencoder retweeted

Aran Nayebi

@aran_nayebi

4 months ago

1/ As AI agents become increasingly capable, what must *inevitably* emerge inside them? We prove selection theorems: strong task performance forces world models, belief-like memory and—under task mixtures—persistent variables resembling core primitives associated with emotion.

3

217

38

168

42K

memoryencoder retweeted

The Brain Prize

@BrainPrize

4 months ago

Meet the winners of The Brain Prize 2026. The announcement film introduces Professors David Ginty and Patrik Ernfors, awarded the prize for their groundbreaking discoveries on the neuroscience of touch and pain. Learn more about the winners: https://t.co/eTGOMwfmqO

1

34

13

5

2K

memoryencoder retweeted

PNASNews @PNASNews

5 months ago

One of the most-viewed PNAS articles in the last week is “Quantifying the compressibility of the human brain.” Explore the article here: https://t.co/HsWpARBhH4 For more trending articles, visit https://t.co/l4XOkRLLmY.

[Photo: https://t.co/2EAIhwNGTe]

4

238

36

173

48K

memoryencoder retweeted

Learning Systems and Robotics Lab (is hiring!) @learnsyslab

4 months ago

Our work, "A Primer on SO(3) Action Representations in Deep Reinforcement Learning," was accepted to #ICLR2026! We provide a systematic study of action representation choices in RL, showing that they fundamentally impact training stability and performance. #Robotics #AI #RL

3

470

53

351

50K

memoryencoder retweeted

elvis

@omarsar0

4 months ago

First empirical study on how developers are actually writing AI context files across open-source projects.

Researchers scanned 10,000 repositories and found only 466 (5%) have adopted AI configuration files like AGENTS dot md, CLAUDE dot md, or Copilot instructions.

Why does it matter?

Of the 155 AGENTS dot md files analyzed, 50% were never modified after the initial commit. Only 6% had 10 or more revisions.

The most common content in these files was conventions, contribution guidelines, and architecture overviews.

But there is no standard structure, wide variation in what teams encode, and most files are written once and left to decay.

The conventions for this new form of documentation are still in flux.

Paper: https://t.co/YkSayPUesC

Learn to build effective AI agents in our academy: https://t.co/U0ZuNA084v

42

332

62

394

55K

memoryencoder retweeted

Zhuoran Yang

@zhuoran_yang

4 months ago

New Paper -- "On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking"

We give a complete mechanistic and dynamic picture of how neural networks learn modular addition f(x,y) = (x+y) mod p. We answer three questions:
(1) What does the trained network compute?
(2) How do Fourier features emerge during training?
(3) Why does grokking happen?
Each answer comes with a mathematical characterization backed by theory and experiments.

Paper: https://t.co/uCU4eNr7tH
Blog: https://t.co/Ok9dDQcYvw
Demo: https://t.co/SX4Q6IsLHB
Code: https://t.co/03Adj6E3Bt

7

327

52

257

38K

memoryencoder retweeted

Jason Stock

@itsstock

5 months ago

Generative Models via Drifting in MLX. A minimal single file implementation showing how we evolve a pushforward distribution with a drifting field during training for single-step inference. No ODE/SDE solvers, adversarial loss, or complex nets. code: https://t.co/7eea96wToy

[Photo: https://t.co/5hgETiBfs4]

14

327

36

207

23K

memoryencoder retweeted

Rosinality @rosinality

5 months ago

JiT + MeanFlow!

2

371

44

254

19K

memoryencoder retweeted

george

@georgeyw_

5 months ago

“Neural networks are grown, not programmed” We’re changing that. Mechinterp investigates how models generalize beyond their training data by studying the resulting internal structure. We introduce patterning as the dual: given desired structure, determine what data produces it.

[Photo: https://t.co/DMjvVrtgpN]

19

1K

157

957

92K

memoryencoder retweeted

Andrew Lampinen @AndrewLampinen

5 months ago

Is it better to go to academia or industry for research in AI or cognitive science? It's the most frequent question I get asked by PhD students, and I've finally written up some of my thoughts on the answers, as an epilogue to my more research-focused posts on cog sci and AI:

[Photo: https://t.co/qoIqBKGqX6]

7

436

46

419

38K
