Sheetal Jadhav

@_curiousneuron

One heck of a starry-eyed hooman studying the brain

Mumbai, India

Joined August 2015

502 Following

523 Followers

527 Posts

Pinned Tweet

Sheetal Jadhav @_curiousneuron

2 months ago

A little late, but I'm moving over to Bluesky. If you wish to connect, catch me at @sheetal.jadhav.bsky.social https://t.co/2g6qCFo2FK

_curiousneuron retweeted

Dileep George

@dileeplearning

about 1 month ago

Don’t listen to the skeptics and naysayers. If you are not using LLM coding agents you are missing out. Ofc they won’t work on everything and you need to be careful, but work is a lot more fun with coding agents.

_curiousneuron retweeted

HHMI

@hhmi_science

20 days ago

Janelia is built for this moment. Starting today, we're taking on 2 big bets: cracking how a vertebrate brain generates behavior & building AI-in-the-Loop — a new way of doing biology. Tiny transparent fish + 1 of neuroscience’s biggest questions? Let's go https://t.co/lqq1SoZCFZ

12K

_curiousneuron retweeted

Oded Rechavi

@OdedRechavi

about 1 month ago

Most experiments fail, and negative results rarely get published. This means LLMs are unaware of the outcomes of most experiments.

150

747

712

584K

_curiousneuron retweeted

Dileep George

@dileeplearning

about 1 month ago

I can believe this. You really need to be careful when using LLMs. Those who believe hallucination is a solved problem are on hallucinogens or aren’t discerning enough.

Sheetal Jadhav @_curiousneuron

about 1 month ago

About time 🧠👾👩🏻‍💻

Kenneth D Harris @kennethd_harris

about 2 months ago

Introducing the International Brain Lab AI Agent: an experimental tool that helps researchers analyze neural activity across the mouse brain using AI coding agents. Please try it — we would love your feedback! https://t.co/uy0PRGmSuP

120

18K

_curiousneuron retweeted

Ash Jogalekar

@curiouswavefn

about 2 months ago

I've always loved the pragmatic positivity that Jensen brings to discussions about AI. Very different from the gloom-and-doom from many AI leaders that's turning the public against the technology. We need more AI ambassadors like Jensen.

_curiousneuron retweeted

Adrienne Fairhall @alfairhall

about 2 months ago

In systems neuroscience, this is simply not the case. Paradigms are challenged and updated all the time.

15K

_curiousneuron retweeted

Konstantin Willeke @ CVPR @KonstantinWille

2 months ago

🧠 Introducing OmniMouse: one of the largest datasets in neuroscience ever assembled, along with a systematic study of the scaling properties of brain models. Co-led with 🤩 @pollytur1 @alexrgil14. 3M neurons, >150B tokens from @AToliasLab @stanford, @alxecker @sinzlab @uniGoettingen 🧵

192

114

19K

_curiousneuron retweeted

Micah G. Allen

@micahgallen

3 months ago

The geometry of neural dynamics along the cortical attractor landscape directly reflects changes in attention, as large-scale brain activity shifts across its hills and valleys depending on the state. https://t.co/CG8zH3kx1V

172

125

26K

_curiousneuron retweeted

Doris Tsao

@doristsao

3 months ago

This is the strongest ephys evidence so far for a generative model in the brain that I know of. Congratulations @WadiaVarun! Wonderful collaboration with @UeliRutishauser on science that could only be done in humans. And please check out Fig. 5FG. This is new since biorxiv and really surprised me: the mean response to imagery and viewing is actually the same & there are many cells that respond only during imagery--challenging the idea that signal strength is what distinguishes reality from imagination.

220

148

42K

_curiousneuron retweeted

Subbarao Kambhampati (కంభంపాటి సుబ్బారావు)

@rao2z

4 months ago

World Models: The old, the new and the wishful #SundayHarangue

There is a lot of chatter about world models of late--even more than can be explained by Yann betting his entire new enterprise on it. I was going to comment on this clamor in my class this week, and thought I would preview it here first.. 😋

World Models are of course by no means new--whether learned or provided, they have been the backbone of decision making problems--be it control theory or #AI--for nearly a century. Russell & Norvig's Intro to AI text book *starts* with world model as an integral part of an agent architecture (see below).

A fortuitous by-product of the focus on world models is the crash course that post-#alexnet #ML young'uns may be getting in core #AI concepts: how hierarchical models of the world and mental simulation at differing abstractions help with long range planning.

Because the current world model craze has generally been ahistoric, it confounds multiple things, IMHO.

Resolution vs. Abstraction: Perhaps the most important confusion concerns their intended purpose.

Are they meant to "construct" believable synthetic worlds--thus requiring CGI-level high fidelity?

Or are they meant to help the agent efficiently mentally simulate the evolution of its world--conditioned on its own and other agents' actions--to support long range planning and decision making?

A large part of the current work on world models--especially that based purely on video and sensory data--seems to conflate the two.

While it may seem that having a high fidelity world building model should also help in long range decision making, it is quite likely that the computational tradeoffs between hi-res and abstraction make such models of questionable use for long range planning. Faster roll out (mental simulation) and higher resolution are quite often at loggerheads.

Disjunction and Abstraction: Having mentioned "hierarchy" and "abstraction" multiple times, I feel it is worth pointing out that at its core abstraction is a form of disjunction. An agent reasoning with abstract models is basically reasoning over a disjunction of many distinct concrete futures--that are all roughly equivalent from the point of view of the goals of the agent. The connection between disjunction and abstraction is a powerful one that is not often acknowledged. An abstract action is a disjunction over concrete courses of action--thus leading to a disjunction of world states. A learned latent variable has similar disjunction semantics. For example, in a transformer-like architecture, a latent variable can be seen as a distribution over concrete tokens.

Role of language and Symbolic abstractions: While in theory it is possible to learn world models with hierarchical abstraction (e.g. with latent variable models), ignoring the linguistic data--which is after all the cornerstone of human civilization--fails to leverage the abstractions we humans have developed over the millennia.

Planning, of the kind I am fond of, is possible because the models are at a significantly higher level of abstraction than pixels, or even than anything learned latent variable models can provide in the near future.

While the planning models of yore were written by humans, there is a way of avoiding that bottleneck. Our linguistic data already sort of captures humanity's abstractions over video data--or what I like to call "space time signal tubes" (c.f. https://t.co/77YAXUX31y & https://t.co/sUDSOnXDhB ).

So, as much as I agree with the argument that language may not by itself lead to effective world models, I equally believe that getting to the right level of abstraction from pixel stream data--while theoretically possible (in that we, humanity, and evolution seem to have done it)--is going to be awfully slow, especially when the human abstractions, however imperfect, are readily available in the language data. A powerful way forward, it seems to me, is to have these symbolic and pixel level WMs complement each other.

The tradeoff is between "important parts only, but can do long range prediction" and "full resolution, but not long enough range". Humans seem to use language vs. visual priors for these two, which argues for an approach that uses both types of data in learning world models.

Internal Abstractions and Alignment Problem: Even if efficiency is not an issue, another critical concern about learning purely from sensory data is aligning the agents using those models to humans. There is no a priori reason that the abstractions learned internally from the sensory data by an agent would have any natural correspondence to those that humans use. To the extent we want artificial agents with learned world models to be easily aligned to us humans, taking the inductive biases present in the linguistic data seems like a smarter move (c.f. https://t.co/ebLAPHFguI).

LLMs and Symbolic World Models: While there is a lot of evidence that LLMs may not be directly encoding (symbolic) world models, it has also been known that we can learn such symbolic models from LLMs. Indeed, one of our earliest works on the role of LLMs in Planning was to extract symbolic planning models from them (c.f. https://t.co/v9ZLAi7IgO). There has been significant additional work since then--with some of it trying to combine sensory and linguistic data in learning world models.

Verifiers and Simulators are related to World Models: A lot of the improvement in LLM reasoning models has come from a post-training phase that uses LLMs as generators of plausible solutions and checks their correctness with verifiers or simulators that are available externally (c.f. https://t.co/NudO94Atzd). The critical importance of the availability of such verifiers/simulators for LLM post-training has become so clear that there is a clamor for so-called "RL Environments"--which basically are RL engines coupled to verifiers or simulators standing in for the "environment." Acknowledging this connection would cast "world model learning" as a general version of "verifier/simulator learning".

Learning from your experience vs. others' experience: One important distinction in world model learning is whether you are learning them by doing things in the world yourself and observing/feeling the consequences (which is pretty much what kids do), or whether you are trying to learn them from other people's collected experience (which is what most of the current post-LLM research on World Models does). The big difference tends to be causality: when you are generating your own experiences, you have the ability to do arbitrary causal intervention experiments, something that is hard when you are only learning from others' experience. The difficulty of gaining your own experience of course is that it is (a) time consuming and (b) possibly unsafe. Not surprisingly, notwithstanding Sutton's OAK proposal, most ongoing work on world models is based on the agent learning from others' experience.

On the Irony of learning world models for synthetic worlds: A lot of the work on world models seems to be quixotically based on virtual worlds--such as video games. This seems quite ironic. Since these are made by us, the whole point of learning world models seems to be sort of "reverse engineering" what we (humanity) already know.

In this era of LLMs where everything that humanity knows is already fodder for training LLMs, what is the deeper reason as to why learning virtual worlds (rather than just stealing the program running the virtual world) is a legitimate long term research direction?

(I am fine with playing with virtual worlds as training wheels for the "real world" that we didn't engineer, but am a little mystified by video games as the be-all and end-all. Come to think of it, this irony is also present for the original Atari Game suite that pushed a lot of deep RL research: The game engine converts a compact RAM state to a video frame so the humans can "play" and the DRL algorithms try to reverse engineer the logic from this video frame. Since the time of the Atari Games benchmark, any illusory need for such reverse engineering has largely disappeared, IMHO.)

114

118

18K

_curiousneuron retweeted

Micah G. Allen

@micahgallen

4 months ago

The hippocampus does more than just map space; it actively predicts future rewards by learning state transitions within the world. https://t.co/BUh47pUQBY

152

12K

_curiousneuron retweeted

Luiz Pessoa @PessoaBrain

4 months ago

What's the relationship between manifolds and recurrent networks in the brain? This looks like a must read (suppl material bursting with goodies). https://t.co/aMNOr9ReQV

807

141

582

39K

_curiousneuron retweeted

Quanta Magazine

@QuantaMagazine

4 months ago

Sometimes, the only way to build back up is to let everything fall apart. This is certainly true at the cellular level. https://t.co/5o0yV5wypt

106

34K

_curiousneuron retweeted

Reads with Ravi

@readswithravi

4 months ago

This sentence by Dostoyevsky hits so hard. “You sensed that you should be following a different path, a more ambitious one, you felt that you were destined for other things but you had no idea how to achieve them and in your misery you began to hate everything around you.”

101

19K

602K

_curiousneuron retweeted

Pedro Domingos

@pmddomingos

4 months ago

Every subfield of AI has a corresponding field of science that it runs rings around:
Machine learning: Statistics
NLP: Linguistics
Automated reasoning: Formal logic
Knowledge representation: Philosophy
Multiagent systems: Economics
Computer vision: Sensor systems
Robotics: Control systems

455

325

28K

_curiousneuron retweeted

Eli Sennesh @EliSennesh

4 months ago

Tired: the brain is a computer
Also tired: no, it's a dynamical system
Wired: neurons are bang-bang controllers

_curiousneuron retweeted

Ash Jogalekar

@curiouswavefn

4 months ago

I continue to believe that the nexus between neuroscience, thermodynamics and computation will be the most exciting one of the 21st century, informing multiple fields. https://t.co/gihohYPrwR

394

319

26K

_curiousneuron retweeted

Patrick Mineault

@patrickmineault

over 1 year ago

Excited to release what we’ve been working on at Amaranth Foundation, our latest whitepaper, NeuroAI for AI safety! A detailed, ambitious roadmap for how neuroscience research can help build safer AI systems while accelerating both virtual neuroscience and neurotech. 1/N

378

100

221

109K
