Prof. Anima Anandkumar

@AnimaAnandkumar

AI+Science, Bren Professor @caltech, Time100, Fmr Sr Director of #AI research @nvidia Fmr Principal Scientist @awscloud

Joined May 2021

2.4K Following

40.3K Followers

2.6K Posts

AnimaAnandkumar retweeted

Caltech LCSSP @CaltechLCSSP

about 14 hours ago

🎉Congrats to @CaltechLCSSP's @rmichaelalvarez & @RKocielnik, @pengrui_han, Peiyang Song, Myrl Marmarelis, Ramit Debnath, & @Caltech's Dean Mobbs & @AnimaAnandkumar on their @icmlconf 2026 paper becoming an oral presentation! https://t.co/T6xDIVhCih

Prof. Anima Anandkumar

@AnimaAnandkumar

1 day ago

Our paper, “Rethinking Psychometric Evaluation of LLMs: When and Why Self-Reports Predict Behavior,” has been selected for Oral Presentation at CTB @icmlconf

* Paper: https://t.co/wT8OQmOs9w
* Website: https://t.co/8idwkMM76N
* Code: https://t.co/8pfdiyl71d

A central question in AI evaluation is whether we can use low-cost self-report probes to anticipate how LLMs will actually behave in tasks.

In our earlier work, “The Personality Illusion,” we found that LLMs can give coherent personality self-reports that do not reliably predict behavior. This paper asks a follow-up question: When do self-reports actually track behavior, and what are the failure modes where they don't?

Across 11 LLMs, 4 behavioral tasks, and a 2 × 2 × 2 experimental design, we find that self-report–behavior coherence exists, but it is selective:

1) The instrument matters. Broad Big Five personality traits do not predict task behavior well. But a more behavior-specific framework, the Theory of Planned Behavior, can recover much stronger coherence under favorable conditions.

2) Context matters. When self-reports and behavior happen in the same conversation, coherence can reach human-level intention–behavior baselines. But when they happen in separate conversations, coherence often collapses.

3) The task matters. Coherence survives better for behaviors anchored outside the immediate prompt, such as implicit bias and aspects of honesty. It collapses for behaviors strongly shaped by the local context, such as sycophancy.

4) Personas are not a fix. Persona prompting makes models’ self-reports more stable across conversations, but it does not reliably bring behavior into alignment. This is especially important for persona-customized AI systems: changing what a model says about itself does not necessarily change what it does.

The takeaway: LLM self-reports should not be treated as context-free behavioral diagnostics. If we want to use psychometric probes for AI safety, deployment, or model evaluation, we need task-specific instruments, behaviorally grounded validation, and careful separation between what a model says and what it actually does.

Huge thanks to my co-authors @RKocielnik Pengrui Han, Peiyang Song, Myrl G. Marmarelis, Ramit Debnath, Dean Mobbs, and R. Michael Alvarez, and to the @Caltech Linde Center for Science, Society, and Policy @CaltechLCSSP
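
Editor's sketch: one plausible way to read "self-report–behavior coherence" is as a correlation between self-report scores and behavioral scores, computed separately for each experimental condition. The post does not say how the paper measures coherence, so the Pearson correlation, the condition names, and every number below are assumptions made up for illustration. None of it is data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical scores, invented for this sketch only (NOT results from the paper).
self_report = [1.0, 2.0, 3.0, 4.0, 5.0]
behavior_same_conversation = [1.2, 1.9, 3.1, 3.8, 5.0]
behavior_separate_conversation = [3.0, 1.0, 4.0, 2.0, 3.0]

# One coherence value per condition, so the two contexts can be compared.
coherence = {
    "same_conversation": pearson(self_report, behavior_same_conversation),
    "separate_conversation": pearson(self_report, behavior_separate_conversation),
}

for condition, value in coherence.items():
    print(f"{condition}: r = {value:.2f}")
```

Reporting one coefficient per condition, rather than pooling everything, is what would let a same-conversation versus separate-conversation gap show up at all.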

Prof. Anima Anandkumar

@AnimaAnandkumar

8 days ago

Check out our work on end-to-end ultrasound using a neural operator for lung aeration https://t.co/CV3Qnh3qCk We directly reconstruct lung aeration maps from RF data, bypassing the need for traditional beamformers and indirect interpretation of B-mode images.

Midjourney

@midjourney

8 days ago

A technical dive inside our new "Midjourney Scanner"

Prof. Anima Anandkumar

@AnimaAnandkumar

16 days ago

This is something I have been emphasizing since we started our work on Neural Operators. We very quickly went from simple fluid dynamics benchmarks to hard problems like building the first high-resolution AI-weather model, FourCastNet, and modeling turbulence in nuclear fusion. For those applications, we got speedups of 10,000 to a million times. Simple benchmarks are great for testing whether new architectures and algorithms work, but they are not the end goal.

Yijing Zhang @YijingZ91217

17 days ago

Neural PDE solvers have seen exciting progress! 🌊 But despite growing adoption, we still don’t know 𝘄𝗵𝗲𝗻 we should use them instead of classical solvers. 🤔 Our new paper has a surprising finding: the harder the PDE task, the more cost-effective learned solvers become. 🧵👇

Prof. Anima Anandkumar

@AnimaAnandkumar

23 days ago

Nice work studying zero shot super resolution in neural operators.

Statistics (Machine Learning) Papers @StatsPapers

24 days ago

Is Zero-Shot Super-Resolution Possible in Operator Learning? Unique Subedi, Ambuj Tewari https://t.co/3FuVJxia3Y [𝚜𝚝𝚊𝚝.𝙼𝙻 𝚌𝚜.𝙻𝙶 𝚖𝚊𝚝𝚑.𝙰𝙿]

Prof. Anima Anandkumar

@AnimaAnandkumar

28 days ago

Great to see extrapolation success with FNOs.

PRX Quantum @PRX_Quantum

29 days ago

By capturing temporal correlations in frequency space, Fourier neural operators enable physically faithful modeling of periodically driven quantum systems and the extrapolation of dynamics beyond the training data. Read more: https://t.co/NiNphCB4fu

Prof. Anima Anandkumar

@AnimaAnandkumar

about 1 month ago

I am thrilled to share my article in the @americanacad Daedalus special issue on AI & Science: What Is the Future of Discovery?, edited by James Manyika. https://t.co/vvur95HXGI

I discuss: How Do We Build AI to Push the Frontiers of Scientific Discovery?

Scientific progress is limited not by a lack of new ideas but by the time and cost involved in physical experimentation. Scientific discovery is a needle-in-the-haystack problem: it does not help if AI gives you a vastly bigger haystack. Without knowing if any of the ideas work, an AI system that designs experiments just increases the effort required, since performing the experiments to validate the ideas is the real bottleneck.

In my view, AI’s most transformative impact in enabling scientific discoveries lies in reducing the need for such experiments. To get there, we need to build AI models that are able to granularly simulate and understand physics at all scales, rather than just abstractly reason in the textual domain. I explore what methods like Neural Operators have already helped achieve, what still needs to be done, and what lies ahead.

AnimaAnandkumar retweeted

Bahareh Tolooshams

@BTolooshams

about 1 month ago

We introduce Sparse Autoencoder Neural Operators (SAE-NOs), a functional framework for representation learning and mechanistic interpretability that treats data as samples from underlying continuous functions and learns mappings between function spaces.

Standard SAEs (SAE-MLP) represent each concept with a scalar activation and a vector-valued dictionary atom, limiting their ability to capture how and where a concept is expressed across structured domains.

SAE-FNO introduces feature-map representations with both concept sparsity and domain sparsity, allowing the model to capture not only which concepts are active, but also where and how they are expressed across the domain.

This is a joint collaboration, between @UAlberta/@AmiiThinks and @Caltech, with Ailsa Shen and @AnimaAnandkumar. 1/

arXiv: https://t.co/EphbL2FJYA

AnimaAnandkumar retweeted

Robert Joseph

@Robertljg

about 2 months ago

Very excited to finally release TorchLean publicly! I also wrote a longer blog on why I think this matters: https://t.co/PiwjU5smHq Thread below :)

Prof. Anima Anandkumar

@AnimaAnandkumar

about 2 months ago

TorchLean codebase is now available! TorchLean is a Lean 4 framework for verified neural-network software. It supports typed tensors, runnable training, graph IRs, verified autograd, Float32/IEEE semantics, CROWN / IBP-style verification, certificate checking, PyTorch interop, and CUDA/GPU execution. After feedback and comments on our original post, we expanded TorchLean substantially: neural operators/FNOs, diffusion models, GPT-style text models, GPT-2-style runs, Mamba/state-space models, RL, 3D vision certificates, Bug Zoo case studies, PyTorch interop, and more. Project page: https://t.co/RZjTQQSGw8 Codebase: https://t.co/NfPQVz9kdu @Robertljg, Jennifer Cruden, Will Adkisson, Xiangru Zhong, @huan_zhang12 @caltech #MachineLearning #ScientificComputing #Lean #FormalVerification

Prof. Anima Anandkumar

@AnimaAnandkumar

4 months ago

We’re excited to release TorchLean, the first fully verified neural network framework in Lean. The Lean community has largely focused on pure mathematics. TorchLean expands this frontier toward verified neural network software and scientific computing. With the recent release of CSlib, we see this as another step toward a fully verified ML stack.

We support these features:
1. Executable IEEE-754 floating-point semantics (and extensible alternative FP models), plus verified tensor abstractions with precise shape/indexing semantics
2. Formally verified autograd system for differentiation of NN programs, plus proof-checked certification / verification algorithms like CROWN (robustness, bounds, etc.)
3. PyTorch-inspired modeling API with eager-style development + export/lowering to a shared IR for execution and verification

Project page: https://t.co/YHpqhRbMQe
Paper: [2602.22631] TorchLean: Formalizing Neural Networks in Lean
Work done by @Robertljg, Jennifer Cruden, Xiangru Zhong, @huan_zhang12 and @AnimaAnandkumar.

#MachineLearning #ScientificComputing #Lean

AnimaAnandkumar retweeted

Becca Willett @WillettBecca

about 2 months ago

Nominations are now open for the Pritzker Prize for AI in Science Research Excellence! This prize honors outstanding researchers advancing both AI and the natural sciences or engineering. Nominate someone today! 🔗 https://t.co/69Rf7HuHic

AnimaAnandkumar retweeted

Jorge Bravo Abad

@bravo_abad

about 2 months ago

Accurate and scalable deep Maxwell solvers

Maxwell's equations are the bedrock of photonic device design, from metalenses to chip-scale wavelength multiplexers. Solving them over realistic device sizes (hundreds of wavelengths, with subwavelength dielectric features) is computationally brutal. Neural network surrogates have been promising on toy problems but rarely scale: fixed domain sizes, narrow parameter ranges, no general boundary conditions, accuracy that degrades as the problem grows.

Chenkai Mao and Jonathan Fan at Stanford propose a different recipe. Instead of training a network to solve the full problem, they train a neural operator on subdomains and plug it into classical iterative methods. The subdomain network is a modified Fourier neural operator that takes arbitrary Robin-type boundary conditions as inputs, used as a flexible preconditioner inside F-GMRES. It gives bounded-accuracy subdomain solutions, and reaches double precision at inference despite single-precision training.

The interesting move is at the global scale. They wrap the subdomain solver in an overlapping Schwarz domain decomposition loop, and use the same network to cheaply solve the subdomain eigenvalue problems that build a coarse space for two-level Schwarz. That coarse correction gives near-optimal scaling, where iteration counts stay roughly constant as the global problem grows.

A single network handles different sizes, resolutions, wavelengths and dielectric distributions, with 20 to 50x fewer iterations than CPU GMRES or BiCGSTAB. They benchmark up to ~3000x3000 grids and 200 wavelengths, then plug the solver into adjoint-based optimization to inverse-design freeform devices: a wavelength division multiplexer, a near-infrared metalens, and a volumetric coupler. Trajectories track ground-truth FDFD almost exactly.

For photonics, semiconductors and optical communications, this makes neural surrogates operationally useful for real device design. Training only a subdomain model and letting iterative methods handle global scaling is a reusable pattern across PDE problems in heat transfer, acoustics and mechanics.

Paper: Mao & Fan, Proc. of the National Academy of Sciences (2026) | journal license https://t.co/WxTJwVPUcn
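
Editor's sketch: the "solve on subdomains, let an outer iteration handle the global problem" pattern can be seen on a much simpler problem than Maxwell's equations. The code below runs a one-level alternating (multiplicative) overlapping Schwarz iteration on the 1D Poisson problem -u'' = f with zero Dirichlet boundaries. This is not the paper's method: an exact tridiagonal solve stands in for the learned subdomain operator, there is no coarse space and no F-GMRES, and the function names (`solve_tridiagonal`, `solve_poisson`, `schwarz`) and parameters are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_poisson(f, h, left, right):
    """Solve -u'' = f on a uniform grid with Dirichlet values left/right.

    In the paper's setting this subdomain solve is where a learned
    operator would be used; here it is an exact direct solve.
    """
    n = len(f)
    rhs = [h * h * fi for fi in f]
    rhs[0] += left    # boundary value folded into the first equation
    rhs[-1] += right  # boundary value folded into the last equation
    return solve_tridiagonal([-1.0] * n, [2.0] * n, [-1.0] * n, rhs)


def schwarz(f, h, overlap=4, sweeps=40):
    """Alternating Schwarz with two overlapping subdomains."""
    n = len(f)
    u = [0.0] * n
    mid = n // 2
    a_end = mid + overlap    # subdomain A owns indices [0, a_end)
    b_start = mid - overlap  # subdomain B owns indices [b_start, n)
    for _ in range(sweeps):
        # A: physical boundary on the left, current iterate on the right.
        u[:a_end] = solve_poisson(f[:a_end], h, 0.0, u[a_end])
        # B: freshly updated iterate on the left, physical boundary on the right.
        u[b_start:] = solve_poisson(f[b_start:], h, u[b_start - 1], 0.0)
    return u


n = 63
h = 1.0 / (n + 1)
f = [1.0] * n
u_dd = schwarz(f, h)
u_ref = solve_poisson(f, h, 0.0, 0.0)
max_error = max(abs(a - b) for a, b in zip(u_dd, u_ref))
print(f"max difference from the direct solve: {max_error:.2e}")
```

With only two subdomains and no coarse correction, the number of sweeps needed grows as the overlap shrinks or as more subdomains are added. That is the scaling problem the two-level coarse space described in the post is meant to address.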

AnimaAnandkumar retweeted

Pengrui Han (Barry)

@pengrui_han

about 2 months ago

Excited to share that The Personality Illusion has been accepted to ICML 2026 🥂 We show that LLMs' self-reported personalities are systematically dissociated from their actual behavior ：） Huge thanks to my amazing collaborators and advisors! @RKocielnik @p_song1 Ramit Debnath, Dean Mobbs @AnimaAnandkumar @rmichaelalvarez #ICML #Caltech #LLM

AnimaAnandkumar retweeted

Zongyi Li @zongyili_nyu

about 2 months ago

Geometric operator learning is challenging because high-quality simulations on complex geometries are expensive. In GeoPT, we pretrain on low-cost graphics datasets augmented with simple dynamics, showing promising scaling behavior.

Prof. Anima Anandkumar

@AnimaAnandkumar

2 months ago

Go FNO!

NVIDIA HPC Developer

@NVIDIAHPCDev

2 months ago

⚛️ Explore how AI physics can accelerate clean, modular nuclear reactor design. By leveraging NVIDIA CUDA-X libraries, PhysicsNeMo, and Omniverse libraries, see how nuclear developers address these challenges with GPU-accelerated digital twins. https://t.co/nC0ou965mW

AnimaAnandkumar retweeted

Mathelirium

@mathelirium

3 months ago

Physics-Informed Neural Operators: Learning The Solver, Not Just One Solution Our PINN scene learned one solution field for one PDE setup. A Physics-Informed Neural Operator learns the map from input fields, like material coefficients or source terms, to the full solution across a whole family of PDE problems. So, the goal is no longer just one approximate answer, but a reusable solver-shaped object guided by the physics itself.

Prof. Anima Anandkumar

@AnimaAnandkumar

3 months ago

Thank you for supporting our work @patrickshafto

AnimaAnandkumar retweeted

UN Scientific Advisory Board @ScienceBoard_UN

3 months ago

🔬 Weekly Science Long Read 🌍 🤖 @Caltech with @AnimaAnandkumar, new @ScienceBoard_UN member: AI can model weather, climate, food, and disease. For more, read the Board's Brief on Verification of Frontier AI. 🌐 Article: https://t.co/jYr8ryIAjO 📘 Brief: https://t.co/Fsl16g5e1Z

Prof. Anima Anandkumar

@AnimaAnandkumar

3 months ago

My interview with @Caltech news after being selected for the UN Scientific Advisory Board https://t.co/eBVnWPp95E

AnimaAnandkumar retweeted

UN Scientific Advisory Board @ScienceBoard_UN

4 months ago

✨ Introducing the members of @ScienceBoard_UN! 🌍 @AnimaAnandkumar is the Bren Professor at @Caltech, previously at @nvidia and @awscloud. She is a leading voice on artificial intelligence, machine learning, and AI-for-science. 📣 Read more: https://t.co/eSInS0aCqM
