ICML2024 Tutorial on
"Neural Operators
& Machine Learning on Function Spaces"
is now out. #NeuralOperators #AInScience @nvidia @NVIDIAAI
https://t.co/24cLebv7A3
Looking forward to building the future of this field with the bigger community, all together.
Today at NeurIPS (4:30-7:30pm), I will present our work on adaptive sampling for bridging the gap between physics-informed models and numerical methods. It exploits many interesting links among orthogonal polynomials, quadrature rules, and neural network parametrization.
#NeuralOperators learn physics through data.
We study the long-term prediction capability of #NeuralOperator on the hard task of ocean emulation with variable forcing, which makes me think very seriously about a coupled weather-ocean model, #THEModel
Excited to share our recently published paper in @WileyGlobal on "Ocean Emulation With Fourier Neural Operators: Double Gyre" https://t.co/MWXSZtbFBY
We used Fourier Neural Operators to build the first high-resolution weather model, FourCastNet. Since they work so well for atmospheric emulation, a natural progression is to extend them to emulate ocean simulations.
We propose learning the dynamics of a simplified ocean simulation using Fourier neural operators.
We are able to generate long forecasts using trained Fourier neural operators, and find that they are more accurate than using climatology or persistence on short-term forecasts and approach the accuracy of the physics-based model.
On long-term forecasts, the neural operators can still predict future scenarios with realistic physics like propagating waves and meandering currents. This is impressive because no physics is explicitly programmed into the neural operators. Physics is learned from data. @Azizzadenesheli
My @MLSysConf keynote is now online. https://t.co/VX6od4R8jA
The scaling of large language models has led to impressive gains in language understanding, but at the cost of insatiable memory and bandwidth requirements.
I advocated a principled approach of designing optimization and quantization algorithms that can reduce memory requirements without sacrificing accuracy.
This includes our research into gradient compression methods (GaLore, SignSGD) and a logarithmic number system for representing values.
We also design fine-grained memory reduction schemes such as KV cache compression, chunking and offloading to overcome memory bottlenecks in language models, especially in the reasoning mode where current memory requirements are massive.
Such principles are broadly applicable and especially relevant to physical AI where the memory and bandwidth requirements are even greater than frontier LLMs.
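To make the gradient compression idea above a bit more concrete, here is a minimal NumPy sketch of a low-rank gradient projection in the spirit of GaLore. It is an illustration under simplifying assumptions, not the published algorithm: the function names (`top_r_projector`, `low_rank_momentum_step`), the plain momentum optimizer, and all hyperparameters are hypothetical choices made for this example.

```python
import numpy as np

def top_r_projector(grad, rank):
    """Return the top-`rank` left singular vectors of a gradient matrix."""
    U, _, _ = np.linalg.svd(grad, full_matrices=False)
    return U[:, :rank]  # shape (m, rank)

def low_rank_momentum_step(weight, grad, momentum, projector, lr=0.1, beta=0.9):
    """One momentum step whose optimizer state lives in the projected space."""
    low_rank_grad = projector.T @ grad                  # (rank, n) instead of (m, n)
    momentum = beta * momentum + (1 - beta) * low_rank_grad
    weight = weight - lr * (projector @ momentum)       # project the update back
    return weight, momentum

rng = np.random.default_rng(0)
m, n, rank = 64, 32, 4
weight = rng.standard_normal((m, n))
# Synthetic gradient that is exactly rank 4, so the projection loses nothing here.
grad = rng.standard_normal((m, rank)) @ rng.standard_normal((rank, n))

projector = top_r_projector(grad, rank)
momentum = np.zeros((rank, n))
new_weight, momentum = low_rank_momentum_step(weight, grad, momentum, projector)
print(momentum.shape, weight.shape)
```

The point of the sketch is only the memory accounting: the momentum buffer has `rank * n` entries rather than `m * n`. Real gradients are not exactly low rank, so in practice the projection is an approximation.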
FALCON: built on centuries of knowledge from fluid dynamics, turbulent flows, control, RL, and ML, to deliver a foundational approach for aviation, trained and tested on a real-world wing and validated in the real world.
This is very exciting, thanks @BBCBreaking for featuring our work.
AI+Science book is now up,
After years of work, our book on
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems
is available at Foundations and Trends® in Machine Learning.
We cover everything from the history of work at the intersection of AI and science to novel breakthroughs and advances in the foundations of AI.
From neural networks to neural operators,
From PINNs to PINO and so on,
From computer vision to quantum, graph data, grid data, point cloud data, and GenAI in sciences,
From classical reduced order methods to learned ones.
Etc.
With an extraordinary list of co-authors
Xuan Zhang, Limei Wang, Jacob Helwig, Youzhi Luo, Cong Fu, Yaochen Xie, Meng Liu, Yuchao Lin, Zhao Xu, Keqiang Yan, Keir Adams, Maurice Weiler, Xiner Li, Tianfan Fu, Yucheng Wang, Alex Strasser, Haiyang Yu, YuQing Xie, Xiang Fu, Shenglong Xu, Yi Liu, Yuanqi Du, Alexandra Saxton, Hongyi Ling, Hannah Lawrence, Hannes Stärk, Shurui Gui, Carl Edwards, Nicholas Gao, Adriana Ladera, Tailin Wu, Elyssa F. Hofgard, Aria Mansouri Tehrani, Rui Wang, Ameya Daigavane, Montgomery Bohde, Jerry Kurtin, Qian Huang, Tuong Phung, @MinkaiX , Chaitanya K. Joshi, Simon V. Mathis, Kamyar Azizzadenesheli, Ada Fang, Alán Aspuru-Guzik, Erik Bekkers, Michael Bronstein, Marinka Zitnik, @AnimaAnandkumar , @StefanoErmon , Pietro Liò, @yuqirose , Stephan Günnemann, @jure, Heng Ji, Jimeng Sun, Regina Barzilay, Tommi Jaakkola, Connor W. Coley, Xiaoning Qian, Xiaofeng Qian, Tess Smidt, @ShuiwangJi
Our 500+ page AI4Science paper is finally published:
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems. Foundations and Trends® in Machine Learning, Vol. 18, No. 4, 385–912, 2025
https://t.co/RzxYTDOJwx
"NeuralDMD"—a fully interpretable neural framework that fuses neural implicit fields with Dynamic Mode Decomposition to recover spatio-temporal dynamics from very sparse and noisy data.
From black-hole imaging to weather nowcasting
Joint work with Ali SaraerToosi, @tu_renbo, and Aviad Levis @UofT @nvidia
Paper: https://t.co/9AfCiy2CYQ
Website: https://t.co/ywVNuHtagU
Code: https://t.co/CcA811uhXJ
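For readers new to Dynamic Mode Decomposition, here is a minimal NumPy sketch of the classical (exact) DMD algorithm on fully observed snapshots. This is the textbook method only, not NeuralDMD: there are no neural implicit fields and no handling of sparse or noisy data, and the function name `dmd` and the toy system are assumptions made for illustration.

```python
import numpy as np

def dmd(snapshots, rank):
    """Classical exact DMD. `snapshots` has shape (state_dim, num_times)."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]          # pairs (x_k, x_{k+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]      # truncate to `rank`
    Y_V_Sinv = (Y @ Vh.conj().T) / s                    # Y V S^{-1}
    A_tilde = U.conj().T @ Y_V_Sinv                     # reduced linear operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y_V_Sinv @ W                                # exact DMD modes
    return eigvals, modes

# Toy linear system: a decaying rotation x_{k+1} = A x_k.
theta, decay = 0.3, 0.95
A = decay * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
states = [np.array([1.0, 0.0])]
for _ in range(19):
    states.append(A @ states[-1])
snapshots = np.stack(states, axis=1)                    # shape (2, 20)

eigvals, modes = dmd(snapshots, rank=2)
print(np.abs(eigvals), np.angle(eigvals))
```

On this noise-free toy problem the DMD eigenvalues should match those of `A` (magnitude `decay`, angles `±theta`), which is what makes the decomposition interpretable: each mode carries its own growth rate and frequency.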
Translating the successful techniques, tricks, and developments in neural networks to neural operators to learn on any resolution.
We carefully planned out a practical and easy way to take what we know works in neural networks and bring it all to neural operators, and even go beyond that to invent more.
Paper: https://t.co/Do9yz33Goi
A joint work with @julberner @mliuschi @JeanKossaifi
Valentin Duruisseaux, Boris Bonev, and
@AnimaAnandkumar
The neural operator library to read through:
https://t.co/xNpAGfpmUW
Neural Operators – Deep learning at any resolution
Extending neural networks to function spaces: While many phenomena are inherently described by functions, neural networks define vector-to-vector mappings that rely on a fixed discretization of the input and output.
Neural Operators instead define learnable function-to-function mappings that guarantee consistent predictions across different discretizations of the input and output functions. By respecting the functional nature of the data, neural operators can achieve improved performance and generalization.
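A small NumPy sketch can illustrate what "consistent across discretizations" means. It applies one Fourier-space (spectral convolution) layer, with the same weights, to the same band-limited function sampled on a 64-point and a 256-point grid. This is a hand-rolled toy under stated assumptions (1D periodic domain, random untrained weights, hypothetical function names), not the API of any neural operator library.

```python
import numpy as np

def spectral_conv_1d(u, weights):
    """Multiply the lowest Fourier modes of `u` by `weights`, zero the rest."""
    n = u.shape[0]
    modes = weights.shape[0]
    # norm="forward" scales by 1/n, so coefficients do not depend on grid size.
    u_hat = np.fft.rfft(u, norm="forward")
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = weights * u_hat[:modes]
    return np.fft.irfft(out_hat, n=n, norm="forward")

def sample_input(n):
    """Sample one fixed band-limited function on an n-point periodic grid."""
    x = np.arange(n) / n
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(4 * np.pi * x)

rng = np.random.default_rng(0)
weights = rng.standard_normal(8) + 1j * rng.standard_normal(8)  # 8 retained modes

coarse_out = spectral_conv_1d(sample_input(64), weights)
fine_out = spectral_conv_1d(sample_input(256), weights)

# Compare the two outputs at the grid points they share.
print(np.max(np.abs(coarse_out - fine_out[::4])))
```

Because the weights act on Fourier modes rather than on grid points, the two outputs describe the same function and agree at shared points up to floating-point error. A standard convolution with a fixed pixel-space kernel would not have this property.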
Translating the success of deep learning to operator learning: Careful engineering of neural architectures has been a key factor in deep learning’s success. Translating these architectures to neural operators is crucial for operator learning to enjoy the same empirical optimizations.
Key principles for constructing Neural Operators:
*Recipes for converting popular architectures (CNNs, GNNs, transformers, etc.) into Neural Operators
*Guidance for practitioners
https://t.co/hrlkddRrpa
https://t.co/U8oqnHRcVY
@julberner @mliuschi @JeanKossaifi Valentin Duruisseaux, Boris Bonev, @Azizzadenesheli @caltech
The best undergraduate and graduate research awards at @Caltech commencement this year were presented to research on Neural Operators by my students @ZongyiLiCaltech and Miguel Liu-Schiaffini.
This recognition is a testament to their passion and dedication, and the impact their work has already had: from building a foundation for AI+Science through multi-scale learning, to creating the first high-resolution AI-based weather model, to designing a novel medical catheter that reduces bacterial contamination, to name just a few.
The future of Physical AI will be built on the foundation of Neural Operators, since the real world exists at multiple scales and resolutions while standard deep learning only supports a fixed resolution. This is essential for us to unlock the full potential of AI for scientific discovery.
#Caltech2025 #ai #science #NeuralOperators
Stable-ChebNet
Stable long-range dependency is essential for physics, social networks, and science.
Technically, information needs to traverse deep learning models effectively, without dissipating due to a lack of careful design.
We propose a fundamentally non-dissipative model with controlled, stable information propagation, with strong empirical benefits on all the long-range dependency benchmarks.
A great work led by @haririAli95, our rising star in ML and graph learning.
📢ChebNet is back—with long-range abilities on graphs !🎉
We revive ChebNet for long-range tasks, uncover instability in polynomial filters, and propose Stable‑ChebNet—a non-dissipative dynamical system with controlled, stable info propagation 🚀
📄: https://t.co/oOjJ6YHiEc
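As background, here is a minimal NumPy sketch of the classical Chebyshev polynomial graph filter that ChebNet is built on. It shows only the standard three-term recurrence, not the Stable-ChebNet dynamics proposed in the paper. The function name, the fixed (untrained) coefficients, and the rescaling that assumes a largest Laplacian eigenvalue of 2 are choices made for this example.

```python
import numpy as np

def chebyshev_filter(L_scaled, x, coeffs):
    """Compute sum_k coeffs[k] * T_k(L_scaled) @ x via the Chebyshev recurrence."""
    t_prev = x                                   # T_0(L) x = x
    out = coeffs[0] * t_prev
    if len(coeffs) == 1:
        return out
    t_curr = L_scaled @ x                        # T_1(L) x = L x
    out = out + coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_next = 2 * (L_scaled @ t_curr) - t_prev   # T_k = 2 L T_{k-1} - T_{k-2}
        out = out + c * t_next
        t_prev, t_curr = t_curr, t_next
    return out

# Path graph on 5 nodes: 0 - 1 - 2 - 3 - 4.
num_nodes = 5
A = np.zeros((num_nodes, num_nodes))
for i in range(num_nodes - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
deg = A.sum(axis=1)
L = np.eye(num_nodes) - A / np.sqrt(np.outer(deg, deg))   # normalized Laplacian
L_scaled = L - np.eye(num_nodes)    # maps spectrum into [-1, 1], assuming lambda_max = 2

x = np.zeros(num_nodes)
x[0] = 1.0                                       # signal concentrated on node 0
coeffs = np.array([0.5, -0.3, 0.2, 0.1])         # 4 terms, polynomial order 3
y = chebyshev_filter(L_scaled, x, coeffs)
print(y)
```

A filter with K terms only mixes information within K-1 hops, so in this example node 4 never sees the signal placed on node 0. Reaching distant nodes requires high polynomial orders, which is the regime where stability becomes the concern.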
We developed the foundations of diffusion models and posterior sampling on function spaces.
We carefully extend Tweedie's formula, which sits at the core of our methods, to function spaces.
We provide techniques for functional regression and reconstruction from partial observations.
This way, we establish a few of the initial steps towards functional regression and data assimilation in science and engineering.
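For reference, the classical finite-dimensional form of Tweedie's formula is given below. It is the standard result for additive Gaussian noise; the function-space version developed in the paper is not reproduced here, and the symbols ($x_0$ for the clean sample, $x_t$ for the noisy one, $\sigma_t$ for the noise level, $p_t$ for the noisy marginal density) are notation chosen for this note.

```latex
% Noisy observation of a clean sample x_0 \in \mathbb{R}^d:
x_t = x_0 + \sigma_t \,\varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I_d)

% Tweedie's formula: the posterior mean is the noisy sample
% corrected by the score of the noisy marginal p_t:
\mathbb{E}\left[ x_0 \mid x_t \right] = x_t + \sigma_t^2 \, \nabla_{x_t} \log p_t(x_t)
```

In diffusion-based posterior sampling, a learned score model stands in for $\nabla_{x_t} \log p_t$, which gives a denoised estimate of $x_0$ that can then be compared against measurements.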
Excited to introduce our latest work, Guided Diffusion Sampling on Function Spaces (FunDPS) (https://t.co/O4KUsh62qb) - a discretization-agnostic generative framework for solving PDE-based forward and inverse problems.
Diffusion-based posterior sampling on function spaces: Our model recovers full-field PDE solutions, coefficient functions, and boundary conditions from severely sparse (just 3%) measurements, yielding SotA performance in both speed and accuracy.
Multi-resolution operator learning pipeline: FunDPS leverages Gaussian Random Field priors and neural operator architectures, enabling multi-resolution training and inference, reducing training time by 25% and inference time by 50%.
Infinite-dimensional Tweedie’s Formula: We extend Tweedie’s formula into infinite-dimensional Banach spaces, forming the rigorous theoretical foundation for posterior mean estimation.
Results: Achieved an average 32% accuracy improvement and 4x fewer sampling steps compared to previous SOTA approaches across five challenging PDE tasks. Plus, our multi-resolution inference pipeline accelerates computations by up to 25x!
Paper (https://t.co/O4KUsh62qb). Code (https://t.co/WEspyD8DTJ), based on our earlier workshop paper (https://t.co/RLFGKe9m0P).
@jiacheny7, @AbbasMammadov11, @julberner, @gavinkerrigan, Jong Chul Ye, @Azizzadenesheli
#DiffusionModels #InverseProblems #PDE #MachineLearning #NeuralOperators #AI4Science
@orvieto_antonio @gowerrobert Indeed, a great series of ideas, shedding light on questions that have been standing for a long time. Thanks for the great work.
This work has been keeping me awake for quite some time.
The secret sauce behind Adam is its Signum. This is just mind-consuming, great work @orvieto_antonio and @gowerrobert.
It is also great news for the efficiency and MLSys communities.
Adam is similar to many algorithms, but cannot be effectively replaced by any simpler variant in LMs.
The community is starting to get the recipe right, but what is the secret sauce?
@gowerrobert and I found that it has to do with the beta parameters and variational inference.
https://t.co/IkjM3oVOVJ
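For context, Signum is sign descent with momentum: keep a running average of gradients and step in the direction of its sign. Below is a minimal NumPy sketch of one such update. The function name, the exponential-moving-average form of the momentum, and the hyperparameters are assumptions made for illustration; this is not the exact setup studied in the linked work.

```python
import numpy as np

def signum_step(params, grad, momentum, lr=0.01, beta=0.9):
    """One Signum update: step along the sign of the gradient momentum."""
    momentum = beta * momentum + (1 - beta) * grad
    params = params - lr * np.sign(momentum)     # every coordinate moves by exactly lr
    return params, momentum

# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
params = np.array([1.0, -2.0])
momentum = np.zeros_like(params)
new_params, momentum = signum_step(params, params, momentum)
print(new_params)
```

Only the sign of the momentum is used, so the step size per coordinate does not depend on the gradient magnitude. Adam shares this flavor: with both beta parameters set to zero and no epsilon, its update `g / sqrt(g**2)` reduces to `sign(g)`.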