Check out our work "Memory Safe Computations with XLA compiler" at #NeurIPS2022 (with Yuze An, @dyedgreen, @markvanderwilk). The paper and PR can be found at https://t.co/AzTMGZRxIo and https://t.co/4IXt5hLFPo. The poster is https://t.co/fwrPx1LTOG. Some details in short [1/8]
@krzysztof_rus @PatrickKidger @ezyang I'm not sure how Enzyme is going to help here, even with MLIR support. The user still needs an interface, in some form, to autodiff.
We also applied eXLA to a transformer language model. In the experiment we varied the sequence length, which in turn controls the size of the self-attention block. The out-of-the-box TF implementation fails with OOM at sequence lengths above 2k, while eXLA runs up to 7k. [7/8]
When working with a Gaussian process, have you ever wondered why Cholesky factorization failed, or a CG solve did not converge?
Answer: it's because you've got redundant, overlapping data points.
And that's just the starting point! On arXiv now!
https://t.co/L9gUFxk4HM
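A minimal sketch of the claim above, not the paper's method: with an RBF kernel and no jitter, a repeated input makes two rows of the kernel matrix identical, so a Cholesky pivot collapses to roughly zero. The helper names (`rbf_kernel`, `smallest_cholesky_pivot`), the toy inputs, and the `tol` threshold are all assumptions made for illustration.

```python
import math

def rbf_kernel(xs, lengthscale=1.0):
    # Squared-exponential kernel matrix on 1-D inputs (no jitter or noise term).
    return [[math.exp(-0.5 * ((a - b) / lengthscale) ** 2) for b in xs] for a in xs]

def smallest_cholesky_pivot(K, tol=1e-12):
    # Run a textbook Cholesky factorization and return the smallest pivot seen.
    # Stops early once a pivot drops to tol or below, where the factorization
    # would break down (square root of a non-positive or negligible number).
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    smallest = float("inf")
    for j in range(n):
        pivot = K[j][j] - sum(L[j][k] ** 2 for k in range(j))
        smallest = min(smallest, pivot)
        if pivot <= tol:
            return smallest
        L[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            L[i][j] = (K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return smallest

# Hypothetical toy inputs: the second set repeats the point x = 1.0.
pivot_distinct = smallest_cholesky_pivot(rbf_kernel([0.0, 1.0, 2.0]))
pivot_duplicate = smallest_cholesky_pivot(rbf_kernel([0.0, 1.0, 1.0, 2.0]))
print(pivot_distinct, pivot_duplicate)
```

With distinct inputs every pivot stays well away from zero. With the repeated input the third pivot is zero up to rounding error, so whether a library Cholesky fails or just returns a badly conditioned factor depends on which way the rounding falls.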
I am still welcoming PhD applicants for 2022 at Imperial College London. We are a growing research group, with clear goals on what new abilities we want to develop in ML and neural networks.
Topics: Invariances, neural arch search, (Bayesian) model selection, Gaussian processes.
We are organizing a small-scale, offline #NeurIPS2021 satellite event in Cambridge (UK) on the 8th of December. If you are interested in NeurIPS content and are in the neighborhood, this is your chance to connect with your local machine learning community https://t.co/FJ36rx8dlp
Join us to discuss Conjugate Gradient based GP approximations! We make training easier by automatically setting approximation parameters like CG tolerance using marginal likelihood bounds.
Today 5pm (London) / 9am PDT. Long talk and poster available at https://t.co/zaSS2NzKkX.
Current Conjugate Gradient Gaussian Processes require manual tuning to trade off accuracy and speed. Existing guidelines can give suboptimal results, without clear warnings.
Our method tunes automatically, runs fewer CG steps, and performs better: https://t.co/YtnnilK5FT 👇1/6
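To make the accuracy/speed trade-off concrete, here is a plain conjugate gradient sketch with a hand-picked relative tolerance. It only illustrates the manual knob described above; it is not the automatic tuning from the paper. The test matrix, the `tol` values, and the function names are assumptions chosen for illustration.

```python
import math

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, tol, max_iter=200):
    # Solve A x = b for symmetric positive definite A.
    # Stops when ||r|| <= tol * ||b||; returns (x, iterations used).
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)
    rs = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if math.sqrt(rs_new) <= tol * b_norm:
            return x, it
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, max_iter

# Hypothetical well-conditioned SPD system: tridiagonal with 3 on the diagonal.
n = 20
A = [[3.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
b = [1.0] * n

x_loose, iters_loose = conjugate_gradient(A, b, tol=1e-2)
x_tight, iters_tight = conjugate_gradient(A, b, tol=1e-10)
print(iters_loose, iters_tight)
```

A looser tolerance stops earlier on the same sequence of iterates, which saves steps but leaves a larger residual. Choosing that threshold by hand is the tuning burden the thread is about.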
I'm looking forward to speaking tomorrow. I will share some thoughts on:
- How Gaussian processes can help deep learning
- Recent work on accurate GP inference
- What makes a method "exact", and to what extent recent methods live up to this
Link below if you want to join!
Tomorrow 10 Dec at 11am GMT I will speak at the Bayesian Deep Learning Meetup about **Bayesian Model Selection** and how it can help architecture search.
In a short 20 minutes we will discuss why we (Bayesians ∪ Deep Learners) should care, and approaches from now and the past.
Come and chat with the authors of our paper:
Doubly sparse variational Gaussian processes!
https://t.co/8kcWz9Uv73
#AISTATS2020 @stefanos_ele @aptemav @NicolasDurrande @jameshensman @PROWLER_IO
We are more friendly than we look in the video ;)
My #ICML2020 tutorial videos on "Machine Learning with Signal Processing" are now freely available:
I: https://t.co/FmXlqrlZC5
II: https://t.co/Gxr1dZLzSb
III: https://t.co/NaWrZD3ZFY
IV: https://t.co/wJMUcpKsVJ
Slides: https://t.co/kTUp0oZDpo