Great paper by my students @TimZXiao and @johanneszenn and collaborators that applies ideas from Monte Carlo sampling to (black-box) LLM execution to turn LLMs into better calibrated stochastic samplers.
✨ New paper: Flipping Against All Odds
We found that large language models (LLMs) can describe probabilities—but fail to sample from them faithfully.
Yes, even flipping a fair coin is hard. 🪙
🧵 Here’s what we learned—and how we fixed it.
🔗https://t.co/Auw7agOws3
1/
How could online learning apps adapt to learners and improve over time? Even if you're not a machine learning expert, @hanqizh's blog post on our last ICLR paper explains new approaches in simple terms (joint work with @alvorithm and @TheCharleyWu, supported by @TheresaAuthaler).
Differentiable annealed importance sampling (DAIS) minimizes the Jensen-Shannon divergence between its initial and target distribution!
Check out our work (https://t.co/Idrtk0Sp39) with @robamler and meet us at the poster at ICML in Hall C 4-9 #2610 in the morning session today!
🤔What about using an LLM as a function approximator for f(x; θ) where the parameters θ are natural language?
🤔Can we learn θ just like in machine learning (ML) where θ are numerical values?
✨Check out Verbalized ML, where data and models both operate in natural language! 🤩
Did you know that the training/test set split of the SVHN data set is biased, making SVHN unsuitable for evaluating generative models? Learn more from my students @TimZXiao and @johanneszenn at the DistShift workshop at #NeurIPS2023 tomorrow (10.30 am, room R06-R09).
🚨The training and test set of the Street View House Numbers (SVHN) dataset are NOT from the same distribution!🚨
Join us at the #NeurIPS2023 workshop on DistShift this Friday (10:30 am, room R06-R09) to find out more!
https://t.co/RTOVqQWHAm
w/ @johanneszenn @robamler
Training variational autoencoders on samples from a diffusion model essentially eliminates their known tendency to overfit the encoder without sacrificing model performance. Congrats to my PhD students @TimZXiao and @johanneszenn on their latest preprint!
https://t.co/pIjq9AndLM
If you're at ICLR, join my student @johanneszenn at the Tiny Paper poster session today from 1.15 to 3.15 pm in room MH4. You'll be surprised how many insights can fit in a 2-page paper!
https://t.co/l4tS7d8o7a
If you're in Kigali for ICLR this week, let's meet and chat over some drinks tomorrow at @TimZXiao's poster on rate/distortion theory of hierarchical VAEs. It's poster #106 in the MH rooms from 4:30 to 6:30.
https://t.co/TtdLPhpkXA
There is no need for gradients due to resampling in Differentiable Sequential Monte Carlo Samplers! Check out our recent work (https://t.co/zc6BnNYwCx) with @robamler and meet us at the poster on Friday!
My student @johanneszenn found a useful fact about differentiable sequential Monte Carlo samplers: you can ignore any gradients due to resampling because they vanish in expectation. Check out his accepted ICLR DEI paper and meet us at the poster on Friday. https://t.co/l4tS7d8o7a
The 2nd iteration of the "Neural Compression: From Information Theory to Applications" workshop will take place @icmlconf in Hawaii this year!
Submissions due May 27th. For more details: https://t.co/w5OI0jkoXl
@BerivanISIK @YiboYang @_dsevero @karen_ullrich @robamler @s_mandt
We just got the green light to hold another workshop on machine-learning-based data compression—this time at ICML. I'm very excited! Stay tuned for details and for the call for papers.
🎉Exciting news! Our "Neural Compression" workshop proposal has been accepted at #ICML 2023! Join us to explore the latest research developments, including perceptual losses and more compute-efficient models! @BerivanISIK, @YiboYang, @_dsevero, @karen_ullrich, @robamler
How do LLMs relate to modern computers, both in their zero-shot problem-solving abilities and in their histories?
Our latest blog post provides a fresh perspective on understanding LLMs and the prompting paradigm. Check it out!
https://t.co/PYf1Bm66Qn
@Besteuler @robamler #ChatGPT
Looking for the one VAE to rule them all? The bad news: it doesn't exist. The good news: our recently accepted ICLR 2023 paper shows how to optimally allocate information to each latent layer depending on your application: https://t.co/TZJdhEb5yh @TimZXiao @ml4science #ICLR2023
Very cool that RHET AI is offering a free film screening tomorrow! "Ex Machina" will be shown at the Arsenal Kino Tübingen tomorrow (Friday) at 8:00 pm. Anyone who likes can stay afterwards for a follow-up discussion. I'm looking forward to it! https://t.co/WLNet9x233
2⃣ How much science is in the fiction? Film screening of "Ex Machina" and discussion, with Lukas Kohmann and Anne Burkhardt from RHET AI, and @robamler from @ml4science
Free admission! To register (and for the full program): https://t.co/P7TjZ3NOS6
There are plenty of information resources (including from @ml4science and @Cyber_Valley). But people also have to use them, stay open to facts and scientific findings, and above all not fall for populists with nationalist/xenophobic rhetoric. 2/2
A superbly researched article! I'm thrilled that newspapers recognize the societal relevance of IT topics. It's just a shame that so many people still fail to fulfill their civic duty to inform themselves about relevant topics. 1/2 https://t.co/3ugaSVILh7