Guy Van den Broeck @guyvdb - Twitter Profile

guyvdb retweeted

Hacker News 20 @betterhn20

about 2 months ago

A Canonical Generalization of OBDD https://t.co/HaL2j51faK (https://t.co/ueOt9bOrZV)

guyvdb retweeted

Daniel Israel

@danielmisrael

16 days ago

Congratulations @itisalex3 on acceptance to #ACL2026 🎉 If you are interested in KV cache compression, I highly recommend reading this paper.

guyvdb retweeted

Poorva Garg @PoorvaGarg11

23 days ago

Very excited to be leading this research direction, employing probabilistic programming to improve LLM inference for code generation.

guyvdb retweeted

Daniel Israel

@danielmisrael

26 days ago

When you prompt an LLM for code, you get one deterministic program. However, the LLM actually defines a distribution over many programs, and existing methods discard it‼️ PPoT uses this distribution to extract free performance and efficiency gains. 🧵👇

guyvdb retweeted

Konstantinos Kallas @KonsKallas

28 days ago

We just implemented trai (try + AI) (https://t.co/50VH8V9CgR), a Claude plugin that can help you isolate 🫷 changes done in your file system by tool calls (like pip install), and only commit them if they are intended ☺️. Try it out (pun intended) and share your feedback!

Guy Van den Broeck @guyvdb

3 months ago

@trunghlt @IanLi1118 new tokens depend on already unmasked tokens, but they do not depend on each other when unmasking multiple tokens in a single step
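
The point in this reply can be illustrated with a toy sampler. The sketch below is not the authors' code: the two-city joint distribution and the helper names (`marginal`, `unmask_one_step`, `unmask_two_steps`) are hypothetical, chosen only to show that tokens unmasked in the same step share a context but do not see each other, while tokens unmasked in separate steps do.

```python
import random

# Toy "true" distribution over the two masked tokens in
# "He is from [MASK] [MASK]". Cities and probabilities are made up.
JOINT = {("New", "York"): 0.5, ("San", "Diego"): 0.5}


def marginal(position, context=None):
    """Distribution of one masked position given already-unmasked tokens.

    `context` maps position -> token for tokens unmasked in earlier steps.
    """
    context = context or {}
    dist = {}
    for pair, p in JOINT.items():
        if all(pair[i] == tok for i, tok in context.items()):
            dist[pair[position]] = dist.get(pair[position], 0.0) + p
    total = sum(dist.values())
    return {tok: p / total for tok, p in dist.items()}


def sample(dist, rng):
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens])[0]


def unmask_one_step(rng):
    # Both positions are drawn from the same (empty) context,
    # so neither draw can depend on the other.
    return (sample(marginal(0), rng), sample(marginal(1), rng))


def unmask_two_steps(rng):
    # The second position is drawn after the first is unmasked,
    # so it is conditioned on that token.
    first = sample(marginal(0), rng)
    second = sample(marginal(1, {0: first}), rng)
    return (first, second)
```

In this toy setting, `unmask_two_steps` only ever returns one of the two real cities, whereas `unmask_one_step` can return mixed pairs.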

Guy Van den Broeck @guyvdb

3 months ago

We put probabilistic circuits into diffusion language models and got a big boost in reasoning performance!

Ian Li

@IanLi1118

3 months ago

One of the biggest promises of Diffusion LLMs is parallel generation: predicting multiple tokens at once to bypass the sequential bottleneck of autoregressive models.

However, parallel generation comes with a price. For example:

Should the sentence “He is from [MASK] [MASK]” be filled with [New] [York] or [San] [Diego]?

If a diffusion model predicts both at the exact same time, it assumes independence and may produce... [San] [York]. 🤦‍♂️

We argue this arises from a structural misspecification: models are restricted to fully factorized outputs because parameterizing the full joint distribution would require a prohibitively massive output head.

This is the Factorization Barrier crippling parallel generation. Here is how we broke it with CoDD.
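
The [San] [York] example can be worked through exactly. The sketch below is an illustration under assumptions, not the CoDD method: the 50/50 joint distribution, the vocabulary size of 50,000, and the function names are all hypothetical. It computes the fully factorized approximation (product of per-position marginals) and compares output-head sizes for a factorized versus a full joint parameterization.

```python
from itertools import product

# Hypothetical joint distribution over the two masked tokens.
joint = {("New", "York"): 0.5, ("San", "Diego"): 0.5}


def marginals(joint):
    """Per-position marginal distributions of a joint over token tuples."""
    length = len(next(iter(joint)))
    result = [dict() for _ in range(length)]
    for seq, p in joint.items():
        for i, tok in enumerate(seq):
            result[i][tok] = result[i].get(tok, 0.0) + p
    return result


def factorized(joint):
    """Product of marginals: what a fully factorized output can express."""
    out = {}
    for combo in product(*[m.items() for m in marginals(joint)]):
        seq = tuple(tok for tok, _ in combo)
        prob = 1.0
        for _, q in combo:
            prob *= q
        out[seq] = prob
    return out


def head_sizes(vocab_size, num_positions):
    """Output counts: factorized head vs. a head over the full joint."""
    return vocab_size * num_positions, vocab_size ** num_positions


approx = factorized(joint)
# Probability mass the factorized model puts on pairs the joint never produces.
invalid_mass = sum(p for seq, p in approx.items() if seq not in joint)
print(approx[("San", "York")], invalid_mass)  # 0.25 0.5
print(head_sizes(50_000, 2))  # (100000, 2500000000)
```

Under these assumptions, half of the factorized model's probability mass lands on pairs such as [San] [York], while a head over the full joint for just two positions would need 2.5 billion outputs.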

guyvdb retweeted

Zilei Shao @zileishao

3 months ago

Check out our most recent work on dLLM!

guyvdb retweeted

Anji Liu @liu_anji

3 months ago

Check out our recent work offering a principled way to perform parallel prediction (few-step generation) in Diffusion LLMs with minimal performance degradation!

guyvdb retweeted

Zhihan Yang

@zhihanyang_

4 months ago

Join our reading group next Monday!

Paper: Planned Diffusion

Presenters: Daniel Israel (@danielmisrael), Tian Jin (@jintian)

guyvdb retweeted

Discrete Diffusion Reading Group

@diffusion_llms

4 months ago

📢Feb 2 (Mon): Planned Diffusion

🙅Diffusion language models are capable of parallelizing text generation but can struggle with coherence in low time-step regimes.

💡Planned Diffusion unlocks a new axis of parallelism: Token-level parallelism ➡️ semantic parallelism

✍️Planned diffusion first generates a structured plan, then diffuses semantically independent spans of text in parallel according to the plan.

This Monday, Daniel Israel (UCLA) (@danielmisrael) and Tian Jin (MIT) (@jintian) will discuss their exciting Planned Diffusion paper as joint first authors.

Collaborators: Ellie Cheng (https://t.co/qdy0TAVHCD), Guy Van den Broeck (@guyvdb), Aditya Grover (@adityagrover_), Suvinay Subramanian (@suvinay), Michael Carbin (@mcarbin)

Paper link: https://t.co/ptRpLFXHDh

guyvdb retweeted

Luis Lamb @luislamb

6 months ago

@RealAAAI AAAI 2021 Conference - Neuro-Symbolic AI Panel, during the COVID-19 crisis. With @kerstingAIML @guyvdb @mattbotvinick Marta Kwiatkowska, Leslie Pack Kaelbling. Just a picture 🙂 https://t.co/g49kUCzbpt

Guy Van den Broeck @guyvdb

6 months ago

@YuanqiD Thanks, you can find the paper here: https://t.co/DmiI6zGJuG

guyvdb retweeted

NeSy 2026 @nesyconf

7 months ago

Recordings of the NeSy 2025 keynotes are now available! 🎥 Check out insightful talks from @guyvdb, @tkipf and @dlmcguinness on our new YouTube channel. Topics include using symbolic reasoning for LLMs and object-centric representations. https://t.co/iIHdKTr432

Guy Van den Broeck @guyvdb

7 months ago

I gave a keynote at @nesyconf on "Symbolic Reasoning in the Age of Large Language Models". Check out the recording if you are curious about neurosymbolic generative AI: https://t.co/VUUJD4vdYB

guyvdb retweeted

Alex Chen @itisalex3

8 months ago

What happens when we compress the KV cache of prompts with multiple instructions? 🤔

Existing compression methods can lead to some instructions being ignored. 🙀

We propose simple changes to KV cache eviction that fix this problem alongside other pitfalls to be aware of. 💯 https://t.co/qCywAYhD4o

guyvdb retweeted

Tian Jin

@jintian

7 months ago

Plan autoregressively, denoise in parallel!

guyvdb retweeted

Ellie Cheng

@ellieyhc

7 months ago

Diffusion 🤝 Autoregressive

Fast high-quality generation

guyvdb retweeted

Daniel Israel

@danielmisrael

7 months ago

"An hour of planning can save you 10 hours of doing." ✨📝 Planned Diffusion 📝 ✨ makes a plan before parallel dLLM generation. Planned Diffusion runs 1.2–1.8× faster than autoregressive and an order of magnitude faster than diffusion, while staying within 0.9–5% of AR quality.
