Ke Wang @wangkeml - Twitter Profile

15 days ago

This was a great collaboration with @ShuangqiLi, @MathieuSalzmann, @pafrossard 🙌 💻 Code: https://t.co/eht4nNiXng 📖 Paper: https://t.co/jBG9HpxzJt 📜 9/9

0

5

1

108

Ke Wang @wangkeml

15 days ago

SFT's token-by-token imitation can overfit to fixed demonstrations 📑, raising a key question: should every token be trusted equally? We introduce PriFT ✏️, an SFT framework using a frozen pretrained model 🧊 to guide token reweighting and align SFT with prior knowledge. 📜 1/9

1

8

6

2

493

Ke Wang @wangkeml

15 days ago

PriFT also provides a better initialization for subsequent RL training 🏋️, leading to stronger RL results than SFT or DFT initialization. 📜 8/9 https://t.co/eUHG2k6iPN

1

4

0

1

74

wangkeml retweeted

Alessandro Favero @alesfav

about 1 month ago

AI needs vastly more data than we do. One idea might close the gap: don't predict raw signals (tokens), predict your own abstract latent representation (JEPA, data2vec).

With @DanKorchinski @MatthieuWyart, on a toy model, we prove how much that helps: the gap is exponential.

🧵 https://t.co/I51Q6Jwiqr

14

528

80

474

53K

wangkeml retweeted

Andrea Miele @andreamiele_

about 1 month ago

🔥 New paper: Fixed-Point Masked Generative Modeling

Masked generative models are becoming a very exciting alternative to autoregressive generation, especially for language.

They decode in parallel, but every denoising step still runs a full bidirectional Transformer.

We make them cheaper and stronger with fixed-point denoisers 🧵

w/ @qinym710 @AlbaCbCs @jdeschena and @pafrossard
(1/12)

5

144

27

100

19K

wangkeml retweeted

Yiming Qin @qinym710

about 1 month ago

Check out CoFRe 🪎, a complete training-to-inference framework for fixed-point masked generation! It improves the quality vs. cost tradeoff for both text and visual data. Amazing work led by @andreamiele_

0

4

2

3

263

wangkeml retweeted

Olga Zaghen @olgazaghen

8 months ago

Cool news: our extended Riemannian Gaussian VFM paper is out! 🔮 We define and study a variational objective for probability flows 🌀 on manifolds with closed-form geodesics. @FEijkelboom @a_ppln @CongLiu202212 @wellingmax @jwvdm @erikjbekkers 🔥 📜 https://t.co/PE6I6YcoTn

3

65

27

29

15K

Ke Wang @wangkeml

9 months ago

A bit late for the announcement, but very happy to share that MEMOIR has been accepted to NeurIPS 2025 🎉! Great collaboration with @qinym710, @nikdimitriadis, @alesfav, and @pafrossard! See you in San Diego!

Yiming Qin @qinym710

about 1 year ago

How can we inject new knowledge into LLMs without full retraining, forgetting, or breaking past edits?

We introduce MEMOIR 📖, a scalable framework for lifelong model editing that reliably rewrites thousands of facts sequentially using a residual memory module. 🔥

🧵1/7 https://t.co/vENKHsotQv

26

1K

170

1K

111K

0

8

4

1

2K

wangkeml retweeted

Maksym Andriushchenko @maksym_andr

11 months ago

🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨

Hiring. I'm looking for multiple PhD students: both those able to start in Fall 2025 (i.e., as soon as possible) and through centralized programs like CLS, IMPRS, and ELLIS (the deadlines are in November) to start in Spring–Fall 2026. I'm also searching for postdocs, master's thesis students, and research interns. Fill out the Google form below if you're interested!

Research group. We will focus on developing algorithmic solutions to reduce harms from advanced general-purpose AI models. We're particularly interested in alignment of autonomous LLM agents, which are becoming increasingly capable and pose a variety of emerging risks. We're also interested in rigorous AI evaluations and informing the public about the risks and capabilities of frontier AI models. Additionally, we aim to advance our understanding of how AI models generalize, which is crucial for ensuring their steerability and reducing associated risks. For more information about research topics relevant to our group, please check the following documents:
- International AI Safety Report,
- An Approach to Technical AGI Safety and Security by DeepMind,
- Open Philanthropy’s 2025 RFP for Technical AI Safety Research.

Research style. We are not necessarily interested in getting X papers accepted at NeurIPS/ICML/ICLR. We are interested in making an impact: this can be papers (and NeurIPS/ICML/ICLR are great venues), but also open-source repositories, benchmarks, blog posts, even social media posts, literally anything that can be genuinely useful for other researchers and the general public.

Broader vision. Current machine learning methods are fundamentally different from what they used to be pre-2022. The Bitter Lesson summarized and predicted this shift very well back in 2019: "general methods that leverage computation are ultimately the most effective". Taking this into account, we are only interested in studying methods that are general and scale with intelligence and compute. Everything that helps to advance their safety and alignment with societal values is relevant to us. We believe getting this (some may call it "AGI") right is one of the most important challenges of our time.

Join us on this journey!

76

837

90

293

106K

wangkeml retweeted

Skander Moalla @SkanderMoalla

12 months ago

🚀 Big time! We can finally do LLM RL fine-tuning with rewards and leverage offline/off-policy data!

❌ You want rewards, but GRPO only works online?
❌ You want offline, but DPO is limited to preferences?
✅ QRPO can do both!

🧵Here's how we do it: https://t.co/H2k07GbFne

3

145

37

156

25K

wangkeml retweeted

Manuel Madeira @manuelmlmadeira

12 months ago

Excited to present #DeFoG with @qinym710 at #ICML2025! Catch our oral today at 3:30 PM (West Exhibition Hall C) and join us at the poster after (4:30–7:00 PM, East Exhibition Hall A-B #E-3004). Come chat graphs & generative models!

1

10

4

0

701

wangkeml retweeted

Yiming Qin @qinym710

12 months ago

🚀 Presenting #DeFoG: our discrete flow‑matching framework for graph generation! Catch our #ICML2025 oral presentation today (3:30–3:45 PM, in West Exhibition Hall C) and drop by the poster right after (4:30–7:00). Come chat graphs & generative models! @manuelmlmadeira

1

11

3

0

1K

wangkeml retweeted

Vincent Jung @jungvinc

about 1 year ago

🧬 New roadmap out in Nature Reviews Molecular Cell Biology! 🤖 We show how RNA-LMs + GNNs can come together to model the RNA interactome & uncover new roles for non-coding RNA. 💊 Clinical links to RNA therapies for cancer & neuro diseases. 📄 Read it: https://t.co/JICDv1LRd9

0

12

5

2

1K

Ke Wang @wangkeml

about 1 year ago

@_simonsmith In contrast, our method updates the model weights directly to inject knowledge, enabling it to generalize across semantically related prompts, as demonstrated by our SOTA performance on the generalization metric.

0

41
