Rasool Fakoor

1 day ago

@TsendeeMTS @ChangHao564792d Great paper. We also studied and proposed a form of DAgger-style training for LLMs, but for SFT using scheduled sampling. https://t.co/rSFTBBfLIq

0

16

rasoolfa retweeted

BosonAI

@boson_ai

1 day ago

Higgs Audio v3 TTS is here. Built for voice AI that speaks, not just reads: • 100 languages with single-digit WER/CER • inline control over emotion, style, prosody, and sound effects • API, Workspace, and open weights • Blog 👉 https://t.co/C8frDlfO5D Watch the demo 👇

13

360

57

346

40K

3 days ago

@cwolferesearch I'd add https://t.co/bdzM4JJbIO to your list! I've implemented it in a very modular & clean way, so the system stays the system and the algorithm stays the algorithm. It helps you understand how RL training works without having to understand the entire stack. Take a look and you will see the difference!

0

1

0

69

3 days ago

"agentic" should not require fundamentally different training stacks, as long as the framework easily supports envs. Check out https://t.co/30CAbnxwIn It is a clean & modular RL framework. We'll add an env example soon, but the core training/rollout code is the same. U r welcome to contribute tho!

0

1

0

43

3 days ago

code: https://t.co/FHj9y2TCyo paper: https://t.co/XGtDYC2XNK

0

96

3 days ago

Off-policy data does not have to be a bug in RL. In our work, we shift the question from "Is this data on-policy?" to "How much should we trust this batch?" That change leads to an adaptive objective for RL training of LLMs. blog: https://t.co/HIJUbi7hUg

2

4

0

195

rasoolfa retweeted

Alex Smola

@smolix

3 days ago

New blog post: Effective Sample Size. Reweighting data to fix distribution shift kills bias but piles everything onto a few points. ESS measures how much data you actually have left, the dial for when a replay buffer goes stale. https://t.co/YE6bzhVw64
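
For readers who want the idea in concrete form, here is a minimal sketch of effective sample size using Kish's formula, ESS = (sum of weights)^2 / (sum of squared weights). This is one common definition and is an assumption here; the linked blog post may use a different variant. The function name `effective_sample_size` and the example weights are hypothetical.

```python
def effective_sample_size(weights):
    """Kish effective sample size for non-negative importance weights.

    Assumption: this is the standard Kish definition, not necessarily
    the exact quantity used in the linked blog post.
    """
    weights = list(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / total_sq


# Uniform weights: every sample counts, so ESS equals the sample count.
uniform = [1.0] * 100

# Hypothetical skewed weights: one sample dominates after reweighting,
# so ESS collapses toward 1 even though 100 samples are stored.
skewed = [100.0] + [0.01] * 99

if __name__ == "__main__":
    print(effective_sample_size(uniform))
    print(effective_sample_size(skewed))
```

A low ESS relative to the buffer size is the signal the tweet describes: the weights needed to correct the shift have concentrated on a few points, so the reweighted data carries little information.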

0

5

1

2

551

rasoolfa retweeted

10 days ago

Shoutout to my fantastic co-organizers for making the first-ever workshop on RL Environments & Agent Evals such a success! @rasoolfa, @anishathalye, @aagohary, @natashajaques, @TheAndiPenguin, @migballesteros, Aziza Mirsaidova, Priyaranjan Pattnayak, Ahmed Elgohary, Alina Gavrilov, Aparna Elangovan, Graham Horwood

0

7

2

1

319

rasoolfa retweeted

10 days ago

Packed room to hear @alexgshaw and @ryanmart3n break down how @harborframework grew into *the* framework for RL environments. In our RLEval workshop at @CAISconf today, attendees tackled big open challenges in RLEs & Agent Evals + I shared the approach we take at @joinHandshake

2

33

10

3

6K

11 days ago

Join us tmrw if you are around at RLEval, first edition ever. Together with my wonderful co-organizers (@jomulr, @anishathalye, Alina Gavrilov, Aziza Mirsaidova) we have put together an exciting program with a great lineup of speakers and papers. See the full program here: https://t.co/JZd9U2SzlH ps: I’ll also give a talk on what is wrong with current RL methods and frameworks, and why these issues can slow progress in using RL for large models.

Alex Smola

@smolix

11 days ago

Tomorrow in San Jose: RLEval. Trillions going into LLM agents and we still cannot reliably evaluate them. 19 papers, talks from Alex Dimakis, Corby Rosset, Rasool Fakoor, and others. I'll be presenting Submodular Benchmark Selection from @boson_ai. https://t.co/WksVTSirZp

1

27

2

22

4K

0

4

0

224

29 days ago

@vivek_2332 @adithya_s_k have u tried https://t.co/bdzM4JJbIO? I've released it for the very same reasons (your 2nd and 3rd items). try it and lemme know what u think!

0

14

rasoolfa retweeted

about 1 month ago

@adithya_s_k Super useful resource, thank you for putting it together! Researchers working on RLE design & Agent Evals might consider submitting papers / attending the first-ever Workshop in this area at the upcoming ACM Conference on AI and Agentic Systems: https://t.co/FpWtMJsnv1

1

11

4

5

2K

about 1 month ago

Once you check it out, you’ll see the difference immediately. As #ICLR2026 wraps up, this might be a good starting point for your next idea, startup, project, or conference submission.

0

1

0

174

about 1 month ago

Too many RL ideas die at the edge of the LLM/VLM/VLA training stack. Not anymore. With FeynRL, new algorithm ideas do not have to fight the whole stack 🚀. Focus on the alg while still training very large models. https://t.co/30CAbnxwIn Try it, 🌟 it, send feedback.

1

6

2

1

657

about 1 month ago

One thing I keep hearing is that RL for L(L)Ms is "mostly a systems problem now" and the RL part is basically good enough. I really don’t buy that. Current RL algs are still fragile as hell. Better systems help, but they don’t magically make the RL problem go away.

0

2

0

165

rasoolfa retweeted

about 2 months ago

Our organizing team is excited for a productive day of discussion with you on May 26: @natashajaques, @TheAndiPenguin, @rasoolfa, @anishathalye, @migballesteros, @aagohary, Aziza Mirsaidova, Priyaranjan Pattnayak, Ahmed Elgohary, Alina Gavrilov, Aparna Elangovan, Graham Horwood

0

3

1

0

255

about 2 months ago

Are you working on RL, principled ways to build RL envs for agent training, or effective evaluation for agents? Want to showcase your NeurIPS submission, or just discuss research more broadly? Then consider submitting to and attending our first-ever workshop on Methods and RL Environments for Evaluating AI Agents. Deadline: May 11 https://t.co/MOxooUom5i

about 2 months ago

📢 Call for papers: Workshop on Methods and Reinforcement Learning Environments for Evaluating AI Agents @ ACM CAIS 2026 (inaugural edition!) Topics include: - Design principles for effective RL Environments - Methods to evaluate Agents, esp. causal/interventional techniques

1

8

3

2

7K

0

7

1

2

1K

2 months ago

@novasarc01 @oneill_c Well, we released one, but we want to focus back on RL rather than on systems. The goal is to provide a clean framework that people can understand and use to build new RL algs without having to deal with convoluted code. Take a look and you'll see the difference https://t.co/bdzM4JJbIO

0

33