Tao Feng @taofeng_uiuc - Twitter Profile

Pinned Tweet

about 2 months ago

1/ 🔥 Introducing RouteProfile to systematically study LLM profile design for routing!

🖥️ Installation: pip install routeprofile
🔧 Project Page: https://t.co/G8NcqOsA2M
🌟 GitHub Repo: https://t.co/pqKPl6eyPp
🤗 HuggingFace Collection: https://t.co/eBsjE3XCcF
📚 Paper: https://t.co/EeX7aipf4u

RouteProfile is a general framework for designing LLM profiles for routing. It formulates LLM profiling as a structured information integration problem over heterogeneous interaction histories, enabling more principled and effective routing across queries, domains, and models.

2

9

4

1

535

taofeng_uiuc retweeted

Chongrui Ye

@chongrui28836

about 1 month ago

Introducing Auto-Dreamer 🧠💤

A research counterpart to @AnthropicAI's "Dreaming" for Claude Managed Agents, exploring the same idea: agents that consolidate their experience offline into compact, reusable memory.

We trained the consolidator with RL, shrinking the active memory bank 6-11× while gaining task success.

💥 Key Result: Beats 10 memory baselines on ScienceWorld + ALFWorld + WebArena, including RL-trained writers Mem-α and UMEM, with an order-of-magnitude smaller bank.

Auto-Dreamer is a two-timescale memory system inspired by complementary learning systems:

- A fast Writer appends entries online after each trajectory.
- A slow Consolidator wakes every k sessions, rewrites a region of the bank into compact synthesized entries via tool-use rollouts.
- Trained with GRPO + a counterfactual utility reward that scores entries by how much they actually help downstream retrieval.

Trained only on ScienceWorld, the consolidator transfers zero-shot to ALFWorld and WebArena, best in class on both.

Paper: https://t.co/l77XNVWC7D
Code release coming soon.

Advised by @youjiaxuan @McAuleyLabUCSD @GeLiuSaber

🧵 More figures below ↓

14

233

41

287

35K
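The two-timescale idea in the Auto-Dreamer post above (a fast writer that appends after every trajectory, a slow consolidator that wakes every k sessions) can be sketched roughly as follows. This is only an illustration under loud assumptions: the class and method names are hypothetical, the consolidation step is a plain merge of entries that share a topic rather than the RL-trained consolidator from the paper, and it rewrites the whole bank instead of a selected region.

```python
from collections import OrderedDict


class MemoryBank:
    """Toy two-timescale memory: append online, compact every k sessions."""

    def __init__(self, k):
        self.k = k            # consolidate once every k sessions
        self.entries = []     # list of (topic, note) pairs
        self.sessions = 0

    def write(self, topic, note):
        # Fast path: append an entry right after a trajectory.
        self.entries.append((topic, note))

    def end_session(self):
        # Slow path: wake the consolidator on every k-th session.
        self.sessions += 1
        if self.sessions % self.k == 0:
            self.consolidate()

    def consolidate(self):
        # Stand-in for the learned consolidator (assumption): merge entries
        # with the same topic and drop exact duplicate notes.
        merged = OrderedDict()
        for topic, note in self.entries:
            notes = merged.setdefault(topic, [])
            if note not in notes:
                notes.append(note)
        self.entries = [(t, "; ".join(n)) for t, n in merged.items()]


bank = MemoryBank(k=2)
bank.write("door", "needs key")
bank.end_session()                      # session 1: no consolidation yet
bank.write("door", "needs key")
bank.write("door", "key is in drawer")
bank.write("stove", "turn knob")
bank.end_session()                      # session 2: consolidator runs
```

After the second session the four raw entries collapse into two, one per topic. The real system decides what to rewrite with a trained policy and a utility reward, which this merge rule does not attempt to model.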

taofeng_uiuc retweeted

Keyang Xuan

@keyang_xuan

about 1 month ago

AI evaluation is entering an interactive benchmark era.

Across tool-use agents, web/OS benchmarks, multi-agent systems, and reliability evaluations, interaction is becoming central to how modern AI systems are tested.

But the field risks adding interaction faster than it develops the scientific principles for evaluating interaction.

Our position:
Interactive evaluation is not just longer tasks, tool use, or multi-turn interaction.
It requires a design science for mapping trajectories to valid evaluative claims.

📄 https://t.co/lKGuDOuBZy
💻 https://t.co/LkadiPYnnw

13

108

30

72

15K

Tao Feng

@taofeng_uiuc

about 2 months ago

6/ 🙌 Huge thanks to our main contributors @jingjunxu1128, @JPJP3648, and @haozhen_ntu. Special thanks to our advisors @GeLiuSaber and @youjiaxuan for their guidance and support!

1

2

1

0

164

Tao Feng

@taofeng_uiuc

about 2 months ago

5/ 🧪 We evaluate across SimRouter, MLPRouter, and GraphRouter under standard and cold-start (new-LLM) settings.

3 key findings:
(1) Structured profiles consistently beat flat ones across all routers
(2) Query-level signals are more reliable than coarse domain-level ones
(3) Cold-start generalization needs structured + trainable profile configurations most

1

0

92

taofeng_uiuc retweeted

Chumeng Liang

@lowerbad

2 months ago

Continuous diffusion dominates image & video generation, but people used to believe that it inherently lags behind its discrete counterparts in language modeling. Today, we challenge this belief with LangFlow: the first continuous diffusion language model that rivals—and even beats—discrete diffusion. (1/7) Blog: https://t.co/EtZRSx9MQv GitHub: https://t.co/NgWUDDAXd6 Arxiv: https://t.co/2WfaQL7IZZ

7

178

30

134

24K

Tao Feng

@taofeng_uiuc

3 months ago

Huge thanks to our first author, @PeterLuo2077, and our amazing supervisors, @youjiaxuan and @GeLiuSaber.

0

94

Tao Feng

@taofeng_uiuc

3 months ago

🔥 Introducing MemReward, now open-sourced!

📄 https://t.co/rcH133YcfP
🚀 https://t.co/K4blvj3ss2
🤗 https://t.co/sujXKtREu1

A graph-based experience memory framework that achieves near-Oracle RL fine-tuning with only 20% reward labels, and even surpasses full supervision on out-of-domain tasks.

💸 The Problem

RL fine-tuning for LLMs requires reward labels for every rollout. But labels are expensive:
math proofs need expert review, open-ended QA lacks ground truth, code verification is slow.
Label everything? Too costly. Label only 20%? Performance drops sharply.

🧠 How MemReward Works

We organize queries, chain-of-thought, and answers into a heterogeneous graph. A GNN trained on 20% labeled rollouts propagates reward signals to the remaining 80%.

Labeled rollouts → ground-truth rewards
Unlabeled rollouts → GNN-predicted rewards
Combined → efficient GRPO training

📊 Key Results

On Qwen2.5-3B / 1.5B across 13 benchmarks (math, QA, code):

• 20% labels → 97.3% of Oracle performance (3B)
• Surpasses Oracle on OOD tasks: 66.96 vs 66.07 (3B), 62.81 vs 62.00 (1.5B)
• At 70% labels → 99.4% of Oracle
• Math reasoning benefits most: GSM8K +11.6, GSM-Sym +14.9 (1.5B)

✅ Heterogeneous graph with query-query, query-thinking, thinking-answer edges
✅ Cross-domain GNN: joint training, zero-shot OOD generalization
✅ Plug-and-play: drop-in replacement for reward sources in GRPO
✅ Smooth scaling: more labels → better results, starting from just 20%

Label less. Learn more. 🚀

If you find this useful, please give us a ⭐ on GitHub!

#LLM #Memory #Opensource

1

9

2

7

916
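The reward-mixing step in the MemReward post above (ground-truth rewards for labeled rollouts, predicted rewards for the rest, then GRPO) can be sketched roughly like this. It is a minimal illustration, not the project's code: the function names are hypothetical, a trivial string check stands in for the GNN predictor, and the advantage uses the common GRPO-style normalization of each reward by its group's mean and standard deviation.

```python
from statistics import mean, pstdev


def mix_rewards(rollouts, predict_reward):
    """Use the ground-truth reward when present, else the predictor's estimate."""
    return [
        r["reward"] if r["reward"] is not None else predict_reward(r)
        for r in rollouts
    ]


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: center and scale rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(x - mu) / (sigma + eps) for x in rewards]


def toy_predictor(rollout):
    # Hypothetical stand-in for the GNN reward predictor (assumption).
    return 1.0 if "42" in rollout["answer"] else 0.0


# One group of rollouts for the same query; None marks a missing label.
group = [
    {"answer": "42", "reward": 1.0},
    {"answer": "41", "reward": 0.0},
    {"answer": "so the result is 42", "reward": None},
    {"answer": "7", "reward": None},
]

rewards = mix_rewards(group, toy_predictor)
advantages = group_advantages(rewards)
```

Because the mixing happens before the advantage computation, the policy update does not need to know which rewards were labeled and which were predicted, which is one way to read the "drop-in replacement for reward sources" claim.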

taofeng_uiuc retweeted

Vuk Rosić 武克

@VukRosic99

4 months ago

You can do AI research on JEPA on 1 GPU, no more "I don't have GPUs" excuse - https://t.co/Hc4Qc9lSIm

6

664

53

597

38K

Tao Feng

@taofeng_uiuc

4 months ago

Yes, it's about 3-4 times faster than a single-round router. However, the training of an agency router can be optimized. If the user considers both latency and task performance, the agency router should learn to perform single-round routing for simple queries and routing based on the agency mechanism for complex queries. Stay tuned to our project; we will be releasing some of the latest experimental results soon.

0

24

Tao Feng

@taofeng_uiuc

4 months ago

We release OpenClaw Router: Production-Ready LLM Routing 🚀

Code: https://t.co/FnZ0YdCmBk
📦 PyPI: https://t.co/Jqe6Q6HowA

🔥 Meet OpenClaw Router
Deploy LLMRouter as an OpenAI-compatible API with one command. Seamlessly integrate with Slack, Discord, WhatsApp via OpenClaw. Support multimodal understanding: route based on images, audio, video, not just text.

pip install llmrouter-lib && llmrouter serve

💸 Why OpenClaw Router?
Why pay GPT-5 prices for "What's the weather?" Smart routing = Significant Token Savings. Simple query → Cheap model. Complex query → Powerful model. Save 30-50% on inference costs without sacrificing quality.

🧠 Train Your Own Router
OpenClaw Router isn't just a server, it's a learning system. Train personalized routers on your own data, tailored to your domain. Every user feedback, every usage pattern feeds back into router training, continuously iterating toward a more user-friendly, cost-efficient OpenClaw.

✅ Routing Memory: RAG-powered decisions that learn from history
✅ Personalized Routing: Adapts to individual user preferences
✅ Feedback Loop: User interactions improve routing over time
✅ 16+ Strategies: KNN, SVM, MLP, BERT, Graph, RL, Agentic (switch with one flag)

Route smarter. Train your own. Save more. 🚀

7

36

10

51

22K
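The "simple query → cheap model, complex query → powerful model" idea in the OpenClaw Router post above can be illustrated with a tiny rule-based router. This is a sketch under stated assumptions and not the LLMRouter API: the model names, the hint list, the scoring rule, and the threshold are all hypothetical, and the released library uses trained strategies (KNN, MLP, Graph, and others) instead of a hand-written heuristic.

```python
CHEAP_MODEL = "cheap-model"      # hypothetical model name
STRONG_MODEL = "strong-model"    # hypothetical model name

# Hypothetical phrases that hint at multi-step reasoning.
REASONING_HINTS = ("prove", "derive", "debug", "step by step", "explain why")


def complexity_score(query: str) -> float:
    """Crude complexity estimate in [0, 1] from length and keyword hints."""
    words = query.split()
    length_term = min(len(words) / 50.0, 1.0)
    lowered = query.lower()
    keyword_term = 0.5 if any(h in lowered for h in REASONING_HINTS) else 0.0
    return min(length_term + keyword_term, 1.0)


def route(query: str, threshold: float = 0.4) -> str:
    """Send low-scoring queries to the cheap model, the rest to the strong one."""
    if complexity_score(query) >= threshold:
        return STRONG_MODEL
    return CHEAP_MODEL


simple = route("What's the weather?")
hard = route("Prove that the sum of two even integers is even, step by step.")
```

A learned router replaces `complexity_score` with a model trained on past query and outcome data, which is where the feedback loop described in the post would plug in.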

Tao Feng

@taofeng_uiuc

4 months ago

Thanks to our main contributor @liangqi_yuan, and other contributors @haozhen_ntu, @lei_zijie. Special thanks to our advisors Prof. @GeLiuSaber and Prof. @youjiaxuan for their guidance and support! 🙏

0

1

0

348

Tao Feng

@taofeng_uiuc

4 months ago

🎨 Introducing ComfyUI Interface for LLMRouter - Build your entire LLM routing pipeline visually! We're excited to release a powerful visual interface that transforms how you work with LLMRouter. No more YAML configs or terminal scripts - just drag, drop, and connect nodes. 🔗 Project: https://t.co/0NDJuuwN4Y 🎨 ComfyUI: https://t.co/but9AaKoXX

2

9

6

3

6K

Tao Feng

@taofeng_uiuc

4 months ago

5/5 ✨ Smart Features
• Intelligent caching: Skips regeneration when config unchanged
• Real-time monitoring via PreviewAny nodes
• Pre-configured example workflow included
• Multimodal support: Video understanding (Charades-Ego)
• Works with any ComfyUI setup

1

0

196
