1/ 🔥 Introducing RouteProfile to systematically study LLM profile design for routing!
🖥️ Installation: pip install routeprofile
🔧 Project Page: https://t.co/G8NcqOsA2M
🌟 GitHub Repo: https://t.co/pqKPl6eyPp
🤗 HuggingFace Collection: https://t.co/eBsjE3XCcF
📚 Paper: https://t.co/EeX7aipf4u
RouteProfile is a general framework for designing LLM profiles for routing. It formulates LLM profiling as a structured information integration problem over heterogeneous interaction histories, enabling more principled and effective routing across queries, domains, and models.
Introducing Auto-Dreamer 🧠💤
A research counterpart to @AnthropicAI's "Dreaming" for Claude Managed Agents, exploring the same idea: agents that consolidate their experience offline into compact, reusable memory.
We trained the consolidator with RL — shrinking the active memory bank 6-11× while gaining task success.
💥 Key Result: Beats 10 memory baselines on ScienceWorld + ALFWorld + WebArena, including RL-trained writers Mem-α and UMEM, with an order-of-magnitude smaller bank.
Auto-Dreamer is a two-timescale memory system inspired by complementary learning systems:
- A fast Writer appends entries online after each trajectory.
- A slow Consolidator wakes every k sessions and rewrites a region of the bank into compact synthesized entries via tool-use rollouts.
- The Consolidator is trained with GRPO + a counterfactual utility reward that scores entries by how much they actually help downstream retrieval.
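A rough sketch of the reward idea above, not the Auto-Dreamer code (which is not released yet). It assumes the counterfactual utility of an entry is "mean success with the entry in the bank minus mean success without it", and that GRPO-style advantages are rewards normalized within a group of candidates. `toy_solve`, the tasks, and the bank contents are hypothetical stand-ins.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence


def counterfactual_utility(
    entry: str,
    bank: Sequence[str],
    tasks: Sequence[str],
    solve: Callable[[str, Sequence[str]], float],
) -> float:
    """Mean success with `entry` added to the bank minus mean success without it."""
    with_entry = [solve(task, list(bank) + [entry]) for task in tasks]
    without_entry = [solve(task, list(bank)) for task in tasks]
    return mean(with_entry) - mean(without_entry)


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style normalization: center and scale rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def toy_solve(task: str, bank: Sequence[str]) -> float:
    """Hypothetical agent: succeeds if some memory entry mentions the task keyword."""
    return 1.0 if any(task in entry for entry in bank) else 0.0


tasks = ["boil", "melt", "freeze", "grow"]
bank = ["to melt ice, use the stove"]
candidates = [
    "to boil water, use the stove",                               # adds one new skill
    "to melt ice, use the stove",                                 # duplicate, adds nothing
    "to freeze water, use the freezer; to grow a plant, water it",  # adds two skills
]

rewards = [counterfactual_utility(c, bank, tasks, toy_solve) for c in candidates]
advantages = group_relative_advantages(rewards)
```

The duplicate entry gets zero utility, so a consolidator trained on this signal has no incentive to keep redundant memories. The real reward is defined over downstream retrieval and may differ in detail.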
Trained only on ScienceWorld, the consolidator transfers zero-shot to ALFWorld and WebArena — best in class on both.
Paper: https://t.co/l77XNVWC7D
Code release coming soon.
Advised by @youjiaxuan @McAuleyLabUCSD @GeLiuSaber
🧵 More figures below ↓
AI evaluation is entering an interactive benchmark era.
Across tool-use agents, web/OS benchmarks, multi-agent systems, and reliability evaluations, interaction is becoming central to how modern AI systems are tested.
But the field risks adding interaction faster than it develops the scientific principles for evaluating interaction.
Our position:
Interactive evaluation is not just longer tasks, tool use, or multi-turn interaction.
It requires a design science for mapping trajectories to valid evaluative claims.
📄 https://t.co/lKGuDOuBZy
💻 https://t.co/LkadiPYnnw
5/ 🧪 We evaluate across SimRouter, MLPRouter, and GraphRouter under standard and cold-start (new-LLM) settings.
3 key findings:
(1) Structured profiles consistently beat flat ones — across all routers
(2) Query-level signals are more reliable than coarse domain-level ones
(3) Cold-start generalization needs structured + trainable profile configurations most
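To make finding (2) concrete, here is a toy illustration, not the RouteProfile API and not a reproduction of the paper's results. It assumes a history of (LLM, domain, query, score) records and compares a coarse domain-level profile with a query-level one that weights past scores by token overlap. The model names "A" and "B", the history, and the Jaccard similarity are all made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One record per past interaction: (llm, domain, query, score in [0, 1]).
History = List[Tuple[str, str, str, float]]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def domain_profile(history: History) -> Dict[str, Dict[str, float]]:
    """Coarse profile: mean score per LLM per domain."""
    scores = defaultdict(lambda: defaultdict(list))
    for llm, domain, _query, score in history:
        scores[llm][domain].append(score)
    return {
        llm: {d: sum(v) / len(v) for d, v in domains.items()}
        for llm, domains in scores.items()
    }


def route_by_domain(profile: Dict[str, Dict[str, float]], domain: str) -> str:
    """Pick the LLM with the best average score in the query's domain."""
    return max(profile, key=lambda llm: profile[llm].get(domain, 0.0))


def route_by_query(history: History, query: str) -> str:
    """Pick the LLM with the best similarity-weighted score on past queries."""
    weighted = defaultdict(float)
    weights = defaultdict(float)
    for llm, _domain, past_query, score in history:
        w = jaccard(query, past_query)
        weighted[llm] += w * score
        weights[llm] += w
    return max(
        weights,
        key=lambda llm: weighted[llm] / weights[llm] if weights[llm] > 0 else 0.0,
    )


history: History = [
    ("A", "math", "add two numbers", 1.0),
    ("A", "math", "multiply two numbers", 1.0),
    ("A", "math", "area of a circle", 0.0),
    ("B", "math", "add two numbers", 0.0),
    ("B", "math", "multiply two numbers", 0.0),
    ("B", "math", "area of a circle", 1.0),
]

by_domain = route_by_domain(domain_profile(history), "math")
by_query = route_by_query(history, "area of a triangle")
```

In this made-up history, the domain average favors "A" for anything tagged math, while the query-level profile notices that "B" did better on the one similar past query.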
Continuous diffusion dominates image & video generation, but people used to believe that it inherently lags behind its discrete counterparts in language modeling.
Today, we challenge this belief with LangFlow: the first continuous diffusion language model that rivals—and even beats—discrete diffusion. (1/7)
Blog: https://t.co/EtZRSx9MQv
GitHub: https://t.co/NgWUDDAXd6
arXiv: https://t.co/2WfaQL7IZZ
🔥 Introducing MemReward — open-sourced!
📄 https://t.co/rcH133YcfP
🚀 https://t.co/K4blvj3ss2
🤗 https://t.co/sujXKtREu1
A graph-based experience memory framework that achieves near-Oracle RL fine-tuning with only 20% reward labels — and even surpasses full supervision on out-of-domain tasks.
💸 The Problem
RL fine-tuning for LLMs requires reward labels for every rollout. But labels are expensive:
math proofs need expert review, open-ended QA lacks ground truth, code verification is slow.
Label everything? Too costly. Label only 20%? Performance drops sharply.
🧠 How MemReward Works
We organize queries, chain-of-thought, and answers into a heterogeneous graph. A GNN trained on 20% labeled rollouts propagates reward signals to the remaining 80%.
Labeled rollouts → ground-truth rewards
Unlabeled rollouts → GNN-predicted rewards
Combined → efficient GRPO training
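A minimal sketch of the reward-mixing step above. This is not the MemReward implementation: the GNN is swapped for a plain nearest-neighbor lookup (copy the reward of the most similar labeled rollout by token overlap), purely to show how ground-truth and predicted rewards end up in one list. `Rollout`, `predict_reward`, and `combined_rewards` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rollout:
    query: str
    answer: str
    reward: Optional[float] = None  # None means the rollout is unlabeled


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def predict_reward(rollout: Rollout, labeled: List[Rollout]) -> float:
    """Stand-in for the GNN: reuse the reward of the most similar labeled rollout."""
    text = rollout.query + " " + rollout.answer
    nearest = max(labeled, key=lambda r: jaccard(text, r.query + " " + r.answer))
    return nearest.reward


def combined_rewards(rollouts: List[Rollout]) -> List[float]:
    """Ground-truth reward where available, predicted reward otherwise."""
    labeled = [r for r in rollouts if r.reward is not None]
    if not labeled:
        raise ValueError("need at least one labeled rollout")
    return [
        r.reward if r.reward is not None else predict_reward(r, labeled)
        for r in rollouts
    ]


rollouts = [
    Rollout("2 + 3 = ?", "the answer is 5", reward=1.0),   # labeled, correct
    Rollout("2 + 3 = ?", "the answer is 6", reward=0.0),   # labeled, wrong
    Rollout("2 + 3 = ?", "so the answer is 5"),            # unlabeled
    Rollout("2 + 3 = ?", "i think the answer is 6"),       # unlabeled
]

rewards = combined_rewards(rollouts)
```

The resulting list has one reward per rollout, so it can be handed to a GRPO trainer in place of a fully labeled reward source.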
📊 Key Results
On Qwen2.5-3B / 1.5B across 13 benchmarks (math, QA, code):
• 20% labels → 97.3% of Oracle performance (3B)
• Surpasses Oracle on OOD tasks: 66.96 vs 66.07 (3B), 62.81 vs 62.00 (1.5B)
• At 70% labels → 99.4% of Oracle
• Math reasoning benefits most: GSM8K +11.6, GSM-Sym +14.9 (1.5B)
✅ Heterogeneous graph with query-query, query-thinking, thinking-answer edges
✅ Cross-domain GNN: joint training, zero-shot OOD generalization
✅ Plug-and-play: drop-in replacement for reward sources in GRPO
✅ Smooth scaling: more labels → better results, starting from just 20%
Label less. Learn more. 🚀
If you find this useful, please give us a ⭐ on GitHub!
#LLM #Memory #Opensource
Yes, it's about 3-4 times slower than a single-round router. However, the training of an agentic router can be optimized. If the user cares about both latency and task performance, the agentic router should learn to do single-round routing for simple queries and agentic routing for complex ones. Stay tuned to our project; we will be releasing some of our latest experimental results soon.
We release OpenClaw Router — Production-Ready LLM Routing
🚀 Code: https://t.co/FnZ0YdCmBk
📦 PyPI: https://t.co/Jqe6Q6HowA
🔥 Meet OpenClaw Router
Deploy LLMRouter as an OpenAI-compatible API with one command. Seamlessly integrate with Slack, Discord, WhatsApp via OpenClaw. Support multimodal understanding — route based on images, audio, video, not just text.
pip install llmrouter-lib && llmrouter serve
💸 Why OpenClaw Router?
Why pay GPT-5 prices for "What's the weather?" Smart routing = Significant Token Savings. Simple query → Cheap model. Complex query → Powerful model. Save 30-50% on inference costs without sacrificing quality.
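A toy sketch of the "simple query → cheap model" idea, not how LLMRouter actually decides. It assumes a keyword-and-length heuristic instead of a trained router, and the model names, prices, and token counts are invented for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, not a real quote


CHEAP = Model("small-model", 0.1)
STRONG = Model("large-model", 1.0)

# Hypothetical hints that a query needs the stronger model.
HARD_HINTS = ("prove", "derive", "debug", "refactor", "step by step")


def route(query: str) -> Model:
    """Send long or reasoning-heavy queries to the strong model, the rest to the cheap one."""
    text = query.lower()
    is_hard = len(text.split()) > 30 or any(hint in text for hint in HARD_HINTS)
    return STRONG if is_hard else CHEAP


def total_cost(
    queries: Sequence[str],
    tokens_per_query: int = 500,
    always_strong: bool = False,
) -> float:
    """Cost of serving all queries, with routing or with the strong model only."""
    total = 0.0
    for query in queries:
        model = STRONG if always_strong else route(query)
        total += model.usd_per_1k_tokens * tokens_per_query / 1000
    return total


queries = [
    "What's the weather?",
    "Prove that the sum of two even numbers is even.",
    "Translate 'hello' to French.",
    "Debug this stack trace step by step.",
]

routed_cost = total_cost(queries)
baseline_cost = total_cost(queries, always_strong=True)
savings = 1 - routed_cost / baseline_cost
```

How much `savings` comes out to depends entirely on the price gap and the query mix, so treat the numbers here as placeholders.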
🧠 Train Your Own Router
OpenClaw Router isn't just a server; it's a learning system. Train personalized routers on your own data, tailored to your domain. Every piece of user feedback and every usage pattern feeds back into router training, so the router keeps iterating toward a more user-friendly, cost-efficient OpenClaw.
✅ Routing Memory: RAG-powered decisions that learn from history
✅ Personalized Routing: Adapts to individual user preferences
✅ Feedback Loop: User interactions improve routing over time
✅ 16+ Strategies: KNN, SVM, MLP, BERT, Graph, RL, Agentic — switch with one flag
Route smarter. Train your own. Save more. 🚀
🎨 Introducing ComfyUI Interface for LLMRouter - Build your entire LLM routing pipeline visually!
We're excited to release a powerful visual interface that transforms how you work with LLMRouter. No more YAML configs or terminal scripts - just drag, drop, and connect nodes.
🔗 Project: https://t.co/0NDJuuwN4Y
🎨 ComfyUI: https://t.co/but9AaKoXX
5/5 ✨ Smart Features
• Intelligent caching: Skips regeneration when the config is unchanged
• Real-time monitoring via PreviewAny nodes
• Pre-configured example workflow included
• Multimodal support: Video understanding (Charades-Ego)
• Works with any ComfyUI setup
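One common way to get the caching behavior in the first bullet is to hash the config and regenerate only when the hash changes. The sketch below shows that pattern under the assumption that the config is JSON-serializable; `ConfigCache` is a hypothetical helper, not part of the ComfyUI interface or LLMRouter.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional


class ConfigCache:
    """Re-run `generate` only when the config content changes."""

    def __init__(self, generate: Callable[[Dict[str, Any]], Any]) -> None:
        self._generate = generate
        self._last_key: Optional[str] = None
        self._last_result: Any = None
        self.calls = 0  # how many times `generate` actually ran

    @staticmethod
    def _key(config: Dict[str, Any]) -> str:
        # sort_keys makes the hash independent of dict key order.
        payload = json.dumps(config, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, config: Dict[str, Any]) -> Any:
        key = self._key(config)
        if key != self._last_key:
            self._last_result = self._generate(config)
            self._last_key = key
            self.calls += 1
        return self._last_result


# Hypothetical "generation" step: here it just lists the config keys.
cache = ConfigCache(lambda cfg: sorted(cfg))

first = cache.get({"router": "knn", "k": 5})
second = cache.get({"k": 5, "router": "knn"})   # same content, different key order
calls_after_repeat = cache.calls
third = cache.get({"router": "knn", "k": 10})   # changed value forces regeneration
calls_after_change = cache.calls
```

Hashing the serialized content rather than comparing object identity means a config that is rebuilt from the UI with the same values still hits the cache.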