Do you have work on resources, metrics & methodologies for evaluating multilingual systems?
Share it at the MME workshop 🕵️, co-located with EACL. The direct submission deadline is in 10 days (December 19th)!
https://t.co/RhG5cPr1fW
📢 Announcing the First Workshop on Multilingual and Multicultural Evaluation (MME) — co-located with #EACL2026 🇲🇦
📅 Mar 24–29, 2026 | Rabat, Morocco
MME focuses on resources, metrics & methodologies for evaluating multilingual systems! https://t.co/wpYQcebro0
🗓️ Submit by Dec 19, 2025
🔥 Thrilled to introduce DuPO (Dual Learning-based Preference Optimization)
- DuPO enables LLMs to get reliable and scalable self-supervision through duality-derived rewards.
- General application to various tasks, e.g., math reasoning and multilingual translation.
- Strong performance on various backbones, excelling both as a reward for training and as a reranker for inference.
🤗 Paper: https://t.co/hlPHtojLb2
📝 Blog: https://t.co/PxVQjRhzag
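A toy sketch of what a duality-derived reward could look like, used both as a training signal and as a reranker. This is not the DuPO implementation: the primal task (adding a known and a hidden number), the function names `dual_reward`, `toy_dual_solver`, and `rerank`, and the exact-match check are all assumptions made for illustration. In the real system the dual task would presumably be solved by the LLM itself.

```python
from typing import Callable, List

def toy_dual_solver(primal_output: int, known: int) -> int:
    # Hypothetical dual task for the toy primal task "compute known + unknown":
    # recover the hidden addend from the claimed sum and the known addend.
    return primal_output - known

def dual_reward(known: int, unknown: int, primal_output: int,
                dual_solver: Callable[[int, int], int]) -> float:
    # Reward is 1.0 when the dual task reconstructs the hidden input part
    # from the primal output plus the known part, else 0.0 (assumed scoring).
    reconstructed = dual_solver(primal_output, known)
    return 1.0 if reconstructed == unknown else 0.0

def rerank(known: int, unknown: int, candidates: List[int],
           dual_solver: Callable[[int, int], int]) -> List[int]:
    # Order candidate primal outputs by their dual reward, best first.
    # sorted() is stable, so ties keep their original relative order.
    return sorted(
        candidates,
        key=lambda c: dual_reward(known, unknown, c, dual_solver),
        reverse=True,
    )

if __name__ == "__main__":
    # Primal task: 3 + 4 = ?, where 4 plays the role of the hidden component.
    print(rerank(3, 4, [6, 7, 8], toy_dual_solver))
```

No human label for the correct sum is needed here: the score comes only from whether the hidden input can be recovered, which is the sense in which the feedback is annotation-free.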
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization
"We present DuPO, a dual learning-based preference optimization framework that generates annotation-free feedback via a generalized duality"
"DuPO decomposes a primal task’s input into known and unknown components, then constructs its dual task to reconstruct the unknown part using the primal output and known information (e.g., reversing math solutions to recover hidden variables)"
@sarahookr @Cohere_Labs @cohere You’ve done an incredible job leading Cohere and empowering the multilingual community. Wishing you the best in your next adventure!
Bytedance just dropped realtime voice translation 3x faster than before, with only a ~3s lag!
Seed LiveInterp 2 is a full duplex speech-to-speech model with >70% correctness.
When this makes it to video calls, it'll open up previously impossible connections.
Not a social media/X person, but still glad to announce Seed LiveInterpret 2.0. In short, it is an end-to-end, full-duplex speech-to-speech simultaneous interpretation model that achieves high-quality, ultra-low-latency S2S translation.
Website: https://t.co/PI6qiCOJs8
@devoto_alessio @PMinervini @simeng_ssun I'd like to recommend our LongBioBench as well! It supports infinite-length evaluation and enables controllable examination. https://t.co/jubYGZ8AsH
🚀 Call for Papers — @NeurIPSConf 2025 Workshop
Multi-Turn Interactions in LLMs
📅 December 6/7 · 📍 San Diego Convention Center
Join us to shape the future of interactive AI. Topics include but are not limited to:
🧠 Multi-Turn RL for Agentic Tasks (e.g., web & GUI agents, tool use)
🤝 Human-AI Interaction over time
🛡️ Alignment across extended interactions
📏 Evaluation of long-horizon tasks
🧩 Social learning, Open-Endedness, trust, and more
🌟 Featuring an all-star speaker lineup:
Dawn Song @dawnsongtweets (UC Berkeley)
Jason Weston @jaseweston (Meta FAIR)
Natasha Jaques @natashajaques (University of Washington & Google DeepMind)
Tim Rocktäschel @_rockt (UCL & DeepMind, tentative)
Diyi Yang @Diyi_Yang (Stanford)
Peter Henderson @PeterHndrsn (Princeton)
Yu Su @ysu_nlp (OSU)
Hannah Rose Kirk @hannahrosekirk (Oxford)
📣 Updates: Follow us here & spread the word!
#NeurIPS2025 #LLMs #AIAlignment #MultiAgent #ReinforcementLearning #LanguageAgents #InteractiveAI
🚀 Introducing Prefix-RFT to blend SFT and RFT!
SFT can learn more complex problems through imitation, but can generalize poorly. RFT has better overall performance but is limited by the initial policy. Our method, Prefix-RFT, gets the best of both worlds!
@Elaina43114880 @teortaxesTex Online testing will be available later. Stay tuned!
The service will be available soon. Stay tuned!
Currently, we recommend deploying Seed-X on your own device with our released weights.