🦞 Calling all #OpenClaw builders, tinkerers, & #AI makers!
Meet others pushing their lobsters beyond prototypes — sharing the workflows, wins, and hard lessons of building real AI-driven value 🦾
🗓 Mar 28, 2–5PM | HKU iCube
💡 Free reg: https://t.co/71mxIB4gyE
#HKUCAMO
Train on tokens, infer on raw bytes — no tokenizer, no architecture changes. Proxy Compression shows byte-level LMs can match or beat tokenizer baselines at 7B/14B scale. One step closer to a tokenizer-free future.
Introducing proxy compression for end-to-end language modeling: train on compressed (e.g., tokenized) data for efficiency, but run inference entirely on raw bytes without a tokenizer. No architectural changes required. At scale, proxy-trained byte models match or surpass tokenizer baselines at 7B and 14B.
📄 Paper: https://t.co/4NGVagTocP
💻 Code: https://t.co/tPcbReJ915
[1/9]
🧵👇
📢Dec 22 (Mon): Diffusion Beats AR: Code Generation
Discrete diffusion models now rival autoregressive (AR) models on challenging coding benchmarks, making them a compelling alternative to AR models.
This Monday, Shansan Gong (@sansa19739319) will present recipes for training masked diffusion models to reach such coding performance, and will reveal several surprising inference-time behaviors of these models.
Paper: https://t.co/r9SBOKdgAh
Reptile introduces a promising approach to CLI agents by integrating human-in-the-loop feedback directly into the terminal workflow for future model training.🚀🚀🚀
🚀We propose Reptile, a Terminal Agent🤖️that enables interaction with an LLM agent directly in your terminal. The agent can execute any command or custom CLI tool to accomplish tasks, and users can define their own tools and commands for the agent to utilize.
✨What Makes Reptile Special?
Compared with other CLI agents (e.g., Claude Code and Mini SWE-Agent), Reptile stands out for the following reasons:
⚡️Human-in-the-Loop Learning: Users can inspect every step and provide prompt feedback, i.e., give feedback under the USER role or edit the LLM generation under the ASSISTANT role. The interaction will be used for model SFT training & RL training.
💻Terminal-only beyond Bash-only: Simple and stateful execution, which is more efficient than bash-only (you don’t need to specify the environment in every command). It doesn’t require the complicated MCP protocol—just a naive bash tool under the REPL protocol.
Github: https://t.co/AmrCJWA0Ls
Homepage: https://t.co/kK73JkQoi0
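To make the "stateful execution" point concrete, here is a minimal Python sketch contrasting a fresh shell per command with one long-lived shell process. It is not Reptile's implementation: `ShellSession`, `run_stateless`, the `__DONE__` sentinel, and the `REPTILE_DEMO` variable are hypothetical names chosen for illustration, and the sketch assumes a POSIX `sh` is on the PATH.

```python
import subprocess


def run_stateless(command):
    """Run a command in a brand-new shell; nothing persists between calls."""
    result = subprocess.run(["sh", "-c", command], capture_output=True, text=True)
    return result.stdout.strip()


class ShellSession:
    """One long-lived shell, so `cd` and exported variables carry over."""

    def __init__(self):
        self.proc = subprocess.Popen(
            ["sh"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
        )

    def run(self, command, sentinel="__DONE__"):
        # Echo a sentinel after the command so we know where its output ends.
        # Assumption: the command's output ends with a newline.
        self.proc.stdin.write(f"{command}\necho {sentinel}\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if line == "" or line.rstrip("\n") == sentinel:
                break
            lines.append(line.rstrip("\n"))
        return "\n".join(lines)

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()


if __name__ == "__main__":
    session = ShellSession()
    session.run("export REPTILE_DEMO=hello")
    print(session.run('echo "value=$REPTILE_DEMO"'))  # variable is still set
    session.close()
    print(run_stateless('echo "value=$REPTILE_DEMO"'))  # variable is gone
```

In the stateless version, every command would have to re-export variables or re-enter the working directory, which is the overhead the post attributes to bash-only agents.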
We will have a guest talk from Cai Zhou, a second-year PhD student at MIT EECS, titled "Continuous modeling in diffusion language models: HDLM and CCDD". All are welcome to join via the following link.
https://t.co/ZlLDO5pKRH
What happened after Dream 7B?
First, Dream-Coder 7B: A fully open diffusion LLM for code delivering strong performance, trained exclusively on public data.
Plus, DreamOn cracks the variable-length generation problem! It enables code infilling that goes beyond a fixed canvas.
Jane Austen meets AI and Physiology? Don't know what I mean?🤔 You got 20m, I'll tell you🤫😀 Keynote:🌟Pride and Prejudice: AI meets Physiology Education🌟
Video https://t.co/SqCDq7wIKe
Slides https://t.co/4lIGOrixZ3
#NLProc @wing_nus @NUSComputing @nusaiinstitute
# 🚨 4B open-recipe model beats Claude-4-Opus
🔓 100% open data, recipe, model weights and code.
Introducing Polaris✨, a post-training recipe for scaling RL on advanced reasoning models.
🥳 Check out how we boost open-recipe reasoning models to incredible performance levels (65 → 79 on AIME25) through RL training on open-source data and academic-level resources.
📑Notion: https://t.co/k5ITJFzCe1
📗Blog post: https://t.co/Leth9PWSod
🤗Model & data: https://t.co/SVdfIwYTrU
💻Code: https://t.co/txg0qcywWi
🚀 Incredible work! PromptCoT-Mamba sets a new bar: the first constant-memory reasoning model outperforming Transformers on tough math & code benchmarks.
No attention, no KV cache — just pure efficient decoding.
🔥 Meet PromptCoT-Mamba
The first reasoning model with constant-memory inference to beat Transformers on competition-level math & code
⚡ Efficient decoding: no attention, no KV cache
⚡ +16.0% / +7.1% / +16.6% vs. s1.1-7B on AIME 24 / 25 / LiveCodeBench
🚀 Up to 3.66× faster
🎉Introducing our latest work: "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"
🤗 Huggingface: https://t.co/56CJS6Qzg9
🏠Homepage: https://t.co/jU6mHlFIoU
TLDR: We introduce ScienceBoard, featuring (1) a dynamic OS env with real scientific software (CLI + GUI), and (2) a human-validated benchmark spanning domains like biochem, astronomy, GIS, ATP, and more.
🧵[1/5]
We are kicking off a series of seminars at @hkunlp2020. @siyan_zhao will be giving a talk titled "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" at ⏰Friday 5.9 11am HKT (Thursday 5.8 8pm PDT). Link to talk: https://t.co/i9FsWYRNbZ
🚀 Meet PromptCoT-QwQ-32B, a breakthrough in mathematical reasoning! Outperforming all open-source models on AIME2024 and AIME2025, including Nemotron-Ultra-253B, DeepSeek-R1-671B, and QwQ-32B! 🔥
🤔 Always wondering if a next-token prediction model is the end of planning and reasoning.
🎯 Now excited to announce our team's latest research on exploring a new paradigm to enhance the planning ability of LLMs with DiffuSearch.
🧵1/7