Tsing

@Tsingggg

Toronto, Ontario

Joined August 2012

995 Following

115 Followers

10.4K Posts

Tsingggg retweeted

Zhaofeng Wu

@zhaofeng_wu

about 16 hours ago

It has been super fun working with @osieberling @tanshawn @rpanda89 Yury Polyanskiy and Yoon Kim! 📄 Paper: https://t.co/4XK3zHv6Rf

Tsingggg retweeted

Liquid AI

@liquidai

about 21 hours ago

Introducing LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: two multilingual retrieval models built for ultra-fast and accurate search across 11 languages.

> End-to-end retrieval latency as low as 1.5ms with our enterprise stack! 🚀

> Consistently best-in-class multilingual and cross-lingual performance across Arabic, German, English, Spanish, French, Italian, Japanese, Korean, Norwegian, Portuguese, and Swedish.

🧵
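
The two model names in the announcement above suggest two common retrieval scoring styles. As a rough illustration only (this is not Liquid AI's code or API), the sketch below contrasts single-vector cosine scoring with ColBERT-style late interaction (MaxSim), where each query token vector is matched to its most similar document token vector and the maxima are summed. The toy vectors and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def single_vector_score(query_vec, doc_vec):
    """Embedding-style scoring: one vector per query, one per document."""
    return cosine(query_vec, doc_vec)

def maxsim_score(query_tokens, doc_tokens):
    """ColBERT-style late interaction: for each query token vector,
    take its best match among the document token vectors, then sum."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)

# Hypothetical 2-dimensional token vectors, for illustration only.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.1]]               # covers only the first

score_a = maxsim_score(query_tokens, doc_a)
score_b = maxsim_score(query_tokens, doc_b)
```

Here `doc_a` outscores `doc_b` because it contains a close match for every query token, which is the behaviour late interaction is designed to reward.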

799

125

405

71K

Tsingggg retweeted

Ryosuke Matsuda

@VolumeisRyo

1 day ago

Plug: I played a small part in developing this 7B diffusion language model!

It is the world's first open Uniform Diffusion Language Model, built on the "Uniform" formulation, in which tokens can be unmasked many times over rather than going from Mask to Unmask just once.

Give it a try if you get the chance 🙌

🔗 https://t.co/w9ZPKjh1V9 https://t.co/D9dDYf8GnC
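
For readers unfamiliar with the distinction, here is a minimal sketch of the two forward corruption processes usually meant by "masked" and "uniform" discrete diffusion. It is the general textbook picture, not this model's implementation, and the vocabulary, corruption rate, and function names are hypothetical. In the masked case a noised position is visibly `[MASK]`, so a sampler typically fills it once. In the uniform case noise looks like ordinary tokens, so any position may need revising at any step.

```python
import random

MASK = "[MASK]"

def corrupt_masked(tokens, rate, rng):
    """Absorbing (masked) noise: each position is replaced by [MASK] with probability `rate`."""
    return [MASK if rng.random() < rate else t for t in tokens]

def corrupt_uniform(tokens, rate, vocab, rng):
    """Uniform noise: each position is replaced by a random vocabulary token with probability `rate`."""
    return [rng.choice(vocab) if rng.random() < rate else t for t in tokens]

# Hypothetical toy vocabulary and sentence.
vocab = ["the", "cat", "sat", "on", "mat"]
sentence = ["the", "cat", "sat", "on", "the", "mat"]

rng = random.Random(0)  # seeded so the run is repeatable
masked = corrupt_masked(sentence, 0.5, rng)
uniform = corrupt_uniform(sentence, 0.5, vocab, rng)
```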

14K

Tsingggg retweeted

Google Cloud Tech

@GoogleCloudTech

3 days ago

Introducing the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki pattern into a portable, interoperable format.

AI is only as smart as the context we give it. As we build more advanced, agentic AI systems, they need accurate metadata and context to be useful. But in most organizations, that context is locked inside fragmented data catalogs, isolated wikis, scattered code comments, or the minds of senior engineers. Every time a new AI agent is built, teams are forced to solve the exact same context-assembly problem from scratch.

To solve this, we've announced OKF, a vendor-neutral, open specification that formalizes the "LLM-wiki pattern" into a portable, interoperable format. It provides a standardized way to represent the enterprise knowledge that modern AI systems rely on.

- Just markdown: readable in any editor, renderable on GitHub, indexable by any search tool
- Just files: shippable as a tarball, hostable in any git repo, mountable on any filesystem
- Just YAML frontmatter: for the small set of structured fields that need to be queryable: type, title, description, resource, tags, and timestamp

We’ve also shipped reference implementations to help you hit the ground running, including an enrichment agent for BigQuery, a static HTML visualizer, and live sample bundles on @github → https://t.co/ilhAMCrcTc

➕ Knowledge Catalog can now natively ingest OKF!

Stop reinventing data models and building bespoke integrations for every new AI tool. Here's more about how OKF works → https://t.co/FR4kJRsgEH
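
To make the "markdown plus YAML frontmatter" idea concrete, here is a small sketch of what reading such a file might look like. Only the six field names come from the announcement. The sample values, the file layout, and the tiny parser are assumptions for illustration and do not use the actual OKF tooling or claim to follow its schema. The parser handles only flat `key: value` lines and inline `[a, b]` lists.

```python
# Field names listed in the announcement; everything else below is hypothetical.
OKF_FIELDS = ("type", "title", "description", "resource", "tags", "timestamp")

SAMPLE = """---
type: table
title: Orders
description: One row per customer order.
resource: example-project.sales.orders
tags: [sales, orders]
timestamp: 2025-01-01T00:00:00Z
---
# Orders

Free-form markdown notes about the table go here.
"""

def split_frontmatter(text):
    """Return (metadata dict, markdown body) for a document with a leading '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = lines.index("---", 1)  # raises ValueError if the block is never closed
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")  # split on the first colon only
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        meta[key.strip()] = value
    body = "\n".join(lines[end + 1:])
    return meta, body

meta, body = split_frontmatter(SAMPLE)
missing = [field for field in OKF_FIELDS if field not in meta]
```

A real pipeline would more likely use a full YAML parser. The hand-rolled version is used here only to keep the sketch dependency-free.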

128

596

433K

Tsingggg retweeted

Ant Ling

@AntLingAGI

3 days ago

Ling & Ring 2.6 technical report is out, with two open-weight base models.

We co-design model + system across architecture, training, and agentic capability:
• 7:1 hybrid linear attention
• KPop for stable agentic RL: SWE-bench Verified 76.28%
• ~4× token efficiency

https://t.co/vMuUEOYXi9
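
One common reading of a "7:1 hybrid" is seven linear-attention layers for every full (softmax) attention layer. The sketch below lays out such a stack under that assumption. The tweet does not say where the full-attention layer sits in each group or how many layers the models have, so the placement, the layer count, and the function name are all hypothetical.

```python
def hybrid_layout(num_layers, linear_per_full=7):
    """Label each layer 'linear' or 'full', placing one full-attention layer
    at the end of every group of `linear_per_full` linear-attention layers."""
    period = linear_per_full + 1
    return ["full" if (i + 1) % period == 0 else "linear" for i in range(num_layers)]

# Hypothetical 16-layer stack: two groups of 7 linear layers + 1 full layer.
layout = hybrid_layout(16)
num_linear = layout.count("linear")
num_full = layout.count("full")
```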

513K

Tsingggg retweeted

OpenAI

@OpenAI

3 days ago

We’re sharing new research on a method for anticipating how models may behave in real-world use before release: simulating deployment with recent, de-identified user requests and studying candidate model responses. https://t.co/7RJzBfNniQ

220

239

778

322K

Tsingggg retweeted

OpenAI

@OpenAI

3 days ago

Let’s talk about evals. We’re always looking for better ways to measure and forecast model progress, especially as benchmarks get saturated or gamed. @tejalpatwardhan, who leads our frontier evals team, spoke to @andrewmayne about why evals matter and what models need to be judged on next.

137

130

259K

Tsingggg retweeted

Qwen

@Alibaba_Qwen

3 days ago

📣 Introducing the Qwen-Robot Suite: Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld, three foundation models, a full stack for embodied intelligence.

🧭 Qwen-RobotNav: the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems

🤖 Qwen-RobotManip: the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100+ hour open-source corpus

🌍 Qwen-RobotWorld: infinite worlds for physical agents.
• Single world model, 20+ embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation

Each model is independently useful, and could be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.

📷 Blog: https://t.co/ytLcbYET26
📖 Report:
Qwen-RobotNav: https://t.co/uPmSwDYGxg
Qwen-RobotManip: https://t.co/GeyIzJSpU8
Qwen-RobotWorld: https://t.co/SXPH1qzDFy

424

302K

Tsingggg retweeted

Alexander Whedon

@alex_whedon

3 days ago

Here is the technical report on SubQ 1.1 Small.
https://t.co/bu8AEc4lsk

This is the second iteration on our Subquadratic Sparse Attention (SSA) model, and the first to be deployed with design partners in the coming weeks.

The results are compelling and verified by @AppenResearch.

- Near-perfect long-context retrieval up to 12M tokens on the needle-in-a-haystack test, with up to nearly 1,000x attention compute reduction.

- A balance of long-context optimization and general reasoning ability, with strong performance retained across knowledge, coding, and non-coding enterprise agent benchmarks.

- At 1M tokens, SubQ 1.1 Small requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2.

These results highlight a significant scaling advantage thanks to the efficiency gains from the SSA architecture.

We included some details and learnings from the development process which may be helpful to the community.

Comment with questions, I’ll try to respond!

518

269

132K

Tsingggg retweeted

WeiboLLM

@WeiboLLM

3 days ago

⭐ VibeThinker-3B is released: a dense 3B model for frontier-level verifiable reasoning.

🚀 Reasoning: 94.3 on AIME’26, 76.4 on IMO-AnsBench, and 80.2 Pass@1 on LCB v6; with CLR, AIME’26 improves to 97.1 and IMO-AnsBench to 80.6.

💻 OOD Coding: On recent unseen LeetCode weekly contests, VibeThinker-3B passes 123/128 (96.1%) first-attempt Python submissions.

⚡ Efficiency: Only 3B parameters, yet reaching the performance range of much larger top-tier reasoning models.

🧠 Perspective: Small models are not just cheaper substitutes. In parameter-dense domains with clear verification signals, SLMs offer a path to frontier-level reasoning that complements traditional Scaling Law.

Model: https://t.co/94A14zpqCV
Github: https://t.co/32so5P6C7L
Paper: https://t.co/UDd264RsZb

#AI #LLM #Reasoning #OpenSource #SmallModel

149

886

93K

Tsingggg retweeted

Z.ai

@Zai_org

3 days ago

Introducing GLM-5.2: Frontier Intelligence, Open Weights

- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1

Tech Blog: https://t.co/LAsxUdN0JZ
Weights: https://t.co/g0A1C4UWx4
API: https://t.co/Kc3E22cbN7
Coding Plan: https://t.co/Nk8Y98HNhU
Chat: https://t.co/WCqWT0qCQb

595

11K

Tsingggg retweeted

LMSYS Org

@lmsysorg

4 days ago

🚀 New blog: The next generation of speculative decoding: DFlash and Spec V2

DFlash + Spec V2 hit >4.3X baseline throughput for LLM inference, now the default speculative decoding engine in SGLang! Together with @modal and https://t.co/ZXetBKIRym, our jointly-released DFlash drafter for Qwen 3.5 397B-A17B beats both baseline and native MTP in every setting we benchmarked:
1️⃣ >4.3X baseline & 1.5X native MTP throughput (concurrency 1, HumanEval, 8xB200)
2️⃣ Block diffusion drafter: a full token block in one forward pass
3️⃣ KV injection: target-model features fed into every draft layer’s KV cache for higher acceptance
4️⃣ Spec V2 overlap scheduler: +33% end-to-end

Read the code, deploy a DFlash server, and start experimenting!
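
For context, the basic draft-then-verify loop behind speculative decoding can be sketched in a few lines. This is a toy greedy version with made-up integer "models", not DFlash, Spec V2, or any SGLang API. A drafter proposes a block of tokens, the target model checks them, the longest agreeing prefix is kept, and the target supplies one token of its own at the first disagreement (or as a bonus when the whole block is accepted). In a real system the check is a single batched forward pass of the target model. Here it is simulated with a loop, and every name below is hypothetical.

```python
def target_next(prefix):
    """Toy 'target model': the next token is the previous token + 1, modulo 10."""
    return (prefix[-1] + 1) % 10

def draft_block(prefix, k):
    """Toy 'drafter': usually agrees with the target, but wrongly emits 0 after a 4."""
    out, last = [], prefix[-1]
    for _ in range(k):
        last = 0 if last == 4 else (last + 1) % 10
        out.append(last)
    return out

def speculative_step(prefix, k):
    """Propose k tokens, keep the prefix the target agrees with,
    then append one token chosen by the target itself."""
    accepted = []
    for tok in draft_block(prefix, k):
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)   # target's correction replaces the bad draft
            return accepted
        accepted.append(tok)
    accepted.append(target_next(prefix + accepted))  # bonus token after full acceptance
    return accepted

def generate(prefix, new_tokens, k=4):
    """Run speculative steps until `new_tokens` tokens have been produced."""
    seq, steps = list(prefix), 0
    while len(seq) - len(prefix) < new_tokens:
        seq += speculative_step(seq, k)
        steps += 1
    return seq[:len(prefix) + new_tokens], steps

tokens, steps = generate([0], 8)
```

The output always matches what plain greedy decoding with the target would produce. The drafter only changes how many target checks are needed, which is why the acceptance rate drives the speedup.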

443

273

120K

Tsingggg retweeted

Tri Dao

@tri_dao

4 days ago

As hybrid models (Qwen 3.5 / Nemotron Ultra) run agents with massive context, Gated-DeltaNet / Mamba states become a bottleneck. A simple insight to make this 2x faster: load the states, compute, but don't store them. This recompute trick finally unlocks spec decoding for SSMs
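
One possible reading of "load the states, compute, but don't store them" is sketched below with a toy scalar recurrence. This is an interpretation of the one-line description, not the actual kernel, and the recurrence, the decay constant, and the function names are hypothetical. During draft verification the saved state is read and outputs are computed for every draft token, but no intermediate states are written back. Once the number of accepted tokens is known, the state is recomputed over just that accepted prefix.

```python
A = 0.5  # hypothetical scalar decay for the toy recurrence h_t = A * h_{t-1} + x_t

def scan_outputs(state, inputs):
    """Verification pass: read the state, compute an output per input,
    and return only the outputs (no intermediate states are kept)."""
    outputs, h = [], state
    for x in inputs:
        h = A * h + x
        outputs.append(h)
    return outputs

def advance_state(state, inputs):
    """Recompute the recurrent state after consuming `inputs`."""
    h = state
    for x in inputs:
        h = A * h + x
    return h

def verify_then_commit(state, draft_inputs, num_accepted):
    """Score all draft tokens from the saved state, then rebuild the state
    for the accepted prefix instead of having stored every intermediate one."""
    outputs = scan_outputs(state, draft_inputs)
    new_state = advance_state(state, draft_inputs[:num_accepted])
    return outputs, new_state

outputs, new_state = verify_then_commit(1.0, [1.0, 2.0, 3.0], num_accepted=2)
```

The trade is extra arithmetic for fewer memory writes, which tends to pay off when the states are large and the operation is memory-bound.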

355

183

36K

Tsingggg retweeted

Licheng Liu

@liulicheng10

4 days ago

probably the best blog i have read for some time

viewing SFT, RL, and OPD as different ways of reshaping a model's distribution makes their tradeoffs super intuitive.

- SFT pulls toward a fixed external target
- RL moves along the reward gradient on on-policy samples
- OPD sits in between, using a teacher signal but on student-generated data, which is why it inherits RL's anti-forgetting properties even when the teacher itself was an overtrained SFT model.

the post is heavily grounded in recent literature and uses the distributional perspective as a unifying bridge across all three paradigms. i really like the point it argues: the load-bearing ingredient is on-policy data, and OPD's convergence to RL-like outcomes is the strongest evidence
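
The three bullets above can be made concrete with a toy three-token distribution. This is a sketch under stated assumptions, not the blog post's formulation: "OPD" is taken to mean on-policy distillation, its signal is written as a per-sample reverse-KL term, and the distributions, rewards, and function names are all hypothetical. The point to notice is where the data comes from. SFT scores a fixed external target, while RL and OPD both score tokens sampled from the student itself.

```python
import math
import random

# Hypothetical next-token distributions and rewards over a 3-token vocabulary.
student = {"a": 0.6, "b": 0.3, "c": 0.1}
teacher = {"a": 0.2, "b": 0.7, "c": 0.1}
reward = {"a": 0.0, "b": 1.0, "c": 0.0}

def sample(dist, rng):
    """Draw one token from a categorical distribution."""
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

def sft_loss(target):
    """SFT: cross-entropy toward a fixed external target token."""
    return -math.log(student[target])

def rl_surrogate(tok):
    """RL (REINFORCE-style): reward-weighted log-probability of a student sample."""
    return -reward[tok] * math.log(student[tok])

def opd_signal(tok):
    """OPD (assumed form): log-ratio of student to teacher on a student sample,
    whose average over student samples estimates KL(student || teacher)."""
    return math.log(student[tok]) - math.log(teacher[tok])

rng = random.Random(0)
on_policy = [sample(student, rng) for _ in range(1000)]  # student-generated data

reverse_kl_estimate = sum(opd_signal(t) for t in on_policy) / len(on_policy)
exact_reverse_kl = sum(p * (math.log(p) - math.log(teacher[t])) for t, p in student.items())
```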

214

95K

Tsingggg retweeted

Florian Brand

@xeophon

4 days ago

good stuff from microsoft: 4B model just to explore code bases, cutting token costs by 10-50% (!!!) while the performance of the big model stays the same :) https://t.co/qw0ttGBWJF

952

61K

Tsingggg retweeted

Stephanie Chan

@scychan_brains

5 days ago

"From AGI to ASI": new paper from our team. This report investigates how AI might develop beyond AGI. It describes theoretical limits, potential pathways, and potential bottlenecks. https://t.co/x0ZEV2xhNw

646

101

552

63K

Tsingggg retweeted

Photoroom

@photoroom_ML

7 days ago

🚀 Meet PRX Pixel.
Our new open-source 7B text-to-image model that generates images directly in pixel space.
After months of pretraining on hundreds of millions of images, supervised fine-tuning, and preference alignment, we're excited to share a first public preview.
The weights are already available, and we're currently working on integrating the model directly into Diffusers 🤗 to make the model even easier to use.
Test it yourself in the demo below. And as always, we'll be sharing the full story behind the model through a series of technical blog posts covering the entire training recipe.
Link in the comments 👇

285

208

57K

Tsingggg retweeted

Zyphra

@ZyphraAI

7 days ago

Today we're releasing ZONOS2, our next-generation real-time TTS model with high-fidelity voice cloning. ZONOS2 is the most expressive open-source TTS model, released under Apache 2.0 and available on Zyphra Cloud on @AMD. 🧵

697

108

674

330K

Tsingggg retweeted

Anthropic

@AnthropicAI

6 days ago

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: https://t.co/bwn0sximKZ

13K

88K

26K

24K

91M

Tsingggg retweeted

Yuzhen Mao

@Mao_Yuzhen

9 days ago

What happens when multi-agent systems stop relying on a central “controller” agent? Can agents coordinate by sharing results directly with each other? Introducing Decentralized Language Models (DeLM): we let agents coordinate asynchronously through a shared context. Agents claim tasks from a queue and write back compact, verified results as they finish, making progress visible to all workers without requiring a main agent to merge, filter, and rebroadcast it. New paper with @azaliamirh!
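
The coordination pattern described above (workers claim tasks from a queue and write verified results into a context every worker can read, with no central agent merging them) can be sketched with threads. This is only an illustration of the pattern, not the paper's implementation. The squaring "task", the `verify` check, and all names are hypothetical stand-ins for real agent work.

```python
import queue
import threading

def run_workers(tasks, num_workers=3):
    """Run workers that claim tasks from a shared queue and publish
    verified results into a shared context, without a controller."""
    todo = queue.Queue()
    for task in tasks:
        todo.put(task)

    shared_context = {}        # results visible to every worker
    lock = threading.Lock()    # guards writes to the shared context

    def verify(task, result):
        # Hypothetical check standing in for result verification.
        return result == task * task

    def worker():
        while True:
            try:
                task = todo.get_nowait()   # claim the next unclaimed task
            except queue.Empty:
                return                     # nothing left to claim
            result = task * task           # stand-in for the agent's actual work
            if verify(task, result):
                with lock:
                    shared_context[task] = result   # write back a compact result

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared_context

context = run_workers(range(6))
```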

295

288

84K
