萌派(MPI)

This is THE moment of Physical AI! We are officially announcing Cosmos 3: Omnimodal World Models for Physical AI 🚀
- Cosmos 3 is an omnimodal world model: within a unified architecture, it can understand and generate language, images, video, audio, and actions.
- It is not just a VLM, not just a video generator, not just an audio-visual generative model, and not just a physics simulator / world-action model. It can understand images and videos, generate images, videos, and audio, simulate future worlds, predict actions, and generate robot policies, enabling models to truly begin to “touch the world.”
- Cosmos 3 is the #1 open-weight reasoner / T2I / I2V / robot policy across many benchmarks.
Huge thanks to every teammate who fought side by side on this journey, from architecture, data, training, infra, serving, and evaluation to post-training. Every part of this project carries an incredible amount of hard work. This was my first time leading a project as Tech Lead, and I feel truly fortunate.
The future of Physical AI needs models that can not only “see” and “describe” the world, but also “imagine,” “simulate,” and “act,” and eventually close the loop with the real world. I hope Cosmos 3 can become an important starting point for this direction, and I’m excited to push Physical AI into its next stage together with the open-source community.
Welcome to the era of Physical AI.
HuggingFace: https://t.co/QW5h5pIWWM
Project Website: https://t.co/Jppa0gkn16
Code: https://t.co/aJgaLm5BaG

萌派(MPI) @ngunlp

4 days ago

How does a rove beetle fold its wings? #shorts | #紀錄片 #有趣 #涨知识 #動物 #知識 #生物 https://t.co/ilYvnHJ5cw via @YouTube

萌派(MPI) @ngunlp

4 days ago

SpaceX IPO: Nice Try Though https://t.co/bs9Hf1kMpd via @YouTube

ngunlp retweeted

Mao Ning 毛宁

@SpoxCHN_MaoNing

7 days ago

Shenzhou‑21 return capsule’s parachute formed a little red hat right on the moon. Cute!

萌派(MPI) @ngunlp

7 days ago

Why Trump Could Lose His Trade War With China | The Ezra Klein Show https://t.co/yVdMKIoKCI via @YouTube

ngunlp retweeted

DailyPapers

@HuggingPapers

8 days ago

NVIDIA just released an optimized version of the Kokoro TTS model on Hugging Face.
A lightweight 82M parameter speech synthesizer ready for commercial use, running fast on NVIDIA GPUs via ONNX Runtime. https://t.co/mhxM7fMAWL

萌派(MPI) @ngunlp

9 days ago

@ray_su20 @seanwei001 I have never heard such an idiotic idea.

萌派(MPI) @ngunlp

9 days ago

It's just too slow.

ngunlp retweeted

Amto

@XAMTO_AI

10 days ago

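Another open-source heavyweight has landed in the audio space: MOSS-Audio.
It comes in 4B and 8B sizes, each with Instruct and Thinking variants, so take your pick.
Most impressive is that it packs six capabilities into a single model:
1️⃣ Speech recognition (ASR)
2️⃣ Speaker diarization: it cleanly separates who is speaking
3️⃣ Emotion recognition: it can tell whether you are happy or irritated
4️⃣ Environmental sound analysis: it recognizes rain, traffic, and keyboard sounds
5️⃣ Music understanding: not just identifying the song title, but actually understanding the structure
6️⃣ Timestamped ASR: accurate down to when each character was spoken
On timestamped ASR it leaves Gemini 2.5 Pro far behind; this is not a narrow win, it is a rout.
Audio processing used to mean stitching together a pile of models; now one model handles it all, and it is open source. Deployment costs for subtitles, podcasts, customer-service quality checks, and music tagging come straight down.
The OpenMOSS team released it quietly, and the industry took notice.
Get it directly from HuggingFace.
🔗 https://t.co/Iy4LUYpvD2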

ngunlp retweeted

Alex Rives

@alexrives

10 days ago

Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology.
The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics. We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity.
We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures.
ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences. A world model of protein biology emerges through language modeling.
We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins. The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science. This understanding emerges without prior knowledge, just from language modeling of protein sequences.
Language models are becoming a powerful substrate to understand and program biology. The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders.
I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.

萌派(MPI) @ngunlp

11 days ago

Huawei proposes the "Tao Law" (韬定律)! Can logic folding let Chinese chips "overtake on the curve"? https://t.co/V54ebX4WHU via @YouTube

ngunlp retweeted

Tom Turney

@no_stp_on_snek

11 days ago

Interesting direction. The asymmetry is the important part here, not the headline compression ratio. K and V behave fundamentally differently under attention, and treating them identically leaves performance on the table.
The memory caveat matters too: a compressed cache is only half the story if decode still needs a second working copy. If you store a tiny cache but also build dense scratch/reconstruction buffers to actually use it, the real RAM bill is closer to “compressed cache + working cache,” not just the advertised stored-cache size.
That is where these systems get interesting: decode-time distribution shift, materialization overhead, transient workspace growth, and attention-weighted error behavior rather than raw reconstruction MSE.
Related:
* Asymmetric K/V Cache Compression: https://t.co/de3EjQuuzx
* Sparse V Dequantization: https://t.co/gCCaCHTrKB
* Why MSE Fails for KV Quantization: https://t.co/mFPCPkDjtS
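The “compressed cache + working cache” point in the post above can be made concrete with a little arithmetic. The sketch below is only an illustration under assumed numbers: the function names, the model shape (32 layers, 8 KV heads, head dimension 128, 32K tokens), the 4-bit stored precision, and the 16-bit working precision are all hypothetical and do not come from the post or the linked write-ups.

```python
def cache_bytes(tokens, layers, kv_heads, head_dim, k_bits, v_bits):
    """Bytes needed to hold K and V for `tokens` positions across `layers`.

    K and V get separate bit widths so an asymmetric scheme can be modeled.
    """
    values_per_tensor = tokens * layers * kv_heads * head_dim
    return values_per_tensor * (k_bits + v_bits) // 8


def peak_decode_bytes(tokens, layers, kv_heads, head_dim, k_bits, v_bits,
                      work_bits=16, materialized_layers=None):
    """Stored compressed cache plus the dense working copy used at decode time.

    `materialized_layers` is how many layers are dequantized at once
    (hypothetical knob); None means every layer is reconstructed.
    """
    if materialized_layers is None:
        materialized_layers = layers
    stored = cache_bytes(tokens, layers, kv_heads, head_dim, k_bits, v_bits)
    working = cache_bytes(tokens, materialized_layers, kv_heads, head_dim,
                          work_bits, work_bits)
    return stored + working


if __name__ == "__main__":
    GIB = 2**30
    shape = dict(tokens=32768, layers=32, kv_heads=8, head_dim=128)  # assumed shape
    stored = cache_bytes(**shape, k_bits=4, v_bits=4)
    peak_all = peak_decode_bytes(**shape, k_bits=4, v_bits=4)
    peak_one = peak_decode_bytes(**shape, k_bits=4, v_bits=4, materialized_layers=1)
    print(f"advertised stored cache: {stored / GIB:.3f} GiB")
    print(f"peak, all layers dense:  {peak_all / GIB:.3f} GiB")
    print(f"peak, one layer dense:   {peak_one / GIB:.3f} GiB")
```

With these assumed numbers the stored cache is 1 GiB, but the peak is 5 GiB if every layer is reconstructed densely at once, versus 1.125 GiB if only one layer is materialized at a time. How much working memory a real system needs depends on its implementation, which this sketch does not model.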
