Benjamin Muller

@ben_mlr

Research in AI. Focusing on scaling language models multi-modally & multilingually @AIatMeta

NYC

Joined April 2016

2.1K Following

961 Followers

190 Posts

Pinned Tweet

Benjamin Muller @ben_mlr

over 1 year ago

So many exciting releases from FAIR @AIatMeta! Super happy to see Spirit LM now open-sourced. Spirit LM unlocks expressive speech generation through interleaving speech-text training and phonetic (HuBERT) + pitch + style-specific tokenization. Available here: Weights: https://t.co/FI5pPznwaE Code: https://t.co/GNgt3ATfUk Paper: https://t.co/CwmzjWz17Z (soon to be presented at #EMNLP2024)

AI at Meta

@AIatMeta

over 1 year ago

Open science is how we continue to push technology forward and today at Meta FAIR we’re sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from @jpineau1. This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI).

What we’re releasing:
• Meta Spirit LM: An open source language model for seamless speech and text integration.
• Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2.
• Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance.
• SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography.
• Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale.
• Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials.
• MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages.
• Self-Taught Evaluator: a new method for generating synthetic preference data to train reward models without relying on human annotations.

Access to state-of-the-art AI creates opportunities for everyone. We’re excited to share this work and look forward to seeing the community innovation that results from it.

Details and access to everything released by FAIR today ➡️ https://t.co/P3XkdN2WQN

619

118

205

150K

ben_mlr retweeted

Zhibin Gou @zebgou

6 months ago

If Gemini-3 proved continual scaling pretraining, DeepSeek-V3.2-Speciale proves scaling RL with large context. We spent a year pushing DeepSeek-V3 to its limits. The lesson is post-training bottlenecks are solved by refining methods and data, not just waiting for a better base.

161

335

237K

ben_mlr retweeted

Yen-Ju Lu @Yen_Ju_Lu

8 months ago

🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 https://t.co/4nUsbC1YKF

10K

ben_mlr retweeted

Aymeric Zhuo @aymericzzz

9 months ago

Introducing @CodeWordsAI , the fastest way to go from idea to automation, simply by chatting with AI. No more drag-and-drop and configuration. Save time by doing less. Available today for free, for everyone. The Cursor moment for automation is here.

ben_mlr retweeted

Artidoro Pagnoni

@ArtidoroPagnoni

10 months ago

Thrilled to share that our Byte Latent Transformer won an Outstanding Paper Award at ACL 2025! 🏆

280

28K

ben_mlr retweeted

Gargi Ghosh @gargighosh

10 months ago

Outstanding paper award! @aclmeeting - BLT: https://t.co/QATnyus5Xb

131

11K

ben_mlr retweeted

Percy Liang

@percyliang

about 1 year ago

We ran Llama 4 Maverick through some HELM benchmarks. It is 1st on HELM capabilities (MMLU-Pro, GPQA, IFEval, WildBench, Omni-MATH), but… https://t.co/uKMHRe7xKF

138

29K

ben_mlr retweeted

AI at Meta

@AIatMeta

about 1 year ago

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.

Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters.
• Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena.

These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight.

Read more about the first Llama 4 models, including training and benchmarks ➡️ https://t.co/9G3QgVdCkB
Download Llama 4 ➡️ https://t.co/eVomRvEr0w

822

13K

ben_mlr retweeted

Jason Weston

@jaseweston

over 1 year ago

🚨 Diverse Preference Optimization (DivPO) 🚨 SOTA LLMs have model collapse🫠: they can't generate diverse creative writing or synthetic data 🎨 DivPO trains for both high reward & diversity, vastly improving variety with similar quality. Paper 📝: https://t.co/bRwq3d3wJq 🧵below

340

231

45K

ben_mlr retweeted

Gargi Ghosh @gargighosh

over 1 year ago

We released new research: Byte Latent Transformer (BLT). BLT encodes bytes into dynamic patches using lightweight local models and processes them with a large latent transformer. Think of it as a transformer sandwich!

657

323

71K

ben_mlr retweeted

AI at Meta

@AIatMeta

over 1 year ago

New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️ https://t.co/0iamZCRnMN

185

385

204K

Benjamin Muller @ben_mlr

over 1 year ago

Groundbreaking scaling trends for Byte-level Language Modeling with the new BLT architecture 🚀 More insights in the thread 🧵

Artidoro Pagnoni

@ArtidoroPagnoni

over 1 year ago

🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯 Paper 📄 https://t.co/5QGrlJdK0y Code 🛠️ https://t.co/jCdDI5BXwe

720

142

399

182K

Benjamin Muller @ben_mlr

over 1 year ago

Congrats @aymericzzz and team on being live! Very exciting vision to build entire software with just a prompt

Aymeric Zhuo @aymericzzz

over 1 year ago

Excited to share more about our background, vision and where we're headed at @agemoai with @r1ddhi at @BusinessInsider 𝗢𝘂𝗿 𝘃𝗶𝘀𝗶𝗼𝗻 𝗶𝘀 𝘁𝗼 𝗲𝗻𝗮𝗯𝗹𝗲 𝗮𝗻𝘆𝗼𝗻𝗲 𝘁𝗼 𝗰𝗿𝗲𝗮𝘁𝗲 𝘀𝗼𝗳𝘁𝘄𝗮𝗿𝗲 – from an idea to fully deployed software. The critical path to achieve it requires building AI systems that can reason about software at a fundamental level. 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗺𝗼𝗱𝗲𝗹𝘀 𝗮𝗿𝗲 𝗻𝗼𝘁 𝘁𝗵𝗲 𝘀𝗼𝗹𝗲 𝘀𝗼𝗹𝘂𝘁𝗶𝗼𝗻, 𝗯𝘂𝘁 𝗮𝗿𝗲 𝗽𝗮𝗿𝘁 𝗼𝗳 𝘁𝗵𝗲 𝘀𝗼𝗹𝘂𝘁𝗶𝗼𝗻. Since inception, our research focus at agemo has been on leveraging neurosymbolic methods to build a reasoning system for software. With the first implementation and training of this system done, we have been iterating on a platform for non-coders to generate software. We are fortunate to be backed by @FlyVC and @firstminutecap alongside pioneers in the field at @GoogleDeepMind and @Meta . Read more about our story at https://t.co/hqDmgFPdvX

382

ben_mlr retweeted

Xiang Yue @xiangyue96

over 1 year ago

🌍 I’ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!

🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌐✨

https://t.co/lHP1CSNNVe
https://t.co/RkMdE4JSQg

The Pangea family includes three major components:
🔥 Pangea-7B: A state-of-the-art multilingual multimodal LLM capable of 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LlavaOneVision in English performance.

📝 PangeaIns: A 6M multilingual multimodal instruction tuning dataset across 39 languages. 🗂️ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally-relevant images sourced from LAION-Multi. 🎨

🏆 PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).

🙌 This is a joint leading effort with @yueqi_song. Also kudos to the amazing team @AkariAsai, @seungonekim, @Jeande_d, @simi_97k, @anjali_ruban, @lintangsutawika, @Sathya8NR, @gneubig for their hard work!

Check out more results and insights we conclude from our training in the thread below. 👇

371

208

91K

Benjamin Muller @ben_mlr

over 1 year ago

@mhnt1580 @juanmiguelpino Thanks Wei-Ning !!

118

ben_mlr retweeted

Yann LeCun

@ylecun

over 1 year ago

Meta Spirit LM: open source language model that mixes text and speech.

327

63K

Benjamin Muller @ben_mlr

over 1 year ago

@ruochenz_ Thank you Ruochen! Lots of research to do in that area indeed!

Benjamin Muller @ben_mlr

over 1 year ago

Thanks Tu :). Interesting! That's definitely relevant to our work, thanks for sharing! In our case we found that merging/swapping language-specific expert layers and English Math expert layers transfers effectively at test time to target languages. Doing it in a parameter-efficient setting degraded test performance in our setup, but there is definitely more work to be done there.

203

Benjamin Muller @ben_mlr

over 1 year ago

Recent LLMs (e.g. Llama 3 🦙) are increasingly good at Math. However, this progress is reserved for languages with large amounts of task-specific instruct-tuning data. In this work at @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called **Layer Swapping** and find that combining Math and Language-Specific experts improves the performance of Llama 3 for specific languages (e.g. #Bengali) on Math queries. arXiv link and details in the thread below 🧵

Lucas Bandarkar @LucasBandarkar

over 1 year ago

Cross-lingual transfer can be as easy as swapping model layers between LLMs! 🔀 Our model merging method can compose math and language skills by swapping top & bottom layers from an SFT’d target language expert into a math expert without retraining https://t.co/IN5JPdTYU4 🧵: [1/3]
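
The layer-swapping idea described in the tweet above can be illustrated with a toy sketch. This is only a minimal illustration, not the authors' implementation: layers are stand-in strings rather than real weights, and the function name `swap_layers` and the parameters `n_bottom` and `n_top` are hypothetical. It also assumes both experts share the same architecture and therefore the same number of layers.

```python
from typing import List, TypeVar

Layer = TypeVar("Layer")


def swap_layers(
    math_expert: List[Layer],
    lang_expert: List[Layer],
    n_bottom: int,
    n_top: int,
) -> List[Layer]:
    """Return a merged layer stack (hypothetical helper, for illustration).

    The bottom `n_bottom` and top `n_top` layers come from the language
    expert; the middle layers are kept from the math expert. No retraining
    is involved: layers are only re-assembled.
    """
    if len(math_expert) != len(lang_expert):
        raise ValueError("experts must have the same number of layers")
    n = len(math_expert)
    if n_bottom < 0 or n_top < 0 or n_bottom + n_top > n:
        raise ValueError("n_bottom and n_top must fit inside the layer stack")
    # Build a new list so neither input stack is modified.
    return (
        lang_expert[:n_bottom]
        + math_expert[n_bottom : n - n_top]
        + lang_expert[n - n_top :]
    )


# Toy usage: six layers per expert, represented by labels instead of weights.
math_layers = [f"math_{i}" for i in range(6)]
lang_layers = [f"lang_{i}" for i in range(6)]
merged = swap_layers(math_layers, lang_layers, n_bottom=2, n_top=1)
print(merged)
# ['lang_0', 'lang_1', 'math_2', 'math_3', 'math_4', 'lang_5']
```

With real models, each list element would be a transformer block's parameters; how many bottom and top layers to swap is a choice this sketch leaves open.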
