Pierre Colombo @PierreColombo6 - Twitter Profile

Pinned Tweet

almost 2 years ago

🚀 Introducing SaulLM-141B and SaulLM-54B: The First Open Family of Legal Models. After #SaulLM-7B the family is growing! We are proud to unveil the latest innovations from our team: the SaulLM-141B and 54B generative AI models, specifically designed for the legal domain.

1

10

2

9

1K

Pierre Colombo @PierreColombo6

8 days ago

@nickfrosst @leftylabourtech 😂

0

5

0

1K

PierreColombo6 retweeted

Nicolas Boizard @N1colAIs

8 days ago

Awesome to see them build on the same tricks as BidirLM-Omni, like the decoder-to-encoder adaptation and cross-modality model merging. The synthetic data ablation is also a huge highlight with a +15 gain for code retrieval domains

1

8

4

1

699

PierreColombo6 retweeted

Nicolas Boizard @N1colAIs

8 days ago

Gemini Embedding 2 is out, and it's completely omnimodal 😎 Great to see the next chapter of encoders heading toward text, audio, and visual unlocking so many use cases https://t.co/t1d3HqRcdl

2

100

18

51

6K

PierreColombo6 retweeted

NVIDIA AI Infrastructure

@NVIDIAAIInfra

15 days ago

👏 Congratulations to @cohere on Command A+ — a powerful new model optimized for NVIDIA Blackwell and trained using NVIDIA CUDA-X libraries. Proud to be a part of it! Learn more ⤵️

7

308

30

33

38K

PierreColombo6 retweeted

Cohere

@cohere

15 days ago

Releasing open-source under the Apache 2.0 license. We want to give developers direct access to enterprise-grade agentic capabilities from experimentation to production. Sovereign AI. For all. Download Command A+: https://t.co/USXpmpid01 Or learn more: https://t.co/mXb3WLHN85

11

301

31

147

32K

PierreColombo6 retweeted

Aidan Gomez

@aidangomez

15 days ago

Our first fully open source Apache 2 model :)

14

215

16

18

17K

PierreColombo6 retweeted

Artificial Analysis

@ArtificialAnlys

15 days ago

Cohere launches open weights model Command A+ that achieves 37 on the Artificial Analysis Intelligence Index

The release of Command A+ places @Cohere in line with Claude 4.5 Haiku on the Intelligence Index, and just above NVIDIA Nemotron 3 Super and Gemini 3.1 Flash-Lite.

Key Takeaways:

➤ Command A+ ranks first on AA-Omniscience Non-Hallucination at 86%, ~3 percentage points ahead of the next-best model. Its AA-Omniscience Accuracy is 9%, so the headline AA-Omniscience score lands at -4, demonstrating a similar archetype to Claude 4.5 Haiku, where the model knows its limits

➤ On Cohere’s API, Command A+ (~281 output tokens per second) is faster than several comparable open-weights and small to mid-sized proprietary models (e.g., GPT-5.4 nano, Claude 4.5 Haiku, and Grok 4.3), but still slower than Gemini 3.1 Flash-Lite Preview, which outputs 304 tokens per second

➤ Command A+ trails its peer set on scientific reasoning (HLE ~11%, GPQA Diamond ~76%) and on coding (Terminal-Bench Hard ~25%, SciCode ~38%), consistent with gaps on the hardest science and agentic coding benchmarks

➤ It supports visual reasoning and scores 63% on MMMU-Pro (between Claude 4.5 Haiku at 59% and GPT-5.4 nano (xhigh) at 65%)

13

253

25

46

35K

PierreColombo6 retweeted

Cohere

@cohere

15 days ago

Introducing: Cohere Command A+ We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.

103

3K

380

2K

731K

PierreColombo6 retweeted

Manuel Faysse

@ManuelFaysse

20 days ago

🚨 Do LLMs need to store everything they read in memory? To reduce KV cache size and improve decoding speeds, we propose Self-Pruned KV attention, a mechanism where the model learns to decide which KVs to write in the persistent KV cache, discarding all the rest! @AIatMeta🧵

8

203

45

149

21K

PierreColombo6 retweeted

Nicolas Boizard @N1colAIs

24 days ago

@JinaAI_ has hopped on the omnimodal train🚂 They just dropped a collection of two Omni embedding models (0.9B & 2B). Similar to BidirLM, they seem to rely on the Qwen modality head for the larger one, while sticking with EuroBERT for the nano version 🥰 https://t.co/A8BQma6Zpn

1

16

2

2K

PierreColombo6 retweeted

Nicolas Boizard @N1colAIs

about 1 month ago

BidirLM-Omni is on MTEB and Sentence-Transformer!
https://t.co/JRqmipX5xl

🥇#1 Open-Source Model on MTEB (#15 overall)
🖼️#1 across all sizes on MIEB (Image)
🎧#1 sub-7B model on MAEB (Audio, #2 overall)

Small size, massive performance, Fully open
Model: https://t.co/AZzOJ6ZhhN

2

29

7

10

2K

PierreColombo6 retweeted

Nicolas Boizard @N1colAIs

about 1 month ago

We are currently presenting 'Should We Still Pretrain Encoders with Masked Language Modeling?' Come see us in Hall 3 #1304 @iclr_conf https://t.co/kaPLch0Qen

1

69

7

30

9K

PierreColombo6 retweeted

Antoine Chaffin

@antoine_chaffin

about 1 month ago

If you're at ICLR, come say hi to @orionweller and @N1colAIs! And also, shout-out to all the French people pushing out the encoder architecture, it seems like, as for ColBERT, French taste is unmatched! (Non-exhaustive list, pardon me but Twitter search is bad): @gisship @PierreColombo6 @ManuelFaysse @pteiletche @MaceQuent1 @mlpc123

0

8

1

441

Pierre Colombo @PierreColombo6

about 2 months ago

Great work from @gisship and @N1colAIs from @centralesupelec

DailyPapers

@HuggingPapers

about 2 months ago

BERT-as-a-Judge A robust alternative to rigid lexical matching for LLM evaluation. Matches the performance of LLM-as-a-Judge at a fraction of the computational cost.

7

248

29

164

15K

0

3

1

0

262

PierreColombo6 retweeted

DailyPapers

@HuggingPapers

about 2 months ago

BERT-as-a-Judge A robust alternative to rigid lexical matching for LLM evaluation. Matches the performance of LLM-as-a-Judge at a fraction of the computational cost.

7

248

29

164

15K

PierreColombo6 retweeted

Orion Weller @orionweller

about 2 months ago

Encoders are so much better for classification, why not use them for judging? Awesome study from @N1colAIs - cool to see a 210m BERT model beating much larger Qwen and Gemma models.

1

69

6

50

8K

Pierre Colombo @PierreColombo6

about 2 months ago

Evaluation is underrated. If your eval signal is noisy, you're flying blind. BERT-as-a-Judge gives you a fast, cheap way to improve your signal-to-noise ratio without spinning up a full LLM judge. Exactly the kind of infra work that compounds. @gisship @N1colAIs congrats!

Nicolas Boizard @N1colAIs

about 2 months ago

🎉 Second paper this month! Introducing BERT-as-a-Judge (x @gisship) ⚖️ Evaluating LLMs with rigid lexical methods often fails right answers due to bad formatting. While "LLM-as-a-Judge" solves this, it remains costly & slow. Our fix? A lightweight, encoder-driven approach.

1

117

16

99

7K

0

4

0

3

470

PierreColombo6 retweeted

Nicolas Boizard @N1colAIs

about 2 months ago

🎉 Second paper this month! Introducing BERT-as-a-Judge (x @gisship) ⚖️ Evaluating LLMs with rigid lexical methods often fails right answers due to bad formatting. While "LLM-as-a-Judge" solves this, it remains costly & slow. Our fix? A lightweight, encoder-driven approach.

1

117

16

99

7K

PierreColombo6 retweeted

Niklas Muennighoff @Muennighoff

about 2 months ago

There's a wave of omni embedding models (gemini, nemotron, bidirlm). Excited to support this trend with our multimodal mteb versions (mieb, maeb) - video coming soon🎥

1

62

13

17

10K

Pierre Colombo @PierreColombo6

about 2 months ago

Omni embeddings are becoming the new standard. Glad to see @N1colAIs @Muennighoff pushing multimodal eval forward with MIEB & MAEB — can't wait for the video!

Niklas Muennighoff @Muennighoff

about 2 months ago

There's a wave of omni embedding models (gemini, nemotron, bidirlm). Excited to support this trend with our multimodal mteb versions (mieb, maeb) - video coming soon🎥

1

62

13

17

10K

0

5

2

0

648
