Aliaksei Severyn

about 1 year ago

Gemini is unstoppable! More to come!

about 1 year ago

🚨Breaking: New Gemini-2.5-Pro (06-05) takes the #1 spot across all Arenas again!

🥇 #1 in Text, Vision, WebDev
🥇 #1 in Hard, Coding, Math, Creative, Multi-turn, Instruction Following, and Long Queries categories

Huge congrats @GoogleDeepMind! https://t.co/gYYjkuJsX4

aseveryn retweeted

Jeff Dean

@JeffDean

about 1 year ago

Our best model*, Gemini 2.5 Pro, is now available for everyone in the Gemini app model drop down menu at https://t.co/xR9jN4XMCO. Give it a try with your most difficult questions! *For now! 😃

aseveryn retweeted

about 1 year ago

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆

Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer Query, and Multi-Turn!

Massive congrats to @GoogleDeepMind for this incredible Arena milestone! 🙌

More highlights in thread👇

aseveryn retweeted

Google Gemini

@GeminiApp

over 1 year ago

Starting today, Gemini Advanced users get priority access to our latest 2.0 Experimental Advanced model, Gemini-Exp-1206. This model is designed to help you with more complex tasks such as:

🧑‍💻 Advanced coding challenges
🧮 Solving math problems
🧠 Reasoning & instruction following

Subscribe at https://t.co/QMUII87ebx to preview Gemini-Exp-1206. (You can try Gemini Advanced for one month, at no charge!)

over 1 year ago

Our new Gemini model is top across all dimensions in LMSYS! Huge congrats to the team! Soon in prod.

over 1 year ago

Gemini-Exp-1206 tops all the leaderboards, with substantial improvements in coding and hard prompts. Try it at https://t.co/gxIFU9kIc2 ! https://t.co/GE68XZofjp

aseveryn retweeted

over 1 year ago

Gemini-Exp-1206 tops all the leaderboards, with substantial improvements in coding and hard prompts. Try it at https://t.co/gxIFU9kIc2 !

aseveryn retweeted

over 1 year ago

Massive News from Chatbot Arena🔥

@GoogleDeepMind's latest Gemini (Exp 1114), tested with 6K+ community votes over the past week, now ranks joint #1 overall with an impressive 40+ score leap, matching 4o-latest and surpassing o1-preview! It also claims #1 on the Vision leaderboard.

Gemini-Exp-1114 excels across technical and creative domains:

- Overall: #3 -> #1
- Math: #3 -> #1
- Hard Prompts: #4 -> #1
- Creative Writing: #2 -> #1
- Vision: #2 -> #1
- Coding: #5 -> #3
- Overall (StyleCtrl): #4 -> #4

Huge congrats to @GoogleDeepMind on this remarkable milestone!

Come try the new Gemini and share your feedback!

Google DeepMind @GoogleDeepMind

almost 2 years ago

Very excited about the Gemma 2.0 (27B) release. It is the best open weights model according to the LMSYS leaderboard, ranking higher than Llama3-70b. https://t.co/XUmuENOgg4

almost 2 years ago

We're excited to unveil Gemma 2. 🛠️ Available in both 9B and 27B parameters, it delivers the best performance for its size - unlocking more possibilities for developers to build and deploy with AI. → https://t.co/RmRLEdFPMV

aseveryn retweeted

Clément Farabet

@clmt

almost 2 years ago

Gemma 2 is out! As with our first model, we're super focused on creating models at useful, practical sizes, so that they can be easily deployable... all the while being amazing in quality.

We upgraded our 9B so that it's truly awesome and best in class across many benchmarks. And we're introducing a brand new 27B, also best at size, and actually stronger than some larger models. Both did real nice on LMSYS.

The 27B Gemma 2 model is designed to run inference efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU.

And of course, this is our open weights model line... enjoy!

https://t.co/TmgaJH52Zi - try it in AI Studio https://t.co/ypeIKONwSC
More in the tech report => https://t.co/2wnb6dIRWH

aseveryn retweeted

Jeff Dean

@JeffDean

about 2 years ago

Gemini 1.5-Pro is a pretty good reward model (tops among generative models and second overall in the Reward Bench Leaderboard). https://t.co/tvlGdc6Y9M

about 2 years ago

Gemini 1.5 Pro when zero-shot prompted to perform an LLM-as-a-judge task ranks 1st when compared to other Generative RMs and 2nd best overall vs other dedicated RMs: https://t.co/SpYoU9f49Q (make sure to click on the Generative checkbox).

aseveryn retweeted

Jeff Dean

@JeffDean

over 2 years ago

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long context capabilities, supporting millions of tokens of multimodal input. The multimodal capabilities of the model mean you can interact in sophisticated ways with entire books, very long document collections, codebases of hundreds of thousands of lines across hundreds of files, full movies, entire podcast series, and more.

Gemini 1.5 was built by an amazing team of people from @GoogleDeepMind, @GoogleResearch, and elsewhere at @Google. @OriolVinyals (my co-technical lead for the project) and I are incredibly proud of the whole team, and we’re so excited to be sharing this work and what long context and in-context learning can mean for you today!

There’s lots of material about this, some of which is linked below.

Main blog post:
https://t.co/QAsDKXBdao

Technical report:
“Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context”
https://t.co/CTzTHNDCdo

Videos of interactions with the model that highlight its long context abilities:
Understanding the three.js codebase: https://t.co/yq7d6OSD6c
Analyzing a 45 minute Buster Keaton movie: https://t.co/adyMgDYHoK
Apollo 11 transcript interaction: https://t.co/Pqvq3Eac1R

Starting today, we’re offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on these blogs:
Google for Developers blog:
https://t.co/x73Vun0kVS
Google Cloud blog:
https://t.co/OlaTW6PYGn

We’ll also introduce 1.5 Pro with a standard 128,000 token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000 context window and scale up to 1 million tokens, as we improve the model.

Early testers can try the 1 million token context window at no cost during the testing period. We’re excited to see what developers’ creativity unlocks with a very long context window.

Let me walk you through the capabilities of the model and what I’m excited about!
