Hermes Agent is now #1 on the Global @OpenRouter token rankings.
While our journey together has just begun, we'd like to take this opportunity to thank our contributors, supporters, and users for all they have done to get us this far.
3.1 Pro can generate website-ready, animated SVGs from a simple text prompt. Since these are built in pure code and not pixels, they stay crisp at any scale with incredibly small file sizes. Check out the difference:
Today, we’re making big upgrades to the performance and availability of our most popular Gemini features, and adding a new feature that will make Gemini even more personal and helpful: https://t.co/kc685gizg6
Let’s break down today’s updates 🧵⬇️
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length
Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long context capabilities, supporting millions of tokens of multimodal input. The multimodal capabilities of the model mean you can interact in sophisticated ways with entire books, very long document collections, codebases of hundreds of thousands of lines across hundreds of files, full movies, entire podcast series, and more.
Gemini 1.5 was built by an amazing team of people from @GoogleDeepMind, @GoogleResearch, and elsewhere at @Google. @OriolVinyals (my co-technical lead for the project) and I are incredibly proud of the whole team, and we’re so excited to be sharing this work and what long context and in-context learning can mean for you today!
There’s lots of material about this, some of which is linked below.
Main blog post:
https://t.co/QAsDKXBdao
Technical report:
“Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context”
https://t.co/CTzTHNDCdo
Videos of interactions with the model that highlight its long context abilities:
Understanding the three.js codebase: https://t.co/yq7d6OSD6c
Analyzing a 45-minute Buster Keaton movie: https://t.co/adyMgDYHoK
Apollo 11 transcript interaction: https://t.co/Pqvq3Eac1R
Starting today, we’re offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on these blogs:
Google for Developers blog:
https://t.co/x73Vun0kVS
Google Cloud blog:
https://t.co/OlaTW6PYGn
We’ll also introduce 1.5 Pro with a standard 128,000-token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000-token context window and scale up to 1 million tokens as we improve the model.
Early testers can try the 1 million token context window at no cost during the testing period. We’re excited to see what developers’ creativity unlocks with a very long context window.
Let me walk you through the capabilities of the model and what I’m excited about!
As you know, my explorations of the Gen AI space are ultimately all about creative control. You should be able to shape the generative matter using all your artistic sensibilities and your aesthetic sense.
OpenAI's Sora is a huge technological leap, but what excites me most about it is the modalities where it takes input other than text alone, such as video-to-video. Here's an example of how Sora can change an input video.
Base video🧵
Google DeepMind researchers just dropped a HUGE advancement in robotics 🤖
It is truly mind-blowing.
A "robot in every household" is closer than we think.
Here's everything you need to know about the Mobile ALOHA robot (& more demo videos):
👇