Introducing Gemma Scope 2
🤗Largest open release of interpretability tools (over 1 trillion parameters trained!)
🔬Works as a microscope to analyze all Gemma 3 models' internal activations
🗣️Advanced tools for analyzing chat behaviors
I pulled a random street from Google Maps.
Turned it into a 3D world with World Labs.
Dropped a Unitree G1 into it and hooked it up to our server-side MuJoCo setup.
Now the G1 is actually walking around a real Malaysian street in my browser.
Three.js is doing the rendering. MuJoCo is running the physics and walking policy on the server.
A simple WebSocket keeps everything locked together.
Real street → 3D world → server physics + policy → live control in the browser.
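For anyone curious what the server half of that loop could look like, here's a rough sketch, not the actual setup. Assumptions: `RobotState`, `step_physics`, and `run_server_loop` are made-up names, the physics is a toy point-mass integrator standing in for MuJoCo and the walking policy, and `send` is any callable standing in for the WebSocket send. The browser side (Three.js) would just parse each JSON frame and move the mesh.

```python
import json
import math
from dataclasses import dataclass, asdict
from typing import Callable, Dict, Iterable


@dataclass
class RobotState:
    """Hypothetical minimal state the browser needs to draw the robot."""
    step: int
    x: float
    y: float
    yaw: float


def step_physics(state: RobotState, command: Dict[str, float], dt: float) -> RobotState:
    """Stand-in for the server-side physics + policy step (NOT MuJoCo).

    Turns by the commanded rate, then moves along the new heading.
    """
    yaw = state.yaw + command.get("turn", 0.0) * dt
    speed = command.get("forward", 0.0)
    return RobotState(
        step=state.step + 1,
        x=state.x + speed * math.cos(yaw) * dt,
        y=state.y + speed * math.sin(yaw) * dt,
        yaw=yaw,
    )


def run_server_loop(
    commands: Iterable[str],
    send: Callable[[str], None],
    dt: float = 0.02,
) -> RobotState:
    """Consume JSON commands, step the sim, and emit one JSON state frame per step.

    In a real deployment `commands` would come from the socket and `send`
    would write to it; both are plain Python objects here.
    """
    state = RobotState(step=0, x=0.0, y=0.0, yaw=0.0)
    for raw in commands:
        command = json.loads(raw)
        state = step_physics(state, command, dt)
        send(json.dumps(asdict(state)))
    return state


if __name__ == "__main__":
    frames = []
    walk_forward = [json.dumps({"forward": 1.0, "turn": 0.0})] * 50
    run_server_loop(walk_forward, frames.append)
    print(frames[-1])
```

Keeping the state authoritative on the server means the browser never simulates anything; it only renders the latest frame it received.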
Kinda feels like cheating. Physical AI is getting wild.
Huge shoutout to @theworldlabs. Marble makes this stuff way too fun.
#robotics #MuJoCo #WorldLabs #simulation #unitree #sim2real
omg.. this can't be real
China’s 4DV AI just dropped 4D Gaussian Splatting, you can turn 2D video into 4D with sound..
imagine.. we will be able to change camera angle, zoom in/out while watching movies
5 examples:
Crazy to see how far AI video has come in the last few months
The handling of shadows and light here feels like a new standard 👇
(from @endlesstaverns, made with Runway Gen-3 + Suno)
3D Gaussian Splatting is great, but how can you interact with it 🌹👋? Introducing #PhysDreamer: Create your own realistic interactive 3D assets from only static images! Discover how we do this below👇 🧵1/:
Website: https://t.co/2tkKuyaaDx
Announcing 𝐕𝐨𝐢𝐜𝐞𝐂𝐫𝐚𝐟𝐭🪄
SotA for both speech editing and zero-shot text-to-speech, outperforming VALL-E, XTTS-v2, etc.
VoiceCraft works on in-the-wild data such as movies, random videos and podcasts
We fully open source it at https://t.co/Fpqg9D4nUB
I am really excited to reveal what @GoogleDeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
In our new preprint, we ask: Do multilingual LLMs trained mostly on English use English as an “internal language”? - A key question for understanding how LLMs function.
“Do Llamas Work in English? On the Latent Language of Multilingual Transformers”
https://t.co/C9a907ByAL
Aya model is a new open-source massively multilingual language model.
It was instruction fine-tuned by people from all over the world over the course of a year!
Aya is unique in supporting 101 languages!
It's the next step in building truly multilingual models.
Great week for ML and multilingualism!
- French with CroissantLLM https://t.co/JqJX36RL8a
- Basque with Latxa https://t.co/xKpE3QBEsu
- Chinese with Qwen 1.5 https://t.co/TCH6TLGIxZ
- Dutch with GEITje https://t.co/0UuFIdve8z
With collaborators @Google we're announcing 💫 ZipLoRA 💫! Merging LoRAs has been a big thing in the community, but tuning can be an onerous process. ZipLoRA allows us to easily combine any subject LoRA with any style LoRA! Easy to reimplement 🥳
link: https://t.co/KzTgEI9e8a
TL;DR: We use SDS from text-to-video models to animate vector-graphics sketches!
Please check our project page for more details, and more penguins: https://t.co/orac3O2S65
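Quick refresher if SDS is new to you: score distillation sampling, as introduced in DreamFusion, nudges the parameters θ of a differentiable renderer so its output looks plausible to a frozen diffusion model. The standard form is below; how it gets adapted to video models and vector strokes isn't spelled out in this post, so read the symbols as the generic ones (x = g(θ) is the rendered output, y the text prompt, ε̂_φ the model's noise prediction).

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta), \quad x_t = \alpha_t x + \sigma_t \epsilon
```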
Stop using text-to-image.
This Blender + real-time latent consistency workflow is way more fun, and shows how you can use generative AI collaboratively instead of as a creative slot machine.
try it yourself here https://t.co/QyiR5IDQI2
The fact that most individual neurons are uninterpretable presents a serious roadblock to a mechanistic understanding of language models. We demonstrate a method for decomposing groups of neurons into interpretable features with the potential to move past that roadblock.