This release represents a huge team effort. So proud of this team! It's a joy to work alongside such brilliant people, and I can't wait to see how the research community uses @TranslateGemma in the future!
#AI#Gemma3
🗣 Introducing TranslateGemma, our new collection of open translation models built on Gemma 3.
The model is available in 4B, 12B, and 27B parameter sizes, and furthers communication across languages, no matter what device you own. https://t.co/dniQD3RPKP
🎓Completed a PhD on Machine Translation, multilingual, or cross-lingual NLP?
Nominate a standout thesis (or your own!) for the AMTA Best Thesis Award. Winner gets $1000, an invited talk at #AMTA2026 & more.
🗓️ Deadline: 15 May 2026
ℹ️Info: https://t.co/Aa8qlSDfK6
#MT#NLP#PhD
Every month, people use Google to translate around 1 trillion words. Today, we’re introducing a new AI-powered live translation experience in the Google Translate app, plus a new beta feature to help you practice new languages. Rolling out now on iOS + Android.
Our Google Translate team is bringing a strong presence to #ACL2025 in Vienna this week! 🇦🇹 My group is excited to present several of our latest papers. 👇 Don't miss them!
MT metrics excel at evaluating sentence translations, but struggle with complex texts
We introduce *TREQA* a framework to assess how translations preserve key info by using LLMs to generate & answer questions about them
https://t.co/aHUScXzoBM
(co-lead @swetaagrawal20)
1/15
😼SMOL DATA ALERT! 😼Anouncing SMOL, a professionally-translated dataset for 115 very low-resource languages! Paper: https://t.co/HISmFuKe8I
Huggingface: https://t.co/TPCFw01rh0
🚨New machine translation dataset alert! 🚨We expanded the language coverage of WMT24 from 9 to 55 en->xx language pairs by collecting new reference translations for 46 languages in a dataset called WMT24++
Paper: https://t.co/owplgurKHP
Data: https://t.co/ODxHUEq5Xl
😼SMOL DATA ALERT! 😼Anouncing SMOL, a professionally-translated dataset for 115 very low-resource languages! Paper: https://t.co/HISmFuKe8I
Huggingface: https://t.co/TPCFw01rh0
Two new datasets from Google Translate targeting high and low resource languages!
WMT24++: 46 new en->xx languages to WMT24, bringing the total to 55
SMOL: 6M tokens for 115 very low-resource languages
WMT24++: https://t.co/eDU1htGhZt
SMOL: https://t.co/y2xQWOXi5W
🚨New machine translation dataset alert! 🚨We expanded the language coverage of WMT24 from 9 to 55 en->xx language pairs by collecting new reference translations for 46 languages in a dataset called WMT24++
Paper: https://t.co/owplgurKHP
Data: https://t.co/ODxHUEq5Xl
Highly recommend everyone to read Yusuf's work on data contanimation! He trained almost 90 models investigating how different types of contanimation affect test set scores! Very insigthful paper.
Thrilled to share our latest findings on data contamination, from my internship at @Google! We trained almost 90 Models on 1B and 8B scales with various contamination types using machine translation as our task and analyze the impact of contamination.
https://t.co/4AjY5jSgX8