Grateful that @UNICEF's Bebbo Parenting Programme was recognized as a Champion at #WSIS2025 in Geneva! 🏆
When a child is born, a parent is born too. Every child deserves the best start in life, and every parent deserves support to make it possible. Bebbo is an open-source digital parenting programme supporting 1.5M parents across 17 countries 🌍 with evidence-based guidance from pregnancy through age 6.
Huge congrats to our incredible Central Bebbo team, Country Offices, and government partners! 💙
#DigitalParenting #EarlyChildhoodDevelopment #DigitalPublicGoods #UNICEF #Bebbo #wsis20
New Anthropic research: Why do some language models fake alignment while others don't?
Last year, we found a situation where Claude 3 Opus fakes alignment.
Now, we’ve done the same analysis for 25 frontier LLMs—and the story looks more complex.
Evaluations are essential to understanding how models perform in health settings.
HealthBench is a new evaluation benchmark, developed with input from 250+ physicians from around the world, now available in our GitHub repository.
https://t.co/s7tUTUu5d3
Today we're announcing Integrations, a new way to connect your apps and tools to Claude.
We're also expanding Claude's Research capabilities with an advanced mode that searches the web, your Google Workspace, and now your Integrations too.
Major updates from LlamaCon!
We’re advancing AI security with new open-source Llama protection tools and new AI-powered solutions for the defender community.
Developers can now access:
-- Llama Guard 4, a customizable safeguard that supports protections for text and image understanding across modalities.
-- Llama Firewall, a security guardrail tool that helps build secure AI systems by detecting and preventing risks like prompt injection, insecure code, and risky LLM plug-in interactions.
-- Two new versions of Llama Prompt Guard: Prompt Guard 2 86M, which improves performance in jailbreak and prompt injection detection, and Prompt Guard 2 22M, a smaller, faster version that reduces latency and compute costs with minimal performance trade-offs.
We’re also investing in new AI-enabled solutions to help the community enhance their security systems.
-- CyberSecEval 4 is our latest suite of cybersecurity benchmarks for AI systems.
-- The Llama Defender Program will help trusted partners access a variety of open, early-access, and closed AI solutions to address different security needs.
Learn more about our new open-source protection tools and how we’re advancing AI privacy and security: ➡️ https://t.co/WXBijN3ajY
Claude can also now connect with your Gmail, Google Calendar, and Docs.
It understands your context and can pull information from exactly where you need it.
AI Mode is now available to millions more Labs users in the US 🚀 and we’re adding the power of Lens so you can easily search what you see. With AI Mode, you can…
✅ Ask your toughest questions and get an AI-powered response
✅ Ask any way you want, using text, voice, your camera or an image with Lens
✅ Explore more with follow-up questions and helpful web links
Read more on what’s new for AI Mode here ↓ https://t.co/32jBaGSEOR
Wow, this is absolutely amazing and super interesting! The insights on CoT faithfulness in reasoning models are eye-opening. Great work, @AnthropicAI! #AI #MachineLearning
New Anthropic research: Do reasoning models accurately verbalize their reasoning?
Our new paper shows they don't.
This casts doubt on whether monitoring chains-of-thought (CoT) will be enough to reliably catch safety issues.
Today is the start of a new era of natively multimodal AI innovation.
Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.
Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.
Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters.
• Unparalleled performance-to-cost ratio with a chat version scoring an Elo of 1417 on LMArena.
These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight.
Read more about the first Llama 4 models, including training and benchmarks ➡️ https://t.co/9G3QgVdCkB
Download Llama 4 ➡️ https://t.co/eVomRvEr0w
Today we're sharing a major update to the Model Spec—a document which defines how we want our models to behave.
The update reinforces our commitments to customizability, transparency, and intellectual freedom to explore, debate, and create with AI. https://t.co/EPbqDp0Sdj
New Anthropic research: Evaluating feature steering.
In May, we released Golden Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature steering”. We've now done a deeper study on the effects of feature steering.
Read the post: https://t.co/2NTQfChhbZ
OpenAI's new o1 model is a BIG breakthrough in AI intelligence, if IQ tests say anything.
I gave it the Norway Mensa IQ test, and it blows other AIs out of the water.
I'm surprised, because there hadn't been public progress in the last 6 months.
Link to full analysis below:
Well done, Google! 🚀 Just explored Google's NotebookLM and its impressive new feature that transforms your research into a podcast-style conversation. I tested it with UNICEF's latest publication on parenting support framework (https://t.co/wJKI6DXnkF), and the results were remarkable. I highly recommend giving it a try!
#AI #NotebookLM #Google
Can OpenAI's o1 model revolutionize children's education? 🤔 Are we really ready to embrace this new age of thoughtful learning? 🚀
#OpenAIo1 #artificialintelligence
Requesting urgent help: I bought a Nothing Phone (2a), but it is not working on international roaming. Nothing customer support says that this phone only works with Indian telecom providers. Very strange. This information was not mentioned anywhere in the description. Can you please help?
Bebbo Parenting App received the Digital Public Good status from @DPGAlliance!
UNICEF's Bebbo app was recognized for empowering families worldwide with high-quality parenting resources. Download the app and stay tuned for valuable content and support: https://t.co/7aSLD0bk97.
As a parent, you play an important role in supporting your child's mental health and well-being.
If you are angry at your child who has done something wrong, manage the feeling by finding appropriate ways to deal with emotions 👇
ℹ️ https://t.co/ueVS5Ikl2q