Voice agents are built as speech-to-speech (s2s) or as cascades of transcription, llm and synthesis. The former is fast, the later is smart.
@SakanaAILabs presents KAME: combining LLMs with s2s. Fast and smart!
https://t.co/lmkJLGWHUY
Curious about running your AI models on the @AxeleraAI Metis using GStreamer?
We’ve put together a starter guide to help you integrate your AI accelerator seamlessly into your applications with the power of GStreamer pipelines.
https://t.co/Yn4cfObVOU
The Metis M.2 lived up to the hype! We're excited to explore its full potential in upcoming projects. Kudos to the @AxeleraAI team for their constant releases and their quick support.
Great work from @ridgerunai getting the Metis M.2 AI accelerator running on Raspberry Pi 5 🔥
They break down setup, deployment with Voyager SDK & real-world benchmarks. Great guide if you’re into embedded AI, or looking for someone to help with your AI project/build 👇
🔗 https://t.co/Hl6gWHRH03
#EdgeAI #RaspberryPi #MetisAIPU @Raspberry_Pi
The result? 𝗝𝘂𝗻𝗶𝗽𝗲𝗿 outperforms GPT-4o in our function-calling tasks and other standard benchmarks, all while running efficiently on limited hardware. 🚀
LLMs excel at function calling, but they’re often too resource-intensive for edge devices. In our latest blog post we share how we fine-tuned a compact, local model: 𝗝𝘂𝗻𝗶𝗽𝗲𝗿, to perform reliable tool invocation.
https://t.co/c4N3RRyedy
🚀 Automate 2025 is here! 🤖✨
We’re excited to be part of the Automate Show this week! Come meet the https://t.co/8kgzoF0KYI team and see how our cutting-edge AI solutions can supercharge your next project. 💡💻
📍 Want to chat? Let’s connect — our engineers are ready to meet you at the show! 🤝
A few months ago @AnthropicAI announced the Model Context Protocol: an open standard to interconnect language models with external tools and data sources.
Learn how to build your own MCP server in our most recent post:
https://t.co/cCXyBoQpDX
Just in! @NVIDIAAI just took back the speech transcription leaderboard 🏆! Their newest model parakeet-tdt-0.6b-v2 🦜, not only has the lowest word error rate, but has a commercial friendly license!
Try it out in our HuggingFace space 🤗
https://t.co/rqHrfUzrpk
Did you know you can use OpenAI's Python 🐍 API to interact with models hosted locally in your computer? Thanks to Ollama, you can keep your conversations private, offline and secure. Learn how achieve this in our developer's blog📚:
https://t.co/G3fiyfqgWL
Did you miss the #AIIndex2025? No worries! Here's a 10 slide summary of the state of the AI market in 2025, courtesy of @StanfordHAI
Are you ready to add #AI to your project? Let's talk!
https://t.co/IdeHUaXA31
"Pre-training as we know it will end" - Ilya Sutskever
Agentic systems are LLMs with superpowers, and a step in the direction of building truly autonomous systems.
Check out @OpenAI guide to building agents:
https://t.co/DrlaM3ifUi
Calling all makers to join our #InteractiveSignage Contest 2025! Inviting you to explore how smart tech, #IoT, and #AI can transform signs into experiences, making spaces more efficient, comfortable, and engaging.
Submit your project to win $3,000 + and get featured in the smart cities of tomorrow! https://t.co/t6X5vyKj0W
💫 Simplify your AI agent development.
Google's new open-source Agent Development Kit (ADK) is designed for effortless multi-modal and multi-agent systems.
Get started → https://t.co/uUiYKKgbaE
Today is the start of a new era of natively multimodal AI innovation.
Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.
Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.
Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters.
• Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena.
These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight.
Read more about the first Llama 4 models, including training and benchmarks ➡️ https://t.co/9G3QgVdCkB
Download Llama 4 ➡️ https://t.co/eVomRvEr0w
Have you considered creating a custom ChatBot to effortlessly answer questions about your company's internal knowledge base? 🤖💬
At https://t.co/8kgzoF1iOg, we developed an on-premise Retrieval Augmented Generation (RAG) system from scratch and documented our journey for you. 🚀📚
RAG technology enables your business to leverage AI practically and immediately, enhancing productivity and streamlining daily operations.
Learn about our process, insights, and how RAG systems can directly benefit your team's efficiency:
https://t.co/712XTtDXHj
"The AI Biology of LLMs"
Two new wonderful papers by @AnthropicAI following their journey to understand mechanistic interpretability of language models.
If you've ever read the series, you know these are great.
New Anthropic research: Tracing the thoughts of a large language model.
We built a "microscope" to inspect what happens inside AI models and use it to understand Claude’s (often complex and surprising) internal mechanisms.