Stream's agent skills can create an entire messaging app in one prompt!
The skills explain to your coding agent how to build a Stream app.
Prompt it, make some tea, come back to your working app. 🫖
Link in 🧵
Real-time avatars are now available in Vision Agents with @Anam__ai to bring custom, responsive experiences to the world!
Here's one cleverly deciding not to trust @stefanjblos with all its company's money... write-up in 🧵
To pineapple or to not pineapple?
@max_does_tech built a voice agent using @inworld_ai's Realtime API to answer the age-old question
Quickstart: https://t.co/DxwUujt9MZ
Example code: https://t.co/VZQPfioVSm
This entire example is only 87 lines of code 🤯
This fully local processor pipeline with @huggingface Transformers object inference and segmentation is running 100% in real time on a MacBook with the @visionagents_ai SDK!
v0.5.0 of the Vision Agents SDK is out now!
New in this release: run agents directly on your hardware, Anam avatar integration, way faster @DeepgramAI TTS, @AssemblyAI support, and much more.
Details in 🧵
Let's build a vision + voice agent with the new Gemini 3.1 Flash Live model 🔥
Following this tutorial, you'll build a multimodal agent that helps you sell your used items!
Check out the full tutorial on the @googledevs YouTube channel!
https://t.co/AAIJZ4fwaY
Gemini 3.1 Flash Live just dropped. Check out our demo with it!
This @googleaidevs native audio model now comes with lower latency, stronger instruction following and more reliable tool calling.
We'll share the full demo soon!
Using @roboflow's Neural Architecture Search to make a video moderation bot with under 1.8ms average inference time!
In this demo we're able to moderate video coming in on a video call so quickly that you almost don't see the offending content before it's censored
added Cursor CLI with Composer 2 to the eval based on community feedback
#2 coding agent for vision tasks
updated blog post if you want to see the details
https://t.co/JORJUC85YT
would like to add more evals, share what issues you're having with vision tasks and we can add them
Here's @NVIDIAAIDev Nemotron-3-Super-49B used in a real-time Vision Agents application as a fraud assistant! You can see every action it takes, and it's all happening in real time 😮
Using the Nemotron model hosted on @baseten for reliability.
v0.4 of Vision Agents is here! Here's the most intense video possible to catch you up on the big hits 🤷‍♂️ Link to the VA GitHub (7K stars, join us!) in bio
Telling stories with the new GPT-5.4 and got into a fight with it... I want to end the story but it wants to keep playing.
This is a Vision Agents app where the video is being streamed by an SFU to a k8s instance running the actual agent
This is @GoogleDeepMind Gemini 3.1 Flash-Lite responding in real time in a Vision Agents app. It's able to handle a lot of different video understanding questions much more quickly than the previous gen... and this is on release day, when everyone's hitting the API!