"I'd like to cancel my flight from Boston to..." You pause to check a date. The agent cuts in: "Got it, where to?" Now you're talking over it to finish your own sentence.
That's acoustic turn detection. Semantic VAD waits because it knows you're not done: https://t.co/1NPxkPGfyC
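A toy sketch of the difference, not any vendor's implementation. The thresholds, the word list, and all function names are assumptions for illustration. A real semantic VAD would use a trained model on the transcript rather than a word list.

```python
# Toy comparison of acoustic vs. semantic end-of-turn detection.
# Every constant and name here is a hypothetical, illustrative choice.

SILENCE_THRESHOLD_S = 0.5  # acoustic rule: a pause this long ends the turn
MAX_WAIT_S = 3.0           # semantic rule: stop waiting after this long

# Words that rarely end a finished English sentence (crude stand-in for a
# learned "is this utterance complete?" classifier).
DANGLING_WORDS = {"to", "from", "and", "or", "the", "a", "on", "for", "with"}


def acoustic_turn_ended(silence_s: float) -> bool:
    """End the turn as soon as the pause reaches a fixed threshold."""
    return silence_s >= SILENCE_THRESHOLD_S


def looks_complete(transcript: str) -> bool:
    """Guess whether the speaker finished the thought."""
    words = transcript.lower().rstrip(".?! ").split()
    return bool(words) and words[-1] not in DANGLING_WORDS


def semantic_turn_ended(transcript: str, silence_s: float) -> bool:
    """Require a pause AND a complete-looking utterance, with a timeout."""
    if silence_s < SILENCE_THRESHOLD_S:
        return False
    if looks_complete(transcript):
        return True
    # Utterance looks unfinished: keep listening until the timeout.
    return silence_s >= MAX_WAIT_S


partial = "I'd like to cancel my flight from Boston to..."
print(acoustic_turn_ended(0.8))           # acoustic rule cuts in
print(semantic_turn_ended(partial, 0.8))  # semantic rule keeps waiting
```

With the same 0.8-second pause, the acoustic rule ends the turn while the semantic rule holds off because the sentence trails off on "to".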
ElevenLabs just lost its moat 🤯
They charge $5 to $99/month for AI voice cloning. Their Business plan costs $1,320/month.
Someone open-sourced a Voice AI that clones any voice from just a 3-second audio clip, running 100% locally on your machine.
→ 646 languages - ElevenLabs supports 32
→ Voice design: gender, age, accent, pitch, emotion, dialect
→ Paste a YouTube URL: transcribes → translates → re-voices → MP4
→ Global dictation widget: ⌘+⇧+Space from any app
→ Demucs vocal isolation - keeps the background music
→ Pyannote diarization - auto-tags who said what
→ Batch queue: drop 50 videos, walk away
→ MCP server - call it from Claude or Cursor
→ AudioSeal watermarking (by Meta) baked in
100% Open Source. Already 3.6k stars.
GPT-5.6 Can Be Mythos-Level
- GPT-5.6 is expected to launch this week, alongside major updates to Codex
- It may reach Anthropic Mythos level at roughly one-half to one-third of the price
- It will bring major improvements in reasoning, frontend generation, personality, and agentic workflows
- Insiders say GPT-5.6 could have easily been called GPT-6
- A major focus appears to be efficiency, making agents faster, cheaper, and more practical for longer workflows
- This week could bring one of the biggest AI model launches of the year.
A Chinese lab just humiliated half the video industry.
You upload a photo and an audio clip, and out comes a lip-synced talking avatar. Open source.
What used to take an agency, a camera, and editing now looks like a repo.
It's called LongCat-Avatar.
OpenJarvis: a local-first personal AI is now available to run with Ollama
Built by Stanford’s @HazyResearch and Scaling Intelligence labs, as part of their “Intelligence Per Watt” research into efficient local AI. @Stanford
Learn more in the blog post 👇👇👇
Introducing Realtime TTS-2, a new generation of voice model built for realtime conversation.
It is the first voice model that hears the conversation, takes natural-language voice direction, holds one voice identity across more than 100 languages, and speaks like a person who is paying attention.
The result is voice AI that feels as good as it sounds.
Try it out: https://t.co/80xL7AJveV
Learn More: https://t.co/PLUiAEFizP
@GradiumAI @openclaw Great people, ideas and projects
@GradiumAI, keep doing the good work guys! We are using your system for our clients @ketsu :)
Thanks @ColinHeilbut for having us at your office
By 8 PM, Gizmo had booked Colin's massage.
Gizmo is built on @openclaw with Gradium as the Speech provider.
Just ask your Openclaw agents to switch to Gradium today.
https://t.co/hBFtw0JWo4
ElevenLabs just lost its moat 🤯
Someone has open-sourced a single app that replaces ElevenLabs AND WisprFlow and runs 100% locally.
→ Clone any voice from 3 seconds of audio
→ 7 TTS engines under one roof
→ 23 languages: Arabic, Hindi, Japanese, you name it
→ Built-in MCP server so Claude Code, Cursor, and Cline can speak back to you in a voice you cloned
→ Local LLM rewrites your voice in-character before TTS
→ Pedalboard effects (reverb, pitch shift, chorus) baked in
It's built on Tauri (Rust), not Electron. Runs on MLX for Apple Silicon, CUDA, ROCm, Intel Arc, DirectML, and CPU.
ElevenLabs Creator is $99/month. WisprFlow Pro is $15/month.
Voicebox is $0. 23.4K stars on GitHub. MIT license.
SOMEONE BUILT THE MOST COMPREHENSIVE CLAUDE CODE SYSTEM ON THE INTERNET AND OPEN SOURCED THE ENTIRE THING.
55 agents. 208 skills. 72 slash commands.
Built and won at the Anthropic x Cerebral Valley hackathon.
10 months of daily real-world use before it was ever published publicly.
This is not a collection of prompts someone threw together over a weekend.
This is a production-grade agent harness that has been stress-tested across thousands of real sessions and refined until it works reliably at scale.
Here is what you actually get when you install it.
55 specialized agents each built for a specific function. Not one agent trying to do everything. 55 agents each doing one thing exceptionally well.
208 skills covering every repeating workflow a serious builder runs. Research. Code review. Documentation. Testing. Deployment. Content. Analysis. Each one built once and callable forever.
72 slash commands that compress complex multi-step workflows into a single word.
A security scanner called AgentShield that audits your entire Claude Code configuration for vulnerabilities, misconfigurations, and injection risks across 5 categories before you deploy anything.
Cross-harness support for Claude Code, Codex, Cursor, OpenCode, and Gemini so the investment you make in this system is not locked to one tool.
A dashboard GUI with dark and light themes so you can monitor your entire agent operation from one screen.
Memory persistence that carries context across sessions so you never start from zero.
1,282 tests. 98% coverage. 102 static analysis rules.
This is the infrastructure layer most builders are trying to assemble piece by piece from 15 different repos.
Someone already built the complete version. Won a hackathon with it. Then gave it away for free.
The builders who install it this weekend will have a Claude Code setup that took 10 months of daily iteration to build.
Installed in one afternoon.
https://t.co/2Er9PREAih
Star it. Fork it. Build on top of it.
Bookmark this.
Follow @cyrilXBT for every Claude Code repo worth your weekend the moment it surfaces.
Hermes plus OpenClaw plus Paperclip is a full AI company you run yourself.
And each one plays a different role.
Here's how the stack fits:
→ Hermes is the memory. It learns and remembers across chats.
→ OpenClaw is the hands. It runs your apps, files, and browser.
→ Paperclip is the manager. It assigns tickets and caps budgets.
→ Hire a CEO, engineer, or marketer agent for each goal.
→ One developer runs 15 agents across 3 machines.
One guy fixed a production bug by voice while walking his dog.
Approve big calls. Review the audit logs. Stay in control.
The people who layer these three first will run a real team while everyone else types one prompt.
Want the SOP? DM me. 💬
Anthropic's most viral feature is now open-source!
Until now, Anthropic's Generative UI capabilities only existed inside its own products.
@CopilotKit just shipped Open Generative UI, an open-source implementation of Claude Artifacts that works in any app.
The agent generates HTML/SVG at runtime, and CopilotKit streams it token-by-token into a sandboxed iframe inside the app's chat.
So the user can watch the UI assemble itself in real time, not after the full response is ready.
The sandbox is fully isolated with no access to the parent app, the DOM, or user data. So if the agent hallucinates broken markup or unexpected JavaScript, nothing leaks outside the iframe.
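A minimal sketch of the isolation idea, not CopilotKit's actual code. The function names are hypothetical, and re-rendering the frame on every chunk is an assumption made to keep the example short. It relies only on standard HTML: an iframe whose sandbox attribute omits allow-same-origin gets an opaque origin, so its scripts cannot read the parent page's DOM, cookies, or storage.

```python
import html
from typing import Iterable, Iterator


def sandboxed_iframe(generated_markup: str, allow_scripts: bool = True) -> str:
    """Wrap untrusted, agent-generated markup in a sandboxed iframe.

    Hypothetical helper for illustration. "allow-same-origin" is
    deliberately never added, so the frame stays on an opaque origin.
    """
    sandbox = "allow-scripts" if allow_scripts else ""
    # Escape the markup so it is safe inside the srcdoc attribute value.
    srcdoc = html.escape(generated_markup, quote=True)
    return f'<iframe sandbox="{sandbox}" srcdoc="{srcdoc}"></iframe>'


def stream_frames(chunks: Iterable[str]) -> Iterator[str]:
    """Yield a fresh sandboxed frame after each streamed chunk arrives.

    Assumption: the simplest possible strategy, rebuilding the whole
    frame from the accumulated buffer on every chunk.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        yield sandboxed_iframe(buffer)


for frame in stream_frames(["<h1>Sal", "es</h1>", "<svg></svg>"]):
    print(frame)
```

Each printed frame shows the UI one step further along, which is the "assembles itself" effect. Partial markup is tolerated because browsers parse incomplete HTML leniently.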
Under the hood, the agent does not select from pre-built components. Instead, it generates arbitrary visuals from scratch every time.
The output is unconstrained by default, but you can shape it by defining prompt-based skills that teach the agent specific visual formats or guidelines.
For instance, a skill prompt can guide the agent toward producing a Chart.js dashboard with proper axis labels and responsive sizing, or an interactive 3D model with rotation controls.
The video below shows this in action, and the output quality you see actually comes from the skills layer.
Open Generative UI runs on AG-UI, so it works out of the box with LangGraph, CrewAI, Mastra, Google ADK, AWS Strands, and more.
It also ships with a standalone MCP server that plugs into Claude Code, Cursor, or any MCP-compatible client.
And the entire stack is built on top of CopilotKit, the open-source frontend framework for agents and generative UI. 30k+ GitHub stars, with SDKs for React, Next.js, Angular, and Vue.
I have shared the GitHub repo and a live playground in the replies!