Breno Thales

@brenothales

Software Engineer

Florianópolis, Brasil

Joined August 2009

1.4K Following

369 Followers

170 Posts

brenothales retweeted

Rahul

@sairahul1

4 days ago

This is the best site on the internet to learn harness engineering. Free. Completely. Most AI engineers have never heard the term. https://t.co/bwDbTTYsjM Bookmark this site. Then read this setup ↓

sairahul1's tweet photo. This is the best site on the internet to learn harness engineering.

Free. Completely.

Most AI engineers have never heard the term.

https://t.co/bwDbTTYsjM

Bookmark this site.

Then read this setup ↓ https://t.co/ddEP0XowXM

438

437K

brenothales retweeted

Avi Chawla

@_avichawla

7 days ago

Anthropic's in trouble, again! They spent years building what's now fully open-source. What made Claude feel different from a normal app is that the agent could act inside the interface instead of only talking in a chat box. For instance, Claude Artifacts let an agent render real UI, charts, dashboards, and interactive components that assemble live inside the response. Every major AI product tried to replicate it. But the problem was that unlike reasoning, planning, tool-calling, etc., none of it shipped natively with LangGraph, CrewAI, or Google ADK. So teams started building an owned version that required engineering the entire interface layer from scratch. Most teams, however, just settled for shipping the agent as a backend API in a chat box since rendering the UI is only one piece of it. To actually make it work, the interface layer also needed real-time streaming, state kept in sync between agent and UI, conversations that persist across sessions, and reconnection when a user refreshes mid-run. @CopilotKit is now the only open-source framework that actually lets you build your own full-stack Claude-like apps. It decouples the agent from the interface, talking over AG-UI (an open protocol for agent-to-user communication). Being a standard protocol, the frontend never needs to know whether it is talking to a LangGraph or a CrewAI agent. You can change the backend anytime and the UI will never notice. In practice, CopilotKit's interface layer gives several pre-implemented React building blocks that wire the agent directly into the app, like: - generative UI, so the agent renders real components instead of text - chat windows, sidebars, and popups, or a fully headless setup - shared state, so the agent and app stay in sync - human-in-the-loop approvals, where the agent waits before acting - persistent threads that store the whole session, including the agent-user interactions and generated UI, not just text And because that full history is captured, those interactions can feed a self-learning layer that also improves the agent from real usage over time. The interface layer that Anthropic spent years engineering in-house is now literally available to any developer/team. CopilotKit is open-source with 30k+ GitHub stars, and AG-UI, the protocol underneath, is already supported across every major agent framework: LangGraph, CrewAI, Mastra, Google ADK, and more. CopilotKit GitHub repo → https://t.co/wkQ1taF0rM (don't forget to star it ⭐ ) If you want to go deeper, I found a detailed breakdown by Shubham Saboo recently on the three Generative UI patterns, with implementation. Read it below.

231

447K

brenothales retweeted

唐华斑竹🦅

@uniswap12

7 days ago

微软开源了一个语音 AI，60 分钟长音频一次转写，4 个人同时说话都能搞定 VibeVoice，微软开源，24.8k star，今天才知道这个。录音一键转文字这件事，我之前一直用 Whisper，但它处理长会议录音经常超时，多人说话识别错得也挺厉害的。 VibeVoice 这个直接支持 60 分钟连续音频，自带说话人分离和时间戳，四个人同时说话的场景也能分清楚谁说了什么。最让我意外的是 TTS 这边，4 个角色同时合成，90 分钟连贯输出，声音全程不跑偏。想做有声书或者播客内容的同学应该会很感兴趣，以前多角色合成经常前后声音不一致，这个解决了。底层是 Qwen2.5 加了专门的连续语音 tokenizer，还有个 0.5B 的轻量版本，300ms 延迟，可以直接接进对话 AI 做实时语音交互，不用再单独接第三方 TTS 服务了。正在想把 ASR 这块接进自己的会议记录工具里，如果真能稳定跑，一个会下来自动生成带发言人标注的纪要，那效率真的拉满了。开源地址： https://t.co/XKcsCWGmcE #AI #AIAgent

uniswap12's tweet photo. 微软开源了一个语音 AI，60 分钟长音频一次转写，4 个人同时说话都能搞定

VibeVoice，微软开源，24.8k star，今天才知道这个。录音一键转文字这件事，我之前一直用 Whisper，但它处理长会议录音经常超时，多人说话识别错得也挺厉害的。

VibeVoice 这个直接支持 60 分钟连续音频，自带说话人分离和时间戳，四个人同时说话的场景也能分清楚谁说了什么。最让我意外的是 TTS 这边，4 个角色同时合成，90 分钟连贯输出，声音全程不跑偏。

想做有声书或者播客内容的同学应该会很感兴趣，以前多角色合成经常前后声音不一致，这个解决了。底层是 Qwen2.5 加了专门的连续语音 tokenizer，还有个 0.5B 的轻量版本，300ms 延迟，可以直接接进对话 AI 做实时语音交互，不用再单独接第三方 TTS 服务了。

正在想把 ASR 这块接进自己的会议记录工具里，如果真能稳定跑，一个会下来自动生成带发言人标注的纪要，那效率真的拉满了。

开源地址： https://t.co/XKcsCWGmcE

#AI #AIAgent

362

123K

Breno Thales @brenothales

11 days ago

I use Chrome DevTools! #DevBadges https://t.co/YMSvEMVBWb

Who to follow

Glaucio Melo

@glauciovillamor

•Cristão • Marido e Pai • Sócio & Head de Business Ops @ajudapayoficial Negócios, operações e performance • Conta Digital | Pix | API • Embedded finance

eu real

@CelsoRocha

Temas de interesse tecnologia, fotografia e design!!!

brenothales retweeted

Akshay 🚀

@akshay_pachaar

23 days ago

RAG vs. CAG, clearly explained! RAG is great, but it has a major problem: Every query hits the vector DB. Even for static information that hasn't changed in months. This is expensive, slow, and unnecessary. Cache-Augmented Generation (CAG) addresses this issue by enabling the model to "remember" static information directly in its key-value (KV) memory. In fact, you can combine RAG and CAG for the best of both worlds. Here's how it works: RAG + CAG splits your knowledge into two layers: ↳ Static data (policies, documentation) gets cached once in the model's KV memory ↳ Dynamic data (recent updates, live documents) gets fetched via retrieval This gives faster inference, lower costs, and less redundancy. The trick is being selective about what you cache. Only cache static, high-value knowledge that rarely changes. If you cache everything, you'll hit context limits. Separating "cold" (cacheable) and "hot" (retrievable) data keeps this system reliable. You can start today. OpenAI and Anthropic already support prompt caching in their APIs. I have shared my recent article on prompt caching below if you want to dive deeper. Have you tried CAG in production yet? Below, I have quoted an article that I wrote on prompt cashing and how Claude Code achieves a 92% cache hit-rate. Give it a read.

300

277K

brenothales retweeted

CyrilXBT

@cyrilXBT

21 days ago

ANTHROPIC JUST SOLVED THE PROBLEM THAT HAS BEEN HOLDING BACK EVERY CLAUDE CODE USER WHO IS NOT A SENIOR ENGINEER. One free plugin. It scans your project. It configures everything. Done. Here is the problem most people do not talk about with Claude Code. The tool is extraordinarily powerful when configured correctly. The configuration is extraordinarily painful to get right. Which MCP servers does your project actually need. Which hooks should fire for your specific workflow. Which subagents are relevant for your tech stack. Which slash commands would save you the most time. Answering those four questions correctly requires reading your entire project structure, understanding your dependencies, analyzing your code patterns, and cross-referencing all of that with what each available server and subagent actually does. Most developers skip this entirely. They install Claude Code with default settings. They get default results. They conclude Claude Code is good but not transformative. They are wrong about why. The official Anthropic plugin removes the manual configuration layer entirely. Point it at your project. It reads your structure. Your dependencies. Your code patterns. It installs the right MCP servers for your specific stack. It configures the right hooks for your specific workflow. It sets up the right subagents for your specific use cases. It creates the right slash commands for your specific repeating tasks. One scan. Perfect configuration. Free. The developer who has been running Claude Code at 40% capacity because setup was too painful just had every configuration problem solved automatically. The developer starting a new project today skips the entire painful configuration phase entirely. 100% free. Official Anthropic release. Install it before you open your next project. Bookmark this. Follow @cyrilXBT for every Claude Code release that changes what one person can build.

297

369

19K

brenothales retweeted

Georgi Gerganov

@ggerganov

24 days ago

llama.cpp adds MTP for the Qwen3.6 family This is a significant milestone for the local AI ecosystem. The performance jump with these changes is massive and elevates local inference on commodity hardware further. Special thanks to Aman Gupta for leading this development! https://t.co/vjaMwEpIaR

182

534

273K

brenothales retweeted

Anatoli Kopadze

@AnatoliKopadze

24 days ago

the engineer who built Claude Code just dropped a 28-minute video on how to write prompts that actually work I've seen $300 courses that don't cover what he shows in the first 10 minutes CLAUDE.md files, memory shortcuts, parallel sessions, prompting patterns all in one video and completely free works whether you're a developer, a beginner, or someone who's been using Claude for months based on this, I put together 18 things you can copy and use in Claude today full guide in the article below

224

24K

61K

brenothales retweeted

Guillermo Casaus

@_guillecasaus

28 days ago

🚨 Google acaba de liberar sus skills oficiales para agentes de IA. Ha publicado 13 skills compatibles con Claude Code, Cursor, Copilot y otros agentes. Permiten que los agentes puedan ejecutar tareas avanzadas y automatizar flujos de trabajo complejos. Es gratis y open-source 👇

_guillecasaus's tweet photo. 🚨 Google acaba de liberar sus skills oficiales para agentes de IA.

Ha publicado 13 skills compatibles con Claude Code, Cursor, Copilot y otros agentes.

Permiten que los agentes puedan ejecutar tareas avanzadas y automatizar flujos de trabajo complejos.

Es gratis y open-source 👇

403

361K

brenothales retweeted

Spring I/O @spring_io

about 1 month ago

🍃 Claude Code for Spring Developers by @tschuehly @ Spring I/O 2026 ▶️ Video: https://t.co/7SrenNN0yJ 🔗 Slides: https://t.co/RiJSjZD6J7 #springio26

103

brenothales retweeted

Felpz Crypto

@FelpsCrypto

about 1 month ago

A NVIDIA ESTÁ LITERALMENTE OFERECENDO INFERÊNCIA DE IA GRATUITAMENTE Eu configurei em 5 minutos e não conseguia acreditar que era grátis. DeepSeek, MiniMax, Kimi, GLM, Llama - tudo na Nuvem DGX da NVIDIA via API limpa e compatível com OpenAI. Configuração em 5 min: → https://t.co/IQLvQgncp5 → obtenha a chave da API → base_url = https://t.co/9UW6eZAJbz → insira em qualquer SDK OpenAI Temos usado. Sim, fica mais lento sob carga pesada. Sim, o plano gratuito tem limitações. Mas para desenvolvedores solo, programadores independentes e estudantes de engenharia de IA? Este é o melhor ambiente de testes gratuito que existe atualmente. Pare de pagar US$ 20/mês para experimentar. Use isso primeiro.

648

688

61K

brenothales retweeted

Tom Dörr

@tom_doerr

about 2 months ago

AI-driven task management system https://t.co/jpLANL3dmS

105

132

Breno Thales @brenothales

about 2 months ago

brenothales retweeted

Google Cloud Tech

@GoogleCloudTech

about 2 months ago

Our official Agent Skills repository on @github is here! Skills are a simple, open format for giving agents new capabilities and expertise. Think of a skill as compact, agent-first documentation for a specific tech or task. Learn more → https://t.co/7w887vz3lE #GoogleCloudNext

GoogleCloudTech's tweet photo. Our official Agent Skills repository on @github is here!

Skills are a simple, open format for giving agents new capabilities and expertise. Think of a skill as compact, agent-first documentation for a specific tech or task.

Learn more → https://t.co/7w887vz3lE #GoogleCloudNext https://t.co/ltPAgCSUaU

748

456K

brenothales retweeted

Hasan Toor

@hasantoxr

about 2 months ago

Goodbye Claude Code subscription fees. Someone just built a proxy that runs Claude Code completely free... and it's wild. You literally plug in a free NVIDIA API key and point Claude Code at localhost. That's it. It handles everything: - Converts Anthropic API calls to NVIDIA NIM format - Unlocks 40 requests/min for free - Supports Kimi K2, GLM 4.7, MiniMax M2, Devstral and more - Streams thinking tokens and tool calls live - Even includes a Telegram bot so you can run Claude Code from your phone No API bill. No rate limit panic. No vendor lock-in. Honestly, this goes beyond router tools like OpenRouter. It doesn't just swap the model... it turns Claude Code into a free agent you can control remotely. The project is open-source on GitHub. It's called free-claude-code.

hasantoxr's tweet photo. Goodbye Claude Code subscription fees.

Someone just built a proxy that runs Claude Code completely free... and it's wild.

You literally plug in a free NVIDIA API key and point Claude Code at localhost.

That's it.

It handles everything:
- Converts Anthropic API calls to NVIDIA NIM format
- Unlocks 40 requests/min for free
- Supports Kimi K2, GLM 4.7, MiniMax M2, Devstral and more
- Streams thinking tokens and tool calls live
- Even includes a Telegram bot so you can run Claude Code from your phone

No API bill. No rate limit panic. No vendor lock-in.

Honestly, this goes beyond router tools like OpenRouter.

It doesn't just swap the model... it turns Claude Code into a free agent you can control remotely.

The project is open-source on GitHub.

It's called free-claude-code.

194

935

10K

646K

brenothales retweeted

Manisha Mishra

@manishamishra24

about 2 months ago

Learn AI for free directly from top companies. 1 - Anthropic: https://t.co/dNqNcxEzUr 2 - Google: https://t.co/CjJlWf9uPB 3 - Meta: https://t.co/o5Em5HiIMu 4 - NVIDIA: https://t.co/zxswNmOw6r 5 - Microsoft: https://t.co/O3bCUzm2Xv 6 - OpenAI: https://t.co/qqxdElNm5j 7 - IBM: https://t.co/SlOeRvKq5P 8 - AWS: https://t.co/N3TGaGM8Cc 9 - https://t.co/2ZGq3Bn54S: https://t.co/MwtW2w4pgu 10 - Hugging Face: https://t.co/ZC3OLTh4Mp 👇Comment "Learning" if you find this helpful. Repost so others can take help. Must bookmark for future reference.

manishamishra24's tweet photo. Learn AI for free directly from top companies.

1 - Anthropic:
https://t.co/dNqNcxEzUr

2 - Google:
https://t.co/CjJlWf9uPB

3 - Meta:
https://t.co/o5Em5HiIMu

4 - NVIDIA:
https://t.co/zxswNmOw6r

5 - Microsoft:
https://t.co/O3bCUzm2Xv

6 - OpenAI:
https://t.co/qqxdElNm5j

7 - IBM:
https://t.co/SlOeRvKq5P

8 - AWS:
https://t.co/N3TGaGM8Cc

9 - https://t.co/2ZGq3Bn54S:
https://t.co/MwtW2w4pgu

10 - Hugging Face:
https://t.co/ZC3OLTh4Mp

👇Comment "Learning" if you find this helpful.

Repost so others can take help.

Must bookmark for future reference.

371K

brenothales retweeted

Matt Dancho (Business Science)

@mdancho84

2 months ago

RIP document extractors. Google just released LangExtract: Open-source. Free. Better than $100K enterprise tools. Here’s what it does: 🧵

mdancho84's tweet photo. RIP document extractors.

Google just released LangExtract: Open-source. Free. Better than $100K enterprise tools.

Here’s what it does: 🧵 https://t.co/KiJC5y99fB

283

185K

brenothales retweeted

Tech with Mak

@techNmak

2 months ago

Someone removed the vector database from RAG and got better results. Much better. Here's what traditional RAG actually does under the hood: it chunks your document into pieces, embeds those pieces into vectors, and retrieves based on semantic similarity. The assumption is that similar text = relevant text. That assumption breaks completely for professional documents. When you ask "what were the debt trends in Q3?", vector search returns chunks that look similar to that question. But the actual answer might be buried in an appendix, referenced across three sections, in a part of the document that shares zero semantic overlap with your query. Traditional RAG never finds it. Similarity ≠ relevance. PageIndex was built around that insight. Inspired by AlphaGo, it builds a hierarchical tree index from your document - an intelligent table of contents optimized for LLM reasoning. Then it navigates that tree the way a human expert would. Not pattern matching. Reasoning. "Debt trends are usually in the financial summary or Appendix G, let's look there." What disappears: → No vector DB to build or maintain → No arbitrary chunking that breaks cross-section context → No opaque retrieval you can't explain or trace What you get: → Retrieval traceable to exact page and section references → Multi-step reasoning across document structure → Works on financial reports, legal filings, regulatory documents The benchmark: → PageIndex: 98.7% on FinanceBench → Perplexity: 45% → GPT-4o: 31% Open source.