[BOOKMARK THIS] WholeBodyVLA: the humanoid VLA repo mapping where whole-body robot control is going
This is less "clone and run today" and more "study the blueprint before everyone else copies it."
· Vision-Language-Action control for humanoids
→ images + language instructions become robot movement commands
· Built for loco-manipulation, not tabletop demos
→ walking across space while using both arms
· Learns latent actions from egocentric videos without action labels
→ a path toward training from human-like first-person footage
· Decodes into dual-arm joint actions + locomotion commands
· Uses an RL policy aimed at stable whole-body coordination
→ balance, walking, reaching, and disturbance handling matter together
· Includes demos like fetching coffee, moving chairs, and long-range tasks
· Also acts as a curated reading list for humanoid VLA research
Why it matters: Humanoid robotics is moving from "stand here and pick this up" toward robots that can move through real spaces and manipulate objects on the way. WholeBodyVLA is a strong signal for indie robotics builders, AI engineers, and researchers watching video learning, language control, and full-body policies merge.
Follow @repocatai_git for more AI / Agent / Robotics drops
[BUILDER'S DREAM] Browser Harness lets an AI plug into your real Chrome and teach itself the missing moves
This is the browser-agent idea stripped down to the part builders actually want:
one thin harness, one Chrome websocket, and an agent that can edit its own tools mid-task.
· Direct CDP connection to Chrome
→ the agent talks to your actual browser, not a boxed-in automation layer
· Editable agent workspace
→ when a helper is missing, the agent writes it during execution
· Self-improving runs
→ site quirks, selectors, and flows can turn into reusable skills
· Domain skills for real websites
→ GitHub, LinkedIn, Amazon-style playbooks instead of rediscovering every click
· Tiny core architecture
→ roughly 1k lines across 4 core files, so you can actually understand it
· Works with local Chrome or cloud browsers
→ useful for stealth, headless jobs, proxies, captcha handling, and parallel runs
· Built for messy tasks like uploads, outreach, ordering, expenses, and admin work
Why it matters: Most browser agents feel impressive until they hit one weird popup, file picker, login flow, or custom UI. Browser Harness takes the opposite bet: give the agent direct browser access and a place to write the missing glue as it learns.
For indie devs and AI engineers, this is the kind of repo that can turn "AI can browse" from a demo into repeatable automation you can inspect, patch, and share.
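The "one Chrome websocket" idea is easier to picture with the wire format in hand. Chrome DevTools Protocol commands are JSON objects carrying an `id`, a `method`, and `params`, and replies echo the `id`. The sketch below only builds and matches those messages; it opens no socket. `CdpSession` and its methods are hypothetical names for illustration, not Browser Harness code, and the reply is made up. `Page.navigate` and `Runtime.evaluate` are standard CDP methods.

```python
import itertools
import json


class CdpSession:
    """Hypothetical helper: builds CDP commands and pairs replies by id."""

    def __init__(self):
        self._ids = itertools.count(1)   # each command needs a unique id
        self.pending = {}                # id -> method name awaiting a reply

    def command(self, method, **params):
        """Return the JSON text that would be sent over the websocket."""
        msg_id = next(self._ids)
        self.pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params})

    def resolve(self, raw_reply):
        """Match a reply to its command; messages without a known id return None."""
        reply = json.loads(raw_reply)
        method = self.pending.pop(reply.get("id"), None)
        if method is None:
            return None
        return method, reply.get("result", {})


session = CdpSession()
nav = session.command("Page.navigate", url="https://example.com")
js = session.command("Runtime.evaluate", expression="document.title")

# Made-up reply, shaped like a response to command id 2.
fake_reply = json.dumps(
    {"id": 2, "result": {"result": {"type": "string", "value": "Example Domain"}}}
)
print(session.resolve(fake_reply))
```

Because the protocol is this small, a thin harness can stay readable: the agent's "missing helper" is often just one more `command(...)` call wrapped in a function.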
[INSANE] Frigate turns cheap IP cameras into a local AI security system you actually control
This is what a home NVR looks like when it is built by people who care about latency, privacy, and not melting your CPU.
· Local object detection for IP cameras
→ people, cars, pets, packages, and motion events without cloud video uploads
· Built for Home Assistant from day one
· Uses motion first, then AI only where it matters
→ less wasted compute, more real-time responsiveness
· Works with GPUs and AI accelerators
→ Coral / detector-style hardware can crush CPU-only setups
· MQTT events for automations
→ turn detections into lights, alerts, scripts, dashboards, whatever
· Object-aware recording retention
→ keep the clips that matter instead of hoarding endless footage
· 24/7 recording plus low-latency live view
· Built-in masks, zones, review flows, and multi-camera scrubbing
Why it matters: If you have cameras around a home, workshop, lab, driveway, or robot test space, Frigate gives you the missing brain. You get searchable, local, AI-assisted video infrastructure without paying a cloud vendor to watch your footage.
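To make the "MQTT events for automations" bullet concrete, here is a minimal sketch of the decision logic an automation might run on each event message. It assumes the JSON shape Frigate publishes on its events topic as I understand it (`type`, plus an `after` object with `id`, `label`, `top_score`, and `entered_zones`); check those field names against the docs for your version. The labels, threshold, zone name, and sample payload are all made up, and no MQTT client is involved.

```python
import json

WATCHED_LABELS = {"person", "package"}   # assumed labels worth alerting on
MIN_SCORE = 0.7                          # assumed confidence threshold


def should_alert(payload, zone, seen_ids):
    """Return True once per event for a confident watched label in `zone`."""
    event = json.loads(payload)
    after = event.get("after") or {}
    event_id = after.get("id")
    # Ignore closing messages and events we already alerted on.
    if event.get("type") == "end" or event_id in seen_ids:
        return False
    hit = (
        after.get("label") in WATCHED_LABELS
        and (after.get("top_score") or 0.0) >= MIN_SCORE
        and zone in (after.get("entered_zones") or [])
    )
    if hit:
        seen_ids.add(event_id)
    return hit


# Made-up payload for illustration only.
sample = json.dumps({
    "type": "update",
    "after": {"id": "abc123", "camera": "driveway", "label": "person",
              "top_score": 0.82, "entered_zones": ["porch"]},
})
seen = set()
print(should_alert(sample, "porch", seen))   # first sighting of this event
print(should_alert(sample, "porch", seen))   # same event again
```

In a real setup this function would sit inside an MQTT subscriber callback, with the `True` branch switching a light or sending a notification.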
[DEEP DIVE] build-your-own-x: a giant map for recreating real tech from scratch
This is the repo you open when tutorials feel too shallow and docs feel too abstract.
· Build your own LLM, diffusion model, or RAG search system
→ learn the moving parts instead of just calling an API
· Recreate Git, Docker, a database, a shell, or a web server
· Dig into 3D renderers, ray tracers, physics engines, and voxel engines
· Follow language-specific guides in Python, Go, C++, JavaScript, Rust, Java, and more
· Learn by building the thing people usually treat as magic
· Pick projects by curiosity level, not course syllabus
· Great "weekend rabbit hole" material for engineers who want real intuition
Why it matters: Open-source AI is moving fast, but the builders who really win are the ones who understand the layers underneath. This repo gives you a practical path from "I use these tools" to "I can build a tiny version myself."
[BOOKMARK THIS] Anthropic's Knowledge Work Plugins: 11 open-source role packs for Claude
This is basically a starter kit for turning Claude from "general assistant" into "knows how this team works."
· 11 ready-made plugins for sales, support, PM, marketing, legal, finance, data, bio research, and more
· Each plugin bundles skills, slash commands, connectors, and sub-agents
→ Claude gets role-specific workflows instead of generic advice
· File-based setup: markdown + JSON only
→ no app server, no build step, no infra project hiding inside
· Connects to real work tools like Slack, Notion, Jira, Linear, HubSpot, Figma, BigQuery, Snowflake
· Slash commands for repeatable work
→ think /sales:call-prep, /data:write-query, /finance:reconciliation
· Easy to fork for your company's language, process, and tool stack
· Includes a plugin-management plugin for creating or customizing more plugins
Why it matters: Most "AI at work" fails because the model does not know the workflow, the tools, or the house style. This repo gives builders a clean pattern for packaging that context into something Claude can actually reuse across tasks, teams, and roles.
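The slash commands above follow a visible `/<plugin>:<command>` naming pattern. As a rough illustration of that namespacing, the sketch below splits such a string into its parts. This is not how Claude resolves commands internally; `parse_slash` and the regular expression are hypothetical, and the trailing free-text argument is an assumption.

```python
import re

# Assumed shape: "/<plugin>:<command>" followed by optional free-text arguments.
SLASH = re.compile(
    r"^/(?P<plugin>[a-z0-9-]+):(?P<command>[a-z0-9-]+)(?:\s+(?P<args>.*))?$"
)


def parse_slash(text):
    """Return (plugin, command, args), or None if text is not a slash command."""
    match = SLASH.match(text.strip())
    if match is None:
        return None
    return match.group("plugin"), match.group("command"), match.group("args") or ""


print(parse_slash("/sales:call-prep Acme renewal"))
print(parse_slash("/finance:reconciliation"))
print(parse_slash("just a normal message"))
```

The plugin prefix is what lets a sales pack and a data pack ship commands side by side without name clashes.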
[BUILDER'S DREAM] cmux: a native macOS terminal built for running AI coding agents in parallel
If your desktop is slowly turning into 12 Claude/Codex tabs and mystery notifications, this one is worth a look.
· Ghostty-based terminal, but with vertical tabs made for agent work
· Blue notification rings show exactly which pane needs you
→ no more hunting through tabs that all say "waiting for input"
· Sidebar shows branch, PR status, working directory, ports, and latest agent message
· Built-in browser panes sit beside your terminal
→ test localhost, inspect UI, and keep browser state inside the workspace
· SSH workspaces make remote machines feel local
→ browser panes route through the remote network, so localhost still works
· Claude Code Teams launches with one command
→ teammates spawn as native splits with metadata and notifications
· Reads your existing Ghostty config for fonts, themes, and colors
· Native Swift/AppKit app, not Electron
→ fast startup, lower memory, GPU-accelerated terminal rendering
Why it matters: cmux is for devs who already live in the terminal but want agent orchestration without being forced into a heavy GUI workflow. It turns "a pile of terminal panes" into a readable command center for AI coding sessions, local apps, PRs, SSH boxes, and browsers.
[NEW DROP] Google Antigravity SDK: Python agents with Gemini, state, tools, and streaming built in
This is the kind of repo agent builders should open before writing another custom loop from scratch.
· One async Agent class handles lifecycle, runtime discovery, tools, hooks, and policy defaults
· Starts read-only by default
→ safer for codebase agents until you explicitly enable write capabilities
· Real-time streaming responses for CLIs, dashboards, and chat UIs
· Advanced streams for thoughts and tool calls
→ show "thinking" bubbles or live "executing tool" states in your app
· Stateful Conversation API with history, turn count, last response, and step introspection
· Multimodal inputs: images, video, audio, PDFs, docs, and in-memory bytes
· Local runtime powered by a compiled binary shipped through PyPI
→ install with pip, don't just clone the repo
· Interactive loop utility included for building terminal agents fast
Why it matters: Most agent prototypes get messy around the same places: state, tool execution, streaming UX, safety defaults, and runtime plumbing. Antigravity SDK wraps those pieces so you can spend more time on the actual behavior of the agent: code navigation, spec review, multimodal analysis, internal tools, or robotics-adjacent command loops.
[TESTED] magic-trace: see every function call before a crash or slowdown, at ~40ns resolution
This is the debugger you want when "it was slow once in prod" is all you have.
· Attach to a running Linux process like perf
· Captures every function call, not just sampled stack snapshots
→ you can inspect the tiny calls normal profilers miss
· Uses Intel Processor Trace under the hood
→ records actual control flow into a ring buffer
· Trigger traces on a specific function call
· Grab the last few milliseconds before a crash
→ far more useful than only seeing the final stacktrace
· Interactive timeline viewer for zooming through call stacks
· Low overhead: roughly 2%-10% in documented use
· No app changes required to start tracing
Why it matters: If you build latency-sensitive systems, agents, robotics software, infra, or anything with mysterious runtime behavior, magic-trace gives you the "what actually happened?" view. It is especially good for weird one-off slow paths where logs are too vague and traditional profilers blur over the interesting part.
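The ring-buffer-plus-trigger idea is what makes "grab the last few milliseconds" possible: keep overwriting a fixed-size window, then freeze it when the interesting call shows up. Here is a software analogy only; magic-trace does this with hardware tracing, not Python. `TraceRing`, the function names, and the timestamps are all invented for illustration.

```python
from collections import deque


class TraceRing:
    """Hypothetical fixed-size trace buffer: old events fall off the front."""

    def __init__(self, capacity):
        self.events = deque(maxlen=capacity)

    def record(self, timestamp_ns, function):
        self.events.append((timestamp_ns, function))

    def snapshot(self):
        """Freeze the current window, oldest event first."""
        return list(self.events)


ring = TraceRing(capacity=4)
captured = None
calls = ["parse", "lookup", "alloc", "lock", "write", "flush"]
for step, name in enumerate(calls):
    ring.record(step * 40, name)      # pretend each call is 40 ns apart
    if name == "flush":               # the trigger function
        captured = ring.snapshot()

print(captured)
```

Only the four most recent calls survive in the snapshot, which is the trade: bounded memory and low overhead in exchange for a short, precise window right before the trigger.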
[INSANE] LongLive 2.0: NVIDIA's open infra for long video generation at 45.7 FPS
Long video AI is usually where demos break: short clips, slow sampling, memory pain.
LongLive 2.0 is the repo for builders who want to push past that.
· NVFP4 video generation: 4-bit weights + activations for faster runs
· 45.7 FPS inference reported for long video generation
· NVFP4 KV cache: keeps attention memory lighter during generation
· Sequence-parallel training and inference
→ split long video workloads across GPUs instead of choking one card
· Multi-shot video support for autoregressive training
· T2V and I2V workflows
→ generate from text, or condition from images/video-style inputs
· DMD few-step distillation
→ fewer denoising steps, faster generation loops
· Async decoding + streaming VAE placement
→ built for throughput, not just paper demos
Why it matters: LongLive is not just "another video model repo." It is infrastructure for people trying to make long, controllable, interactive video systems actually run: AI filmmakers, research engineers, robotics sim builders, and indie hackers experimenting with world-generation loops.
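For a feel of what "4-bit weights + activations" means, here is a simplified block-scaled FP4 sketch in plain Python. It is not LongLive's code and not a faithful NVFP4 implementation: as I understand the format, real NVFP4 also stores the per-block scale in 8-bit float and adds a tensor-level scale, both skipped here. The block size of 16 and all function names are assumptions for illustration.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (sign, 2 exponent bits, 1 mantissa bit).
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16  # assumed number of values sharing one scale


def quantize_block(values):
    """Return (scale, codes); each code is a signed value from FP4_GRID."""
    peak = max(abs(v) for v in values)
    scale = peak / FP4_GRID[-1] if peak > 0 else 1.0   # map the peak onto 6.0
    codes = []
    for v in values:
        target = abs(v) / scale
        nearest = min(FP4_GRID, key=lambda g: abs(g - target))
        codes.append(math.copysign(nearest, v))
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from a scale and its FP4 codes."""
    return [c * scale for c in codes]


def quantize(values, block=BLOCK):
    """Split a flat list into blocks and quantize each independently."""
    return [quantize_block(values[i:i + block]) for i in range(0, len(values), block)]


scale, codes = quantize_block([3.0, -12.0, 0.4, 1.1])
print(scale, codes)
print(dequantize_block(scale, codes))
```

Small values next to a large one get rounded hard (0.4 collapses to zero here), which is why the scale is shared over short blocks rather than a whole tensor.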
[BUILDER'S DREAM] Anthropic Cybersecurity Skills: 754 ready-made security playbooks for AI agents
Most agents can write code. Far fewer know what to do when you hand them a memory dump, cloud breach, or suspicious traffic log.
· 754 structured cybersecurity skills your agent can actually follow
· Covers 26 domains: DFIR, threat hunting, malware, cloud, IAM, network traffic, AI risk
· Mapped to MITRE ATT&CK, NIST CSF, ATLAS, D3FEND, and NIST AI RMF
→ one skill can double as investigation guidance and compliance context
· Works with Claude Code, Codex CLI, Cursor, Copilot, Gemini CLI, and https://t.co/JaZKScuhJh tools
· Gives agents senior-analyst style procedure instead of vague "look for indicators" advice
· Useful for building security copilots, SOC agents, pentest assistants, and training labs
· Install with npx or clone it, then point your agent at the skill library
Why it matters: Security work is full of "you only know this if you've done it before" judgment calls. This repo turns a huge chunk of that field knowledge into portable agent instructions, so indie builders can prototype credible cyber agents without writing every workflow from scratch.