Excited to release Opus 4.8 today! We heard your feedback on 4.7 and have made many fixes for 4.8.
4.8 understands nuances better, feels much more natural to talk to, and is overall a stronger collaborator on everything from coding to knowledge work.
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
Available today at the same price.
Still manually wiring agents through terminal commands?
Clawnetes delivers a native desktop wizard that installs, configures, and launches AI agents in two minutes.
- React 18 + Vite frontend
- Rust + Tauri v2 backend for system calls, file writing, shell execution, SSH tunneling
- Native on macOS, Windows, remote Linux via SSH
Bypass the terminal entirely for a seamless desktop experience.
Tarballs still cross the network on every fresh clone.
A content-addressed cache with hard links reuses the exact bytes from prior installs.
Only new versions ever download.
Your monorepo bootstrap now completes before the coffee finishes brewing.
This is how dependency management stops being the bottleneck in every new environment.
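A rough sketch of the idea, not any particular package manager's implementation: each file is stored once under its SHA-256 digest, then hard-linked into every project that needs it. The function names (`store_blob`, `link_into`, `demo`) and the directory layout are assumptions made for illustration, and hard links require the store and the projects to sit on the same filesystem.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_blob(store: Path, data: bytes) -> Path:
    """Write data into the store under its SHA-256 digest, only if missing."""
    digest = hashlib.sha256(data).hexdigest()
    blob = store / digest[:2] / digest
    if not blob.exists():
        blob.parent.mkdir(parents=True, exist_ok=True)
        tmp = blob.parent / (blob.name + ".tmp")
        tmp.write_bytes(data)
        os.replace(tmp, blob)  # atomic rename so readers never see a partial file
    return blob


def link_into(blob: Path, dest: Path) -> None:
    """Hard-link a stored blob into a project tree instead of copying it."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    if dest.exists():
        dest.unlink()
    os.link(blob, dest)


def demo():
    """Install the same file into two hypothetical projects from one blob."""
    with tempfile.TemporaryDirectory() as root:
        root_path = Path(root)
        blob = store_blob(root_path / "store", b"module.exports = 42;\n")
        a = root_path / "proj_a" / "node_modules" / "pkg" / "index.js"
        b = root_path / "proj_b" / "node_modules" / "pkg" / "index.js"
        link_into(blob, a)
        link_into(blob, b)
        # Both paths point at the same inode; the bytes exist once on disk.
        return os.path.samefile(a, b), os.stat(blob).st_nlink
```

Calling `demo()` installs one file into two project trees; both installs share a single copy on disk, so a second install costs a directory entry rather than a download or a copy.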
Perplexity released Bumblebee as open source. The tool collects a read-only inventory of packages, browser extensions, and other developer tooling on macOS and Linux systems. Its main purpose is fast supply-chain risk assessment for security teams. Instead of broad scans or agent-based approaches, Bumblebee focuses on pulling metadata directly from endpoints without modifying anything. This makes it lightweight and suitable for regular checks across developer machines. The project is fresh, with the first public release landing just today. Code, docs, and build setup are all available on GitHub.
Still managing keys and permissions across local Ollama and cloud providers?
Clawnetes installs OpenClaw and Hermes Agent with sandboxing ready.
- Sandbox modes and tool policies with granular controls
- Local models from Ollama or LM Studio stay private in isolated runs
- Cloud support for xAI, Anthropic, OpenAI, Google, OpenRouter
- API keys, OAuth 2.0, profile auth, or device tokens
Grok agents in Hermes get the same protections without manual setup.
You can now use your @grok or X Premium subscription in @opencode.
Use the model powering Grok Build for high speed and codebase intelligence.
https://t.co/8D2F9jYoIQ
Still managing separate terminals and contexts for every coding task?
Hermes spawns subagents with isolated sessions and full tool access.
- Each child gets its own working directory and state
- Orchestrator stays under max_spawn_depth of 1
- Leaf agents cannot delegate further, keeping things contained
- Results return as clean summaries without context pollution
One delegation call now handles parallel research, reviews, and builds.
This is how teams move from prompt hacking to structured workflows.
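To make the depth limit concrete, here is a minimal sketch of depth-limited delegation. It is not Hermes' actual API: the `Agent` class, `delegate` function, and `MAX_SPAWN_DEPTH` constant are hypothetical names, and it assumes a limit of 1 means the orchestrator may spawn children but those children may not spawn further.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from pathlib import Path

MAX_SPAWN_DEPTH = 1  # assumption: mirrors the limit named in the post


@dataclass
class Agent:
    name: str
    workdir: Path
    depth: int = 0
    state: dict = field(default_factory=dict)  # private to this agent

    def spawn(self, name: str) -> "Agent":
        """Create a child with its own directory and state, if depth allows."""
        if self.depth >= MAX_SPAWN_DEPTH:
            raise PermissionError(f"{self.name} is a leaf and cannot delegate")
        child_dir = self.workdir / name
        child_dir.mkdir(parents=True, exist_ok=True)
        return Agent(name=name, workdir=child_dir, depth=self.depth + 1)

    def run(self, task: str) -> str:
        """Do the work locally and hand back only a short summary."""
        self.state["transcript"] = f"verbose working notes for {task}"
        (self.workdir / "notes.txt").write_text(self.state["transcript"])
        return f"{self.name}: finished {task}"


def delegate(orchestrator: Agent, tasks: list) -> list:
    """Run one child per task in parallel and collect summaries in order."""
    children = [orchestrator.spawn(f"child-{i}") for i in range(len(tasks))]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(Agent.run, children, tasks))
```

The orchestrator only ever sees the returned summary strings; each child's transcript stays in its own `state` and working directory, which is the "no context pollution" property the post describes.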
Your LLM server still allocates a contiguous KV cache for every sequence.
PagedAttention splits the cache into fixed-size blocks that can be allocated on demand.
Different requests with wildly different lengths now share the same GPU memory pool.
You can fit up to 4x more concurrent users before hitting the memory limit, with the actual gain depending on how varied your request lengths are.
This is how you stop over-provisioning GPUs for the worst-case context.
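The bookkeeping behind this can be sketched in a few lines. This is a toy allocator for illustration only, not vLLM's implementation: `BlockAllocator` and its methods are hypothetical names, and real systems store actual key/value tensors in the blocks rather than just tracking ids.

```python
class BlockAllocator:
    """Toy block table: maps each sequence to non-contiguous physical blocks."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # shared pool of physical block ids
        self.tables = {}   # seq_id -> list of physical block ids
        self.lengths = {}  # seq_id -> number of tokens stored so far

    def append_token(self, seq_id):
        """Reserve a slot for one more token, allocating a block on demand."""
        n = self.lengths.get(seq_id, 0)
        table = self.tables.setdefault(seq_id, [])
        if n % self.block_size == 0:  # no block yet, or the last one is full
            if not self.free:
                raise MemoryError("KV cache pool exhausted")
            table.append(self.free.pop())
        self.lengths[seq_id] = n + 1
        # (physical block id, offset inside that block)
        return table[n // self.block_size], n % self.block_size

    def release(self, seq_id):
        """Return all of a finished sequence's blocks to the shared pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)
```

A sequence only holds as many blocks as its current length requires, so a short request wastes at most one partially filled block instead of a worst-case contiguous reservation.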