Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. https://t.co/OVVPJO7VQx
Introducing frame.md, a spec built for videos & motion
design.md kept your brand consistent across screens
but when applied to videos, agents translated it back into webpages and decks
frame.md teaches your agents how to make branded video
turn your design.md into frame.md ↓
Welcome to the NVIDIA RTX Spark channel.
A new superchip for the age of personal AI.
Don't worry, your favorite NVIDIA local AI content continues on right here, just with a new headliner.
Let's get started...
30 years.
For 30 years the PC was the same thing: Intel or AMD inside, a GPU on the side, and hope it doesn't crash.
NVIDIA ended that in one keynote.
RTX Spark. Their first chip for personal computers. CPU, GPU, and memory on a single piece of silicon. ARM, 3nm, 1 petaflop of local AI.
In a 14mm laptop.
It ran Forza Horizon 6 and 007 First Light on stage at 100 FPS at 1440p. Unplugged. No throttling. On Windows.
The number that changes everything: it runs 120-billion-parameter AI models with no cloud. No API. No subscription. Your AI agent lives on your machine. On 24 hours a day. Yours alone.
The PC is no longer a screen with a keyboard. It's a personal AI station.
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
Available today at the same price.
@adityarao310 @new_discord_tea I spent the whole weekend trying to get it running on a 24GB RTX 3090 and had no luck. It kills the machine while loading the weights.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
Introducing Zero
The programming language for agents.
I wanted a systems language that was faster, smaller, and easier for agents to use and repair.
Explicit capabilities. JSON diagnostics. Typed safe fixes.
Made for agents on day zero.
Today we're sharing our work on interaction models. A new class of model trained from scratch to handle real-time interaction natively, instead of gluing it onto a turn-based one.
https://t.co/MoS5s4cm60
Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA),
and the first frontier model with a 12-million-token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% of the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention).
Only a small fraction actually matter.
@subquadratic finds and focuses only on the ones that do.
That's nearly 1,000x less compute and a new way for LLMs to scale.
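As a rough illustration of the compute argument above, the sketch below counts query-key pairs for dense attention versus a scheme that scores only a fixed number of keys per query. The function names and the `keys_per_query` budget are assumptions made for illustration; the post does not describe how SSA selects which pairs to keep.

```python
def dense_pairs(n_tokens: int) -> int:
    """Pairs scored by standard attention: every query against every key."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored when each query looks at a fixed budget of keys.

    keys_per_query is a hypothetical parameter, not a value from the post.
    """
    return n_tokens * min(keys_per_query, n_tokens)


def reduction_factor(n_tokens: int, keys_per_query: int) -> float:
    """How many times fewer pairs the sparse scheme scores than dense attention."""
    return dense_pairs(n_tokens) / sparse_pairs(n_tokens, keys_per_query)


if __name__ == "__main__":
    for n in (10_000, 1_000_000, 12_000_000):
        # A budget of 4,096 keys per query is chosen only for illustration.
        print(n, round(reduction_factor(n, 4_096), 1))
```

With a fixed budget, the dense count grows with the square of the context length while the sparse count grows linearly, so the gap widens as the context gets longer.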
Speed up your Gemma 4 workflows by up to 3x with Multi-Token Prediction (MTP) drafters.
Standard LLM inference is fundamentally memory-bandwidth bound, creating a latency bottleneck as billions of parameters travel from VRAM just to generate a single token. We're working to ease this bottleneck with MTP drafters for @googlegemma 4.
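A back-of-the-envelope sketch of the bottleneck described above, under two labeled assumptions: decoding reads every weight from memory once per forward pass, and the drafter is used in a speculative-decoding style where each drafted token is accepted independently with probability `accept_prob`. Neither the function names nor the example numbers come from the post.

```python
def min_ms_per_pass(n_params: float, bytes_per_param: float, bandwidth_gb_s: float) -> float:
    """Lower bound on the time of one forward pass if it is purely bandwidth bound.

    Assumes every parameter is read from memory exactly once per pass.
    """
    bytes_read = n_params * bytes_per_param
    seconds = bytes_read / (bandwidth_gb_s * 1e9)
    return seconds * 1e3


def expected_tokens_per_pass(accept_prob: float, draft_len: int) -> float:
    """Expected tokens emitted per verification pass with draft_len drafted tokens.

    Uses the standard speculative-decoding estimate with independent acceptance:
    (1 - a^(k+1)) / (1 - a). This is an idealized model, not a measured result.
    """
    if accept_prob >= 1.0:
        return float(draft_len + 1)
    return (1.0 - accept_prob ** (draft_len + 1)) / (1.0 - accept_prob)


if __name__ == "__main__":
    # Hypothetical setup: 30e9 parameters, 2 bytes each, 1000 GB/s of bandwidth.
    per_pass = min_ms_per_pass(30e9, 2, 1000)
    tokens = expected_tokens_per_pass(accept_prob=0.8, draft_len=4)
    print(f"{per_pass:.1f} ms per pass, about {tokens:.2f} tokens per pass")
```

The idea this models is that one expensive pass over the weights can confirm several drafted tokens at once, so the memory traffic is amortized across more than one output token. Drafting overhead is ignored here.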
Starting June 1st, GitHub Copilot will move to a usage-based billing model as it supports more agentic and advanced workflows.
In early May, you'll see a preview bill experience, giving visibility into projected costs before the transition.
👉 Read more about the upcoming change: https://t.co/4IC9VNHwhk
Kimi K2.6 can now run on CPU, GPU and SSD setups! 🔥
We shrank the 1T model to 340GB via Dynamic GGUFs, where important layers are upcast.
Run at >40 tok/s on 350GB RAM/VRAM setups.
Run full precision on 610GB.
Guide: https://t.co/nGGHiZhNPG
GGUF: https://t.co/nK6TkXRITf
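The sizes quoted for Kimi K2.6 can be turned into an average bits-per-parameter figure with simple arithmetic. This sketch assumes "1T" means exactly 1e12 parameters and that 1 GB is 1e9 bytes; the helper name is hypothetical, and the result is only an average, since the post says important layers are kept at higher precision than the rest.

```python
def avg_bits_per_param(size_gb: float, n_params: float) -> float:
    """Average storage per parameter in bits, assuming 1 GB = 1e9 bytes."""
    return size_gb * 1e9 * 8 / n_params


if __name__ == "__main__":
    n_params = 1e12  # assumption: "1T" taken as exactly one trillion parameters
    for label, size_gb in (("dynamic GGUF", 340), ("full precision", 610)):
        print(f"{label}: {avg_bits_per_param(size_gb, n_params):.2f} bits per parameter")
```

Under these assumptions the 340GB file averages roughly 2.7 bits per parameter and the 610GB one roughly 4.9.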