Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
Available today at the same price.
Claude for Excel, PowerPoint, and Word are now generally available, and Claude for Outlook is in public beta.
As Claude moves between your Microsoft apps, it carries the full context of your conversation.
Introducing Claude Opus 4.7, our most capable Opus model yet.
It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back.
You can hand off your hardest work with less supervision.
Claude for Word is now in beta.
Draft, edit, and revise documents directly from the sidebar. Claude preserves your formatting, and edits appear as tracked changes.
Available on Team and Enterprise plans.
Claude Cowork is now generally available to all paid plans.
For Enterprise, we are adding role-based access controls, group spend limits, usage analytics, and expanded OpenTelemetry to give admins what they need to deploy it across the org.
We're bringing the Claude in Office ecosystem closer together. Would love to know what you think. I poured my soul into this one and want it to feel great!
Claude for Excel and Claude for PowerPoint now sync together seamlessly.
When you’ve got more than one file open, Claude shares the full context of your conversation between them.
Pull data from spreadsheets, build out tables, and update a deck — without re-explaining a step.
This is Claude Sonnet 4.6: our most capable Sonnet model yet.
It’s a full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design.
It also features a 1M token context window in beta.
Simile is out of stealth!
At Simile, we have built the first AI simulation of society, populated by agents based on real humans.
We are building a foundation model that predicts human behavior in any situation, and a product that deploys it at scale.
Thrilled to be on this mission.
Truly privileged to have worked on this!
We're bringing Claude to millions of knowledge workers - beyond just coding.
If you're heavy in finance (or just an Excel nerd), let us know what you think.
Claude in Excel is now available on Pro plans.
Claude now accepts multiple files via drag and drop, avoids overwriting your existing cells, and handles longer sessions with auto compaction.
Get started: https://t.co/cAMDXM1h7r
Finally, AI designs you can edit.
Moda - create brand assets on a fully editable canvas.
This is not another ChatGPT or nano banana wrapper.
We actually taught an agent layout, typography, and color. We gave it taste.
Use it for slides, social, infographics, e-books …
@arpit_bhayani Pretty good for a high-level description!
Causal masking is a pretty important omission though. I would focus on that to improve this further.
Each token should only be able to attend to prior tokens during autoregression.
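To make the masking point concrete, here is a minimal sketch in plain Python. It is not taken from the description being discussed: `causal_mask` and `masked_softmax` are illustrative names, the uniform scores are toy values, and it assumes the common convention where a token may attend to itself as well as to earlier positions.

```python
import math


def causal_mask(n):
    """Return an n x n grid where [i][j] is True if position i may attend to position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax over scores, with disallowed positions forced to weight 0."""
    out = []
    for row, allowed in zip(scores, mask):
        # Replace masked scores with -inf so exp() yields exactly 0 for them.
        masked = [s if ok else -math.inf for s, ok in zip(row, allowed)]
        peak = max(masked)  # subtract the row max for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Toy attention scores for 4 tokens (all equal, so only the mask shapes the result).
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))
for row in weights:
    print([round(w, 2) for w in row])
```

With equal scores, row i spreads its weight evenly over positions 0 through i and assigns zero to every later position, which is the "no peeking ahead" behavior the mask exists to enforce.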
We spent the first year at @GentraceAI helping customers run traditional LLM evaluations. But we realized late last year that agents changed everything.
Agents don't just generate text - they take actions, make decisions, and interact across multiple steps.
Traditional "hallucination" checks can't evaluate whether your agent booked the right flight or resolved the customer’s support ticket correctly.
Old LLM observability tools were built for simple input/output prompts. They break in an agentic world.
For the last 9 months, we rebuilt Gentrace from first principles 🔥 Here's our new approach:
- Chat with your AI trace data using our agent to discover what's actually breaking
- Describe problems in plain English
- Our agent creates custom AI-powered monitoring columns that catch these failures across all future traces
Example: Instead of checking "factual accuracy," you can now evaluate complex behaviors:
- "Figure out if a user is frustrated with my AI agent in the customer support chat"
- "Detect when my agent takes more than 5 steps to answer simple queries"
We stopped measuring outputs and started measuring outcomes. Watch @dougsafreno break down how to solve agent observability with Gentrace. 👇
Agents are significantly more powerful than standalone LLM calls. But debugging them is a nightmare.
You can trace their reasoning and tool use, but traces get huge and are impossible to parse.
To solve this, we spent the last several months building Gentrace for Agents, which puts our own agents to work on yours.
In Gentrace for Agents, you can:
• Chat with AI to debug agent traces
• Create smart monitoring columns
• Build out tailored evaluations
It’s like a giant AI-powered spreadsheet over your trace data, with a Cursor-style chat sidecar. If it sounds a little meta, it is, but it is very powerful in practice.
We recorded this video to show you how it works. Take a look, and let me know what you think: