I’m incredibly grateful to Anthropic for building Claude Code. What CC has enabled me to create over the last 6 months would have been completely unimaginable to me just 12 months ago.
@MatthewBerman /goal refactor until you are happy with the architecture. ensure you live test after each significant step and autoreview/commit. track progress in /tmp/refactor-{projectname}.md
@Teknium @SemiAnalysis_ This is super wrong… I’ve been running 6 accounts for 12+ months. Right now, the regular skew without promos and resets is around 22:1 for both Codex and CC
This is wrong… So wrong that it hurts. They probably tested Codex during the 2x limit promo, in weeks when the limits were reset twice or so. I’ve been running 6 accounts for over 12 months and the regular skew is around 22:1 for both Claude Code and Codex
Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)
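For a sense of scale, here is a rough back-of-the-envelope sketch of the numbers in this thread. It assumes the 22:1 figure from the replies means "API-priced token value per dollar of subscription", which the posts do not spell out; the helper name is made up for illustration.

```python
def implied_token_value(plan_price_usd: float, ratio: float) -> float:
    """API-priced token value implied by a value-to-price ratio (assumed meaning)."""
    return plan_price_usd * ratio


plan_price = 200.0     # $/month plan, from the post
believed_cap = 2000.0  # $/month of tokens, the "widely believed" ceiling

# Ratio implied by the widely believed ceiling.
believed_ratio = believed_cap / plan_price

# The 22:1 figure quoted in the replies; its interpretation is an assumption.
claimed_ratio = 22.0

print(believed_ratio)                                  # 10.0
print(implied_token_value(plan_price, claimed_ratio))  # 4400.0
```

Under that reading, 22:1 on a $200 plan would be roughly $4,400/month of API-priced tokens, a bit more than double the commonly assumed ceiling.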
Design is full of codewords. Knowing them changes what you can ask for, and what you can get back, whether you're working with devs, or an AI.
“tint this neutral color”, “fix this widow”, “nudge it to the optical center”
I wrote them down: https://t.co/aFyd5avj9o
Lots of people asked how I used Fable to edit its own launch video so I made a video about that!
TLDR: it wrote a lot of code & tool calls to use transcription services and ffmpeg, do color grading, use the Figma MCP, build a Remotion UI, and render it.
I didn't touch a video editor.
We talk a lot about how important it is to set up self-verification loops. Especially in the age of powerful models that can run for long periods of time, self-verification is a key ingredient that enables the model to run for much longer, delivering a result that is closer to what you intended, so you can do more without having to constantly check in on Claude as it works.
@delba_oliveira gives a great breakdown of what that looks like and why it matters
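A minimal sketch of what a self-verification loop can look like in code, assuming the "agent" is just a callable that takes the previous check's feedback and returns a new attempt. All names here (run_with_verification, toy_attempt, toy_verify) are hypothetical and not part of Claude Code.

```python
from typing import Callable, Optional, Tuple


def run_with_verification(
    attempt: Callable[[Optional[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Retry `attempt` until `verify` passes, feeding failures back in."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        result = attempt(feedback)
        ok, feedback = verify(result)
        if ok:
            return result, round_no
    raise RuntimeError(f"not verified after {max_rounds} rounds: {feedback}")


def toy_attempt(feedback: Optional[str]) -> str:
    # Stand-in for a model call: the first try is buggy, the retry is fixed.
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


def toy_verify(source: str) -> Tuple[bool, str]:
    # Stand-in for a real test suite: execute the code and check one case.
    scope: dict = {}
    exec(source, scope)
    ok = scope["add"](2, 3) == 5
    return ok, "" if ok else "add(2, 3) should be 5"


result, rounds = run_with_verification(toy_attempt, toy_verify)
print(rounds)  # 2
```

The point of the sketch is that the check runs inside the loop, so a failed attempt gets corrected without a human having to look at it.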
Excited to share that MagicPath is now available as an official plugin for Codex, in collaboration with OpenAI!
It's incredibly easy to give Codex an infinite multiplayer canvas where it can design, build, and iterate with you.
@trq212 @bcherny
I just submitted /feedback, but I keep getting tons of messages saying I'm violating CC's policies. It happens every single time I run workflows for long-running tasks, when it isn't another issue stalling the session first. It's been happening across multiple projects, running different things, and I'm obviously not doing anything wrong.
I used to run these same specs with /goal with zero issues. Here's what I noticed: when running workflows, if any sub-agent forgets to send the final notice, the session goes stale. So what I started doing was instructing the orchestrator to run /loop during workflows to check progress over time. That's when the policy violation errors started showing up.
And I mean it, I get this error every single time I use workflows with /loop after a few hours. This particular workflow had been running for around 12 hours before I got the policy violation error.
I love the new feature and would really appreciate it if you guys could look into it.
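The stale-session problem described above (a sub-agent that never sends its final notice) can be pictured as a periodic progress check. This is only an illustrative sketch with hypothetical names; it is not how /loop or workflows are implemented.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SubagentStatus:
    last_update: float      # epoch seconds of the last progress message
    finished: bool = False  # True once the final notice has arrived


def find_stale(
    statuses: Dict[str, SubagentStatus],
    timeout_s: float,
    now: Optional[float] = None,
) -> List[str]:
    """Names of unfinished sub-agents that have been silent longer than timeout_s."""
    current = time.time() if now is None else now
    return sorted(
        name
        for name, status in statuses.items()
        if not status.finished and current - status.last_update > timeout_s
    )


statuses = {
    "lint": SubagentStatus(last_update=990.0, finished=True),
    "tests": SubagentStatus(last_update=100.0),   # silent for a long time
    "docs": SubagentStatus(last_update=995.0),    # still reporting
}
stale = find_stale(statuses, timeout_s=300.0, now=1000.0)
print(stale)  # ['tests']
```

An orchestrator that runs a check like this on a timer can nudge or restart the silent sub-agent instead of waiting forever for a notice that never comes.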
Been teaching codex to be my QA assistant. For every commit it creates a user-test scenario and uses webVNC (crabbox) and computer/browser use (peekaboo/mcporter) to test OpenClaw like a user/QA person would.
This runs in the background and opens PRs with fixes.
Excited to share our most powerful new Claude Code feature: dynamic workflows!
Mention "workflow" in a prompt and Claude will dynamically create an orchestration plan that it strictly follows, allowing you to confidently trust that every stage happens in the right order even across 100s of agents.
New in Claude Code (research preview): dynamic workflows.
Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks.
Use the word "workflow" in a prompt to get started.
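To make "every stage happens in the right order" concrete, here is a small sketch of a stage-ordered plan where tasks inside a stage run in parallel and the next stage waits for the previous one. It uses threads as a stand-in for subagents; the names are hypothetical and this is not the script Claude generates.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple

Task = Callable[[], str]
Stage = Tuple[str, Sequence[Task]]


def run_plan(stages: Sequence[Stage], max_workers: int = 8) -> Dict[str, List[str]]:
    """Run stages strictly in order; tasks within a stage run concurrently."""
    results: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, tasks in stages:
            futures = [pool.submit(task) for task in tasks]
            # result() blocks, so the next stage starts only after this one is done.
            results[name] = [future.result() for future in futures]
    return results


log: List[str] = []


def make_task(label: str) -> Task:
    def task() -> str:
        log.append(label)  # record the order in which tasks actually ran
        return label.upper()
    return task


plan: List[Stage] = [
    ("analyze", [make_task("a1"), make_task("a2")]),
    ("apply", [make_task("b1")]),
]
out = run_plan(plan)
print(out)  # {'analyze': ['A1', 'A2'], 'apply': ['B1']}
```

The order of a1 and a2 in the log can vary between runs, but b1 always comes last because its stage is gated on the first one.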
@thehungrybird_ @rauchg @cramforce Have a look at:
- https://t.co/aYHXwen8eO
- https://t.co/FXDeOMoiAO
- https://t.co/runpMvpUNP
When creating a new project, Next.js automatically sets up an agents.md that is optimized for Next.js. Makes AI agents significantly better at Next.js tasks 👍
We've created the world's fastest PDF parser ⚡️
And it's more accurate than any other open-source, model-free PDF parser out there (pymupdf, pypdf, markitdown, pdftotext, opendataloader, pymupdf4llm)
Introducing LiteParse v2 - we rewrote the entire library in Rust and adapted it as native packages for Python and Node.
It supports 50+ different document types, and can be triggered directly or installed within your favorite AI agent.
Blog: https://t.co/ckb0G73ESs
Repo: https://t.co/JNER0mVcB8
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks.
On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
Copy and paste this into your codex:
“Look through my recent Codex sessions and identify repeated workflows or repeated asks.
For anything I keep doing manually, suggest:
1. a skill if it is a reusable workflow
2. a custom subagent if it is a bounded role or investigation task
Focus on practical things like CI failures, PR reviews, changelogs, docs updates, release prep, debugging, and test triage.
Create the useful ones only. Keep them simple.”
Today we're open-sourcing Bumblebee, a read-only scanner for macOS and Linux.
It checks developer machines for risky packages, extensions, and AI tool configs.
Connected to Computer, it can trigger deeper scans whenever a new supply-chain risk emerges.
https://t.co/FOaWnF1yQy
I built an autotriage skill for codex that has a set of guidelines + reads VISION.md from my repos, so issues/PRs that
- fit the vision of the project
- can be inferred from the code with high confidence
- have a clear fix
- can be live tested
are now worked on autonomously. Codex can use a VM + computer vision (via https://t.co/2T5aNF5jTT , new Parallels backend) to verify fixes, so it can work without interrupting me. I manually review suggestions. Since it was tedious to type in issues, I added an issue browser into https://t.co/NfEoHIQPil that parses common clipboard formats from Codex so I can click through them conveniently.
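The four criteria above read like an all-or-nothing gate. A minimal sketch of that idea, with made-up field names that are not taken from the actual skill:

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class TriageAssessment:
    # Hypothetical field names mirroring the four criteria in the post.
    fits_vision: bool
    inferable_from_code: bool
    clear_fix: bool
    live_testable: bool


def unmet_criteria(assessment: TriageAssessment) -> List[str]:
    """Names of the criteria that the issue or PR does not satisfy."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


def can_work_autonomously(assessment: TriageAssessment) -> bool:
    """Only hand the issue to the agent when every criterion holds."""
    return not unmet_criteria(assessment)


ready = TriageAssessment(True, True, True, True)
blocked = TriageAssessment(True, True, False, True)

print(can_work_autonomously(ready))  # True
print(unmet_criteria(blocked))       # ['clear_fix']
```

Returning the unmet criteria, rather than a bare yes/no, gives the human reviewer a reason for anything the agent skipped.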