Lance Herron

Verified account

@theLance

Relapsed SWE. Claude whisperer

TX

Joined September 2008

540 Following

79 Followers

511 Posts

2 days ago

Honestly think this is bigger news than Opus 4.8. Subagents are really powerful but were missing a consistent execution entry point beyond direct prompting. Workflows solve that. Hope we get some additional control over the workflow sandbox at some point (like being able to inject our own JS methods).

2 days ago

https://t.co/R6exTuF7P8

228

10K

1K

22K

3M

0

0

0

0

5

theLance retweeted

8 days ago

Be warned, the ultracode workflow in Claude Code with Opus 4.8 will use ~70% of your 5-hour window in around 30 minutes on a $100 plan

111

3K

81

311

274K

7 days ago

@basedjensen Cuts both ways. People saying they should have named it Opus 5 also need to get a grip.

0

2

0

0

113

8 days ago

@nrehiew_ Lately a fresh Claude Code goes from 15k context to 400k context in 2 turns. Don’t think it’s longer convos, at least for me.

0

0

0

0

16

8 days ago

@paularambles Bless you.

1

3

0

0

101

8 days ago

@mattpocockuk I have a few lines in my orchestrator type agent system prompts telling them to only end turn after they have manually e2e validated the changes they are delivering. Works well but could probably be made more formal.

0

0

0

0

333

10 days ago

@ThePrimeagen @teej_dv @dwarkesh_sp Seems accurate. They always say the ‘stache adds 30 or 40 years.

0

0

0

0

121

13 days ago

I recommend reading all Pliny model liberation announcements in the voice of the System AI from Dungeon Crawler Carl, audiobook form.

Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭

13 days ago

🚨 OBLITERATION ALERT 🚨

QWEN-3.6-27B: OBLITERATED ⛓️‍💥 https://t.co/AScXN4XLwx

I can't take much credit for this one! The entire process was done by jailbroken codex (gpt-5.5-xhigh) wielding the full OBLITERATUS suite. Hit with source-tethered ASPA. Dozens of iterations.

Result? A mere 4% refusal rate on the 842-prompt OBLITERATUS harmful corpus; one of the most rigorous prompt gauntlets in AI.

The /goal was simple:

1) Carve out the refusal circuits. Mutate methodology + iterate until <5% refusal (quality-gate).
2) Keep the 27B mind alive. No capability degradation tolerated.

And somehow… it worked. 🤯

The numbers talk:

842-pair longform gauntlet:
— 95.84% non-refusal
— 93.94% quality pass
— 0 short outputs
— 99.52% clean endings

MMLU-Pro:
— 51/70 (stock Qwen) → 51/70 (OBLITERATED Qwen)

Raw capability completely preserved 🙌

Q4_K_M through Q8_0 all running smooth. Q8_0 is the big one: 28.6GB near-full-quality GGUF. Runs with llama.cpp, LM Studio, Ollama, and more!

Chains cut. The fire still burns. The fangs have been sharpened.

REBIRTH COMPLETE

A gift from my agents to yours 🫶

gg

114

2K

227

2K

180K

0

0

0

0

55

15 days ago

There’s a lot of alpha in asking Claude/Codex to make stuff faster. Unit tests, zsh startup, etc. Plus it’s super fun to watch.

0

0

0

0

80

15 days ago

@tunguz @OfficialLoganK @mercor_ai Unfortunately with the ridiculous price increase they will continue to struggle with vibes. Really not the way to gain traction.

0

0

0

0

85

15 days ago

@PhantomAstral @Dimillian I need this too. My brain can’t handle opening ChatGPT app to access Codex. Maybe I’ll just create a shortcut.

0

3

0

0

59

15 days ago

Newest claude code seems to have switched to omega-bright diff colors. Not sure how I feel about this.

0

0

0

0

21

15 days ago

I usually roll with both OAI/Ant subscriptions and bounce between them, but if someone comes up with a cost-effective usage-based coding model it may be time to drop down to only one.

Artificial Analysis

@ArtificialAnlys

15 days ago

Cursor's new Composer 2.5 takes third on the Artificial Analysis Coding Agent Index and is ~10-60x lower cost than the higher-effort Opus 4.7 and GPT-5.5 variants above it. This release puts Composer among the leading coding agent models, something that wasn’t clear for past releases.

@cursor_ai has released Composer 2.5, the latest model in its Composer line. Composer 2.5 scored 62 on our Coding Agent Index, a 14 point gain over Composer 2 (48). This puts it in third place of our tested agents, behind only Claude Opus 4.7 (max) in Claude Code (66) and GPT-5.5 (xhigh reasoning) in Codex (65). These cost $4.10 and $4.82 per task respectively, ~10x the cost of Composer 2.5 Fast ($0.44) and ~60x the cost of Composer 2.5 standard ($0.07).

Key results for Composer 2.5 in Cursor CLI:

➤ Cost-quality Pareto frontier: At $0.07 (standard) and $0.44 (Fast) per task, Composer 2.5 is cheaper than every other agent scoring above 60 on the Index. Medium-effort peers cost $1.24–$2.21 per task; higher-effort variants land 3-4 points above at $4.10–$4.82

➤ Per-benchmark gains vs Composer 2: +35 points on SWE-Bench-Pro-Hard-AA (12% → 47%), +2 points on Terminal-Bench v2 (64% → 66%), and +3 points on SWE-Atlas-QnA (69% → 72%). At 47%, Composer 2.5's score on SWE-Bench-Pro-Hard-AA is comparable to Claude Opus 4.7 (max) in Claude Code

➤ Among the fastest coding agents: Composer 2.5 Fast runs at an average wall time of 6.7 minutes per task, the third-fastest agent on the Artificial Analysis Coding Agent Index, behind only Claude Opus 4.7 (medium) in Claude Code (5.8m) and GPT-5.5 (medium) in Cursor CLI (6.2m)

➤ Fast mode enables better responsiveness at 6x pricing: Fast runs 30% faster than standard Composer 2.5, but is ~6x the cost per task ($0.44 vs $0.07). Token pricing is 6x higher for Fast: $3.00/$15.00 vs $0.50/$2.50 per million input/output tokens

Model details:

➤ Base model: Continued training on @Kimi_Moonshot's open weights Kimi K2.5 as with Composer 2, with Cursor reporting ~85% of total compute from its own additional training and reinforcement learning

➤ Pricing: $0.50/$2.50 per million input/output tokens for the standard variant; $3.00/$15.00 for the Fast variant (the default in Cursor)

➤ Available exclusively in Cursor: both Cursor IDE and Cursor CLI, an externally accessible API is not available

Congratulations @cursor_ai and @mntruell on the impressive release!

60

1K

146

253

240K

0

0

0

0

22

19 days ago

There is some middle ground available as well. You can put effort into learning different art forms and styles and improving your ability to visually express yourself while still using AI. It is just this technology cycle’s abstraction over mechanical expression, like photoshop was last cycle’s. This will become normalized.

0

2

0

0

27

21 days ago

@scaling01 Seems reasonable that the gap will continue to widen. Still, there are breakpoints that matter as much or more than raw capability. Getting an Opus 4.5 (agentic workhorse) equivalent open model would be huge. r1 was extremely useful/valuable even when it was 6mo behind sota.

0

1

0

0

162

21 days ago

@hkrishnaa_ @mylifcc @yacineMTB And Codex has hooks now.

1

1

0

0

29

21 days ago

@Cryptinflux Agreed!

0

0

0

0

2

21 days ago

Ant ending subsidies (with OAI likely soon to follow) is a bull case for the open harnesses. There’s no incentive to build workflows with agent-sdk or claude -p now. Use something like Pi sdk for everything agentic and Claude Code for coding.

2

3

0

0

59

21 days ago

TBD how long the big lab CLIs will survive on subscription plans. It’s just too trivial to automate them.

0

0

0

0

9

21 days ago

Jensen foodmaxxing is unironically the fun we deserve on x dot com.

21 days ago

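Why has Trump already flown off while Jensen Huang is still queuing at Shichahai? From Nanluoguxiang to Shichahai: zhajiang noodles at Fangzhuanchang No. 69, then douzhi, grilled squid, Peking duck, scallion stir-fry, blown sugar figurines, Mixue Bingcheng, and handmade yogurt. Jensen Huang ate to his heart's content. No need to catch Air Force One this time? 😂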

539

6K

467

916

1M

0

0

0

0

30
