mars

about 4 hours ago

we comin 🚂

Morgan

@morganlinton

about 7 hours ago

Breaking: Origin - Cursor’s competitor to GitHub!!!

45

1K

32

154

89K

0

3

0

40

about 7 hours ago

@mntruell @SpaceX

0

434

marsBuilds retweeted

about 10 hours ago

We're excited to join forces with @SpaceX to advance the frontier of useful AI. Expect significant improvements to Cursor soon.

925

22K

2K

1K

2M

12 days ago

a big snag on the “@ mars” handle 🚀

eric zakariasson

@ericzakariasson

12 days ago

introducing cursor profiles! go claim your handle at https://t.co/6t5lg2jqvg

355

2K

96

725

706K

0

2

0

175

19 days ago

@tibor_tee @Starlink no interruptions yet!

0

1

0

42

19 days ago

first time on a flight with @Starlink... i don't think i ever can go back (in the middle of the atlantic btw)

1

5

0

1

746

29 days ago

composer 2.5 is a huge improvement over composer 2 and my new daily driver. it absolutely rips on speed

29 days ago

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

928

13K

1K

3K

20M

0

9

0

373

about 1 month ago

you love to hear it

2

18

0

777

about 1 month ago

peep Cursor CLI 👀

Artificial Analysis

@ArtificialAnlys

about 1 month ago

Announcing the Artificial Analysis Coding Agent Index! Our new coding agent benchmarks measure how combinations of agent harnesses and models perform on 3 leading benchmarks, token usage, cost and more.

When developers use AI to code they’re choosing a model, but also pairing it with a specific harness. It makes sense to benchmark that combination to understand and compare performance.

The Artificial Analysis Coding Agent Index includes 3 leading benchmarks that represent a broad spectrum of coding agent use:
➤ SWE-Bench-Pro-Hard-AA, 150 realistic coding tasks that frontier models struggle with, sampled from Scale AI’s SWE-Bench Pro
➤ Terminal-Bench v2, 84 agentic terminal tasks from the Laude Institute that range from system administration and cryptography to machine learning. 5 tasks were filtered due to environment incompatibility
➤ SWE-Atlas-QnA, 124 technical questions developed by Scale AI about how code behaves, root causes of issues, and more, requiring agents to explore codebases and give text answers

Analysis of results:
➤ Opus 4.7 and GPT-5.5 lead the Index: Opus 4.7 in Cursor CLI scores 61, followed closely by GPT-5.5 in Codex and Opus 4.7 in Claude Code at 60. GPT-5.5 in Cursor CLI follows at 58.
➤ Open weights models are competitive, but still trail the leaders: GLM-5.1 in Claude Code is the top open-weight result at 53, followed by Kimi K2.6 and DeepSeek V4 Pro in Claude Code at 50. These are strong results, but still meaningfully behind the top proprietary models.
➤ Gemini 3.1 Pro in Gemini CLI underperforms: Gemini 3.1 Pro in Gemini CLI scores 43, well below where Gemini 3.1 Pro sits on our Intelligence Index, highlighting that Gemini’s performance in Gemini CLI remains a relative weak spot for Google’s offering.
➤ Cost per task (API token pricing) varies >30x: Composer 2 in Cursor CLI is cheapest at $0.07/task, followed by DeepSeek V4 Pro in Claude Code at $0.35/task and Kimi K2.6 in Claude Code at $0.76/task. At the high end, GPT-5.5 in Codex costs $2.21/task, while GLM-5.1 in Claude Code costs $2.26/task. For both models this was contributed to by high token usage, and in GPT-5.5’s case by a relatively higher per token cost.
➤ Token usage varies >3x: GLM-5.1 in Claude Code uses the most tokens at 4.8M/task, followed by Kimi K2.6 at 3.7M/task and DeepSeek V4 Pro at 3.5M/task. GPT-5.5 in Codex uses 2.8M tokens/task, substantially more than Opus 4.7 in Claude Code at 1.7M/task. In GLM-5.1’s case, higher token usage, cost and execution time were partly driven by the model entering loops on some tasks.
➤ Cache hit rates remain high but vary materially: Cache hit rates range from 80% to 96% across combinations. Provider routing, harness prompt structure and cache behavior can materially change the economics of running the same model given cached inputs are typically <50% the API price of regular input tokens.
➤ Time per task varies >7x: Opus 4.7 in Claude Code is fastest at ~6 minutes/task, while Kimi K2.6 in Claude Code is slowest at ~40 minutes/task. This is contributed to by differences in average turns per task, token usage and API serving speed. Opus 4.7 needed materially fewer turns to complete a task than all other models, while Kimi K2.6 needed the most.
➤ Cursor made real progress with Composer 2: Composer 2 in Cursor CLI scores 48, near the leading open-weight model results, while being the cheapest combination measured at $0.07/task. Cursor has stated Composer 2 is built from Kimi K2.5, showcasing they have made substantial post-training gains.

This is just the start. We are planning to add additional agents (both harnesses and models). Let us know what you would like to see added next.
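As a rough check on the cost and cache figures quoted in the post above, here is a minimal Python sketch. The function names `spread` and `blended_input_price` are made up for illustration, and the 0.5 cached-price factor is an assumption: the post only says cached inputs are typically under 50% of the regular input price, so the blended figure is an upper bound.

```python
# Cost per task in USD, copied from the figures quoted in the post above.
cost_per_task = {
    "Composer 2 in Cursor CLI": 0.07,
    "DeepSeek V4 Pro in Claude Code": 0.35,
    "Kimi K2.6 in Claude Code": 0.76,
    "GPT-5.5 in Codex": 2.21,
    "GLM-5.1 in Claude Code": 2.26,
}


def spread(values):
    """Ratio of the largest value to the smallest one."""
    values = list(values)
    return max(values) / min(values)


def blended_input_price(hit_rate, cached_factor=0.5):
    """Relative price of an input token when a share `hit_rate` is cached.

    1.0 means the full, uncached input price. `cached_factor` is an
    assumed upper bound (cached tokens at 50% of the regular price).
    """
    return hit_rate * cached_factor + (1.0 - hit_rate)


# Most expensive combination relative to the cheapest one.
print(round(spread(cost_per_task.values()), 1))  # 32.3

# Upper bound on relative input price at the two quoted cache hit rates.
for rate in (0.80, 0.96):
    print(rate, round(blended_input_price(rate), 2))
```

Under that assumed factor, going from an 80% to a 96% cache hit rate lowers the relative input price from about 0.60 to about 0.52 of the uncached price, which is one way to read the post's point that cache behavior changes the economics of running the same model.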

125

2K

168

480

3M

4

41

1

2

6K

about 1 month ago

b u t t e r y 🧈

about 1 month ago

A new PR review experience is now available in Cursor 3. Take PRs from creation to merge, all in one place. You can see comments, diffs, commits, and review status to understand what changed and next steps. Navigate larger PRs more quickly with the file tree and changes picker.

92

1K

96

413

200K

0

64

about 2 months ago

built a TARS that chills on my desktop with the Cursor SDK

about 2 months ago

We’re introducing the Cursor SDK so you can build agents with the same runtime, harness, and models that power Cursor. Run agents from CI/CD pipelines, create automations for end-to-end workflows, or embed agents directly inside your products.

410

9K

819

5K

3M

0

3

0

122

about 2 months ago

@d4m1n harness is the moat baby

1

7

0

1

1K

about 2 months ago

cursor SDK is a huge unlock for builders

about 2 months ago

We’re introducing the Cursor SDK so you can build agents with the same runtime, harness, and models that power Cursor. Run agents from CI/CD pipelines, create automations for end-to-end workflows, or embed agents directly inside your products.

410

9K

819

5K

3M

3

20

1

655

about 2 months ago

asyncmaxxing

about 2 months ago

Introducing /multitask in the new Cursor 3 interface. Cursor can now run async subagents to parallelize your requests instead of adding them to the queue. For already queued messages, you can ask Cursor to multitask on them instead of waiting for the current run to finish.

163

3K

191

651

428K

1

14

0

1

722

marsBuilds retweeted

Michael Truell

@mntruell

about 2 months ago

Excited to partner with the SpaceX team to scale up Composer. A meaningful step on our path to build the best place to code with AI.

480

10K

1K

444

2M

about 2 months ago

the harness matters folks

Jediah Katz

@jediahkatz

2 months ago

Independent analysis from @EndorLabs found that Cursor is the best harness for functional and secure code. Big improvement this week with Opus 4.7. 🔒

25

286

39

52

91K

0

1

0

106

2 months ago

data centers seeing this

2 months ago

Claude Opus 4.7 is now available in Cursor. We've found it to be impressively autonomous and more creative in its reasoning. We're launching it with 50% off for a limited time. Enjoy!

211

6K

347

445

546K

0

53

2 months ago

visualnovelmaxxing

2 months ago

Cursor can now respond by creating interactive canvases to visually represent information. Ask it to generate dashboards and custom interfaces that are richer than plain text.

85

2K

157

657

221K

0

72