Bennett Brownlow

about 8 hours ago

@jediahkatz Yes, do /goal next

0

3

0

35

bennett2b retweeted

about 10 hours ago

https://t.co/0PIwVLwUrC

17

517

27

720

39K

bennett2b retweeted

Niklas Muennighoff @Muennighoff

1 day ago

cursor in slack can now read documents attached in the thread, including .txt, .log, .json, .zip, .pdf, or .docx files!

9

221

15

38

15K

bennett2b retweeted

John Bai

@johnbai

2 days ago

Here's a step by step:

1. Open the in-app browser to select, scribble, and describe changes to any component in design mode.
2. Type /multitask or queue changes to run them in parallel, great for touching several unrelated components at once.
3. Get your steps in, grab a coffee, text mom to just say hi.

7

212

14

219

24K

bennett2b retweeted

xAI

@xai

2 days ago

Composer 2.5 is now available inside Grok Build. Composer 2.5 is a fast, highly intelligent model that excels on long-running tasks and following complex instructions.

505

6K

701

992

16M

bennett2b retweeted

5 days ago

Cursor Composer 2.5 at #1 on an independent eval by @nextjs ahead of Opus 4.8 (https://t.co/da8wdcruGd)

26

294

25

40

38K

bennett2b retweeted

Paul Bakaus

@pbakaus

5 days ago

composer 2.5 is, as of today, the best way to visually iterate on frontends. i was frankly surprised by this, but the speed/quality/value ratio of the model and harness is insanely good. cursor is back!

31

435

14

292

53K

5 days ago

Enterprise Security teams can be... risk averse. Very excited for Auto-review to help unblock end users and give Security teams peace of mind as agents take on more ambitious tasks.

Cursor @cursor_ai

5 days ago

Auto-review mode is now available in Cursor. It allows agents to run tool calls with fewer approval prompts and safer execution.

89

2K

134

453

254K

0

14

0

1

1K

bennett2b retweeted

6 days ago

grok build is available in cursor, try it out!

86

1K

35

65

68K

bennett2b retweeted

6 days ago

introducing thermos in cursor: a deep security/correctness audit and a harsh code quality audit, run in parallel on your branch, synthesized into one prioritized list

34

1K

59

599

43K

bennett2b retweeted

Lee Robinson

@leerob

6 days ago

How are coding agents changing software engineering? Yapped for 15 minutes about new Cursor data we published, including:

1. Why lines of code is an imperfect measure of AI progress
2. Balancing intelligence/cost/speed for models
3. Code reviews with "Mega PRs" (1000+ lines)

53

808

53

610

57K

bennett2b retweeted

Sonya Huang 🐥

@sonyatweetybird

8 days ago

Today's Training Data episode takes us BTS on the infrastructure challenges required to do large RL runs at scale, featuring @ellev3n11 (Composer Lead at @cursor_ai) and @dzhulgakov (Co-Founder at @FireworksAI_HQ).

The Cursor team trained Composer 2 on Fireworks by starting with a strong base model (Kimi 2.5) and performing large-scale mid-training on code tokens and web data to learn common patterns and libraries, followed by a large-scale Reinforcement Learning run to learn how to navigate the Cursor harness, call tools, and write correct code.

Today's episode dives into the systems and infrastructure challenges of making that large RL run happen, and there were many (!!), from numerical mismatch to global distribution to synchronizing rollouts across asynchronous pipelines to keeping track of expert activation across runs and more. Extremely nerdy in-the-weeds challenges that Federico and Dima were delighted to nerd out on together :)

Beyond RL infra, we also discussed Online vs Simulated rollouts, self-summarization for long-horizon agents, environment design ("the most powerful RL environment is the product itself"), and other technical nuggets.

PS: We filmed this episode before the SpaceX news, while the Cursor team was still compute-constrained. While Cursor now has *all* the flops, the takeaways and hurdles crossed ring true for any serious application-level company that is racing to post-train their own models. I believe that more serious application companies will go the way of Cursor and post-train their own models.

00:00 Introduction
00:53 Why Cursor Trained Composer 2
04:55 Specialization vs Bitter Lesson
06:16 Composer 2 Training Recipe
16:32 Scaling RL Infrastructure Globally
23:32 Floating Point Drift
25:11 MoE Sensitivity Explained
26:25 Router Replay Fix
27:19 Real Time RL Loop
31:49 Long Horizon Agents
34:29 Why RL Everywhere
37:34 LLM as Judge Rewards
39:14 RL in Hard Domains
40:13 Build Your Own Environments
44:34 Closing Thoughts

9

333

28

428

75K

9 days ago

Every enterprise I speak with is trying to go poteto-mode (they often mistakenly call it something else). Lauren has a great playbook that every team will benefit from. It’s been very cool to see pstack gain traction internally at Cursor.

lauren

@poteto

9 days ago

https://t.co/XLwSDQA3d8

45

1K

106

2K

262K

1

28

0

7

2K

12 days ago

@IrrationalShuma It will pull from the same pool of usage as your Cursor account.

0

3

0

31