Lara Chelak

@larachelak

Founder, Engineer, CEO: @codespeedai

New York, NY

Joined January 2009

1.4K Following

1.6K Followers

2.8K Posts

larachelak retweeted

Cursor @cursor_ai

1 day ago

Three announcements from our keynote at Compile, including how we're training a new model with SpaceX.

236

747

larachelak retweeted

Andy Matuschak

@andy_matuschak

4 days ago

Of my friends who use coding agents heavily, the happiest seem to fall into two distant camps: a) Controlled fast loops: 1-2min cycles, mostly single-threaded, still totally in control of the code, using the agent "to type faster" b) Delegated slow loops: Nudging along in the background a couple times a day, while something else (design, writing, etc) is their primary focus; paying ~no attention to code; it's fine if agents get stuck for hours Like many, I've been trying to make some middle ground work: trying to delegate more than the first camp, while giving the work more focus and technical oversight than the second. My role is planning, technical guidance, and code review. This leads to 10-30min cycles, which lead to parallelism (to avoid busy-waiting), which leads to context-switching and fragmentation, which leads to working memory churn and poor comprehension, which leads to situations where neither the agents nor I understand what's going on. It sucks and I hate it. I'm moving faster, but the work is unpleasant and unrewarding. It seems awfully hard to exert *partial* technical control—much easier to exert ~full or ~none. The ideal, maybe, is something like what @simonlast outlines in https://t.co/M6R3EF5Yy7: teams of agents making technical plans, reviewing each others' work, autonomously and adversarially testing, etc. You still get robust "technical oversight"—just by other agents. Unfortunately in my domain (mobile interface with heavy gestures and animation) that's not yet tractable, even with lots of homegrown scaffolds and probes. But things will probably look very different in a year. I'm curious if others have found happy middle ground between these poles?

878

894

117K

larachelak retweeted

a16z @a16z

8 days ago

Cursor CEO Michael Truell on the future of writing code: "Our goal with Cursor is to invent a new type of programming." "It looks like a world where you have a representation of the logic of your software that does look more like English." "You can imagine kind of an evolution of programming language towards pseudocode. You have written down the logic of the software, and you can edit that at a high level." "It won't be the impenetrable millions of lines of code, it'll instead be something that's much terser and easier to understand and easier to navigate." @mntruell with @lennysan on Lenny's Podcast

135

399

larachelak retweeted

Nick Dobos

@NickADobos

8 days ago

Lmfao cursor origin supports 22.6 commits a second (in a single repo)

411

414K

larachelak retweeted

swyx 🔜 @aiDotEngineer

@swyx

8 days ago

Cursor/Graphite’s @TomasReimers just announced Origin @cursor_ai’s long awaited Git competitor, scalable for agent workloads, extensible with api and mcp, and built in merge conflicts and co failure agent resolution https://t.co/zdbDKS7Ehx

118

141

677

394K

larachelak retweeted

John Suh

@john_ssuh

12 days ago

Increasingly, I believe companies may need to be rebuilt from the ground up, where you have a single timeline of all observability + product metrics + file changes laid out in a retrievable system, like Datadog + Posthog + Google Drive + Slack (really a unified filesystem of Claude Code chats + Codex chats). This might be the new data foundation for any and all companies to maximize AI. It needs to be rebuilt because keeping track of diffs on existing systems makes it basically impossible to produce longitudinal information on decisions and rollbacks, something coding agent storage companies are actively trying to figure out, but this should extend to businesses as a whole. I'm highly skeptical existing businesses will adopt this, though, because it means overhauling everything about their instrumentation and business data, but I think businesses built on this foundation probably can execute 100x better and faster

210

180

630K

larachelak retweeted

Pietro Schirano

@skirano

13 days ago

You should basically never use Fable for coding, but instead use it as a planner/orchestrator. Most of today's advanced models can implement a spec perfectly, and once done you can send the work to Fable to review. This has been my most powerful flow so far.

166

116

216K

Lara Chelak

@larachelak

14 days ago

And yet again, back to Claude with Fable 5. And somehow, I don't think the constant turning of tables, the mind-boggling innovation, will ever end. How lucky we are to be alive during this time.

larachelak retweeted

Andrej Karpathy

@karpathy

15 days ago

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

25K

larachelak retweeted

Cognition @cognition

16 days ago

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is first to measure: would you actually merge this code?

242

317

larachelak retweeted

a16z @a16z

18 days ago

Mira Murati says human-AI collaboration needs models that can listen while they think: "The types of models that we work with today, they're very turn-based. You talk, they talk, then they go off and think." "While they're thinking, it's almost like they're deaf and blind. They cannot perceive anything else about what's going on." "By contrast, our interactions with each other are very rich. There is a lot of information in our interactions when we are silent, when we're thinking, when we're interrupting one another." "Interaction models are able to capture all of this nuance. They're not turn-based. They're more like time-based interaction, where they're continuously taking in audio, text, video, and continuously providing output." "This enables you to catch things like interruptions and simultaneous speech, and really create a rich, high bandwidth interaction between humans and machines." @miramurati at Bloomberg Tech live with @emilychangtv

157

205

larachelak retweeted

Michael Sidofsky

@mindzeye

16 days ago

The light of New York

286

516

150K

larachelak retweeted

Guillermo Rauch

@rauchg

24 days ago

Unclear if a durable trend, but CEOs and CTOs are back to coding with a fury, thanks to coding agents. I have public company CEOs sliding into my DMs (and “InMail”) telling me about falling in love with shipping software again thanks to Claude Code and Vercel. “Dream accounts” that we always wanted to work with, where in the past the C-suite would hardly understand the infrastructure until much later in the game. Coding agents are the ultimate PLG-fication of the enterprise. Bad, legacy software can’t hide anymore. The stack that works is self-evident to the entire organization, from intern to CEO.

200

287

380K

larachelak retweeted

Ivan Burazin

@ivanburazin

26 days ago

You can't sell to enterprise clients well without having been an enterprise buyer yourself. When I was building my first company, I couldn't wrap my head around why anyone would pay $100,000 for annual software contracts, let alone $1,000,000. Then I sat on the other side of the table after one of my cos was acquired. When I was the one approving those contracts, I immediately understood why companies pay that and what exactly they need to see before they do.

107

21K

larachelak retweeted

Jamon

@jamonholmgren

28 days ago

Agentic coding tip
Pay down tech debt immediately
With agents, there should be no such thing as “tech debt”. An agent should simply pay down every bit of tech debt before presenting you with the “finished work”.
Unlike humans, agent time is not very valuable; it can and should continue to work on something until it’s done, and shouldn’t make concessions assuming human constraints.
If an agent tells you that it’s “leaving something for later”, tell it to go finish it first before saying it’s done.
Example prompt (best if it is in your docs):
```
Do not leave any tech debt behind. If you have taken any shortcuts, go back and do them right. This is a hard acceptance criteria that must be completed.
```

703

65K

larachelak retweeted

DHH

@dhh

about 1 month ago

For complicated agent work, it's amazing how much GPT5.5 has improved. I found 5.2 to be very far behind Opus. Now using Opus 4.7 after 5.5 feels like a big step backwards. Gotta love this level of competition! Strong comeback for OpenAI.

201

212

364

527K

larachelak retweeted

Howard Lerman

@howard

about 1 month ago

Everyone is obsessed with AI making a 10x engineer a 1000x engineer. The recent reductions at Cloudflare and Click have made me realize the plot is equally about the inverse: AI amplifies the *negative* impacts of poor performers. If a person with poor taste, who makes mediocre judgement calls, and doesn't properly build things customers love is able to produce 10x more work - does a company want that? Hell no! Productivity isn't just about as many people as possible tokenmaxxing. AI is a double edged sword, especially when it's used to produce net new work. If you give a bad artist a pen that can draw 100x as fast, you're going to pile up a lot of junky artwork very quickly. And since it happens so quickly, leaders are now able to see quickly who is Picasso and who is not and adjust accordingly.

107

643

333K

larachelak retweeted

Ronin

@DeRonin_

about 1 month ago

Andrej Karpathy: "90% of Claude's mistakes come from missing context, not a weak model."
41% mistake rate without a CLAUDE.md. 11% with the 4-rule baseline. 3% with the 12-rule version below
here are the 12 rules senior engineers settled on:
1. think before coding: state assumptions, don't guess. the model can't read your mind, stop hoping it will
2. simplicity first: minimum code, no speculative abstractions. the moment you let Claude add "for future flexibility," you've added 200 lines you'll delete next quarter
3. surgical changes: touch only what you must. don't let it improve adjacent code, that's how PRs blow up
4. goal-driven execution: define success criteria upfront, loop until verified. without them Claude either loops forever or stops too early
5. use the model only for judgment calls: classification, drafting, summarization, extraction. NOT routing, retries, status-code handling, deterministic transforms. if code can answer, code answers
6. token budgets are not advisory: per-task 4000, per-session 30000. by message 40 of a long debug, Claude is re-suggesting fixes you rejected at message 5
7. surface conflicts, don't average them: two patterns in the codebase? pick one. Claude blending them is how errors get swallowed twice
8. read before you write: read exports, callers, shared utilities. Claude will happily add a duplicate function next to an identical one it never read
9. tests verify intent, not just behavior: a test that can't fail when business logic changes is wrong. all 12 of Claude's tests can pass while the function returns a constant
10. checkpoint every significant step: Claude finished steps 5 and 6 on top of a broken state from step 4. nobody noticed for an hour
11. match the codebase conventions: class components? don't fork to hooks silently. testing patterns assumed componentDidMount, hooks broke them without surfacing
12. fail loud: "completed successfully" with 14% of records silently skipped is the worst class of bug. surface uncertainty, don't hide it
what actually compounds instead of the next framework:
- the CLAUDE.md file as institutional memory across sessions
- eval-driven changes, not vibe-driven
- checkpoints over speed
- explicit conflicts over silent blending
- discipline over framework, every time
- one repo, one rules file, no exceptions
be a few rules ahead of AI twitter before this becomes mass-opinion
study this
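One way to picture the "fail loud" rule from the list above is a batch job that refuses to report success when records were skipped. This is only a sketch under assumptions: `process_batch`, `BatchReport`, `SkippedRecordsError`, and the `max_skip_rate` parameter are hypothetical names made up for illustration, and treating `ValueError` as the only skippable failure is an arbitrary choice, not something the thread specifies.

```python
from dataclasses import dataclass, field


class SkippedRecordsError(RuntimeError):
    """Raised when a batch ends with more skipped records than allowed."""


@dataclass
class BatchReport:
    processed: int = 0
    skipped: list = field(default_factory=list)  # (index, reason) pairs

    @property
    def skip_rate(self) -> float:
        total = self.processed + len(self.skipped)
        return len(self.skipped) / total if total else 0.0


def process_batch(records, handler, max_skip_rate=0.0):
    """Run handler on every record; raise instead of hiding skipped ones."""
    report = BatchReport()
    for index, record in enumerate(records):
        try:
            handler(record)
            report.processed += 1
        except ValueError as exc:
            # Remember why the record was skipped rather than swallowing it.
            report.skipped.append((index, str(exc)))
    if report.skip_rate > max_skip_rate:
        total = report.processed + len(report.skipped)
        raise SkippedRecordsError(
            f"{len(report.skipped)} of {total} records skipped "
            f"({report.skip_rate:.0%}); first reason: {report.skipped[0][1]}"
        )
    return report
```

With the default `max_skip_rate=0.0`, a single bad record turns "completed successfully" into an exception that names the count and the first reason; a caller who genuinely tolerates some loss has to say so explicitly by raising the threshold.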

355

448K

larachelak retweeted

Yash Patil

@ypatil125

about 1 month ago

Exactly! The winning strategy is not betting on who has the best model this month. It is building the deployment layer where intelligence actually compounds. That means serving the best possible agent tokens on durable infrastructure: route to any model, train your own when it makes sense, and own the context, harness, environment and interfaces around the agent. Applied Compute is building this customer-first deployment layer. We help customers build intelligent systems where the value compounds on their side.

173

130

33K

larachelak retweeted

Ivan Burazin

@ivanburazin

about 1 month ago

No one's talking about how sandbox forking is going to change how multi-agent handoffs work. Right now, when one agent hands work to another, you destroy the VM and start fresh. The context, file state, environment, etc are gone. With forking, you can just instantly clone the running sandbox, and the next agent picks up from exactly where the last one left off.
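The handoff idea can be illustrated with a toy in-memory stand-in. This is a rough sketch, not any sandbox product's API: `ToySandbox` and its `write` and `fork` methods are hypothetical, and a real fork would snapshot a running VM's memory and disk rather than deep-copy a Python object.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToySandbox:
    """Hypothetical stand-in for a running sandbox: files, env vars, history."""
    files: dict = field(default_factory=dict)
    env: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, path, content):
        self.files[path] = content
        self.log.append(f"write {path}")

    def fork(self):
        # Clone the full state so the next agent starts where this one stopped.
        return copy.deepcopy(self)


# The first agent prepares the environment and a plan.
parent = ToySandbox(env={"NODE_ENV": "test"})
parent.write("plan.md", "1. add endpoint")

# The second agent receives a fork instead of a fresh, empty sandbox.
child = parent.fork()
child.write("api.py", "def handler(): ...")
```

The child sees `plan.md` and the inherited environment immediately, while its own writes leave the parent untouched, which is the property that makes a fork usable as a handoff point.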
