DanieI @Cervera - Twitter Profile

@MatthewBerman There's something to be said about quality over volume. The gap may close over time, but if there's one thing I'll grant incumbent engineers, it's that elegant software architecture is not going to be a strong suit of agents limited to holding ~50KB at a time.

1

3

0

371

DanieI

@Cervera

about 2 months ago

@MatthewBerman @ThePrimeagen @ThePrimeagen is one of those guys I occasionally check in with for an idea of what a non-trivial percentage of engineers are probably thinking. He's not who I go to for a vision of what tomorrow is about to look like.

1

0

100

DanieI

@Cervera

about 2 months ago

@rezoundous For large-scale workloads that would otherwise span weeks when you need them in days, or months when you need them in weeks, yes. There are also some rare cases where durability may be the requirement.

0

235

DanieI

@Cervera

about 2 months ago

Multiple stories. One screenshot.

0

9

DanieI

@Cervera

about 2 months ago

Stakeholder: "Why is it so complicated?" *20% error rate is manageable if you have validation layers, review workflows and controls between AI output and your books. It’s not manageable without them. But the answer here isn’t to avoid AI in accounting. It’s to be deliberate about where the model sits in the workflow. There’s a meaningful difference between AI that drafts and surfaces, sitting inside a system with deterministic validation, audit trails and exception handling built in as core features, versus AI bolted onto a legacy system or accessed raw through an API with none of that infrastructure around it. * https://t.co/vSAHg51yEy
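
As a rough illustration of the "deterministic validation" idea in the quoted passage, here is a minimal sketch in which an AI-drafted journal entry is posted only if it passes fixed checks, and is otherwise routed to human review. The names `Line`, `validate_entry`, and `route`, and the specific checks (known accounts, non-negative amounts, debits equal credits), are assumptions made for illustration and do not come from the linked article or any particular accounting system.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    """One line of a drafted journal entry (hypothetical structure)."""
    account: str
    debit: Decimal = Decimal("0")
    credit: Decimal = Decimal("0")


def validate_entry(lines, known_accounts):
    """Return a list of problems; an empty list means the draft passed."""
    problems = []
    for line in lines:
        if line.account not in known_accounts:
            problems.append(f"unknown account: {line.account}")
        if line.debit < 0 or line.credit < 0:
            problems.append(f"negative amount on {line.account}")
    total_debit = sum((line.debit for line in lines), Decimal("0"))
    total_credit = sum((line.credit for line in lines), Decimal("0"))
    if total_debit != total_credit:
        problems.append(
            f"unbalanced: debits {total_debit} != credits {total_credit}"
        )
    return problems


def route(lines, known_accounts):
    """Post clean drafts; send anything else to a review queue."""
    problems = validate_entry(lines, known_accounts)
    if problems:
        return ("review", problems)
    return ("post", [])
```

The point of the sketch is only the placement: the model drafts, the deterministic checks decide, and every rejection carries a reason that could feed an audit trail.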

0

13

DanieI

@Cervera

3 months ago

Post-AI.

0

26

DanieI

@Cervera

3 months ago

Defensive design is a must.

Priyanka Vergadia

@pvergadia

3 months ago

🤯BREAKING: Alibaba just proved that AI Coding isn't taking your job, it's just writing the legacy code that will keep you employed fixing it for the next decade. 🤣 Passing a coding test once is easy. Maintaining that code for 8 months without it exploding? Apparently, it’s nearly impossible for AI. Alibaba tested 18 AI agents on 100 real codebases over 233-day cycles. They didn't just look for "quick fixes"—they looked for long-term survival. The results were a bloodbath: 75% of models broke previously working code during maintenance. Only Claude Opus 4.5/4.6 maintained a >50% zero-regression rate. Every other model accumulated technical debt that compounded until the codebase collapsed. We’ve been using "snapshot" benchmarks like HumanEval that only ask "Does it work right now?" The new SWE-CI benchmark asks: "Does it still work after 8 months of evolution?" Most AI agents are "Quick-Fix Artists." They write brittle code that passes tests today but becomes a maintenance nightmare tomorrow. They aren't building software; they're building a house of cards. The narrative just got honest: Most models can write code. Almost none can maintain it.

485

9K

2K

6K

2M

0

25

DanieI

@Cervera

3 months ago

This is genuinely interesting. It replaces a lot of old apps that were difficult to find or not as easy to configure:
- Default: 80% screen brightness from sunrise to sunset, 40% screen brightness from sunset to sunrise, and 80% media volume.
- Whenever I am inside a theater at GPS location 📌, dim the phone to 10% and turn notifications off.
- Whenever I reach the office, set brightness to 60% and notifications to vibrate only.
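
The rules in this post could be sketched roughly as follows. This is a hypothetical illustration, not how any particular app works: `Settings` and `resolve` are made-up names, the sunrise and sunset times are passed in by the caller, and location detection (the GPS geofence) is assumed to have already been reduced to a place label such as "theater" or "office".

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class Settings:
    brightness: int                      # percent
    notifications: Optional[str] = None  # None means "leave unchanged"
    media_volume: Optional[int] = None   # percent


def resolve(now: time, place: Optional[str],
            sunrise: time, sunset: time) -> Settings:
    """Apply the default rule first, then any place-specific override."""
    daytime = sunrise <= now < sunset
    settings = Settings(brightness=80 if daytime else 40, media_volume=80)
    if place == "theater":
        settings.brightness = 10
        settings.notifications = "off"
    elif place == "office":
        settings.brightness = 60
        settings.notifications = "vibrate"
    return settings
```

Evaluating the default first and letting place rules override it keeps precedence explicit, which is the part that older single-purpose apps tended to make hard to configure.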

0

1

0

17

DanieI

@Cervera

3 months ago

Mindfulness is key.

Christian Bolivar

@CBolivar_89

3 months ago

@MatthewBerman And It’s only the beginning. This is when meditation will be crucial for helping us ground ourselves. We haven’t evolved to process all this information so quickly.

0

10

0

608

0

36

DanieI

@Cervera

3 months ago

@MatthewBerman ThePrimeagen shared his thoughts on YouTube on just this phenomenon recently. Just pulled yet another all-nighter myself, racing to get the next build done.

1

2

0

153

DanieI

@Cervera

3 months ago

Prognosis.

Mario Dian

@mariodian

3 months ago

@_kaitodev @karpathy He just removed the repo for minutes before I added data filtering. Code: https://t.co/OUgacHuqdC Demo: https://t.co/jZ5KqB2s5C

9

159

18

237

31K

0

1

0

37

DanieI

@Cervera

3 months ago

Oof.

Sayan

@sayan_wtf

3 months ago

At this rate everyone’s gonna have their own app and zero users.

553

11K

647

526

724K

0

22

DanieI

@Cervera

3 months ago

@Tobby_scraper Every day.

0

46

DanieI

@Cervera

3 months ago

@DavidOndrej1 You were the first I encountered anywhere who warned people "agents are the future."

0

10

DanieI

@Cervera

3 months ago

I believe openclaw provided the right formula: a customizable experience, with persistence and ownership over local deliverables, reframing the agent as a collaborative personal advocate, doer and coach, not a tool. In a word, it is the re-embodiment of Personal Computing, restored and directionally opposite to the cloud centralization that often takes agency away from consumers. This actually gives some back.

0

75

DanieI

@Cervera

3 months ago

All one can do is reduce scope of exposure. https://t.co/yHeUNcXG1c

0

25

DanieI

@Cervera

3 months ago

@MatthewBerman Ever consider two Claude Max accts on separate email addresses? May sound ridiculous, but I wonder how the math shakes out vs relying on API spend.

0

33
