Perry Rajnovic @psylum - Twitter Profile

22 days ago

@garybernhardt i’ve always liked the wordpress motto of “code is poetry.” felt apt to those of us trying to do it the right way.

0

72

Perry Rajnovic @psylum

22 days ago

@DoorDash ya know, for all the money you make, you should really make a dark mode friendly loading screen. even nonprofits pull that off.

1

0

4

psylum retweeted

Patrick Ruffini

@PatrickRuffini

about 1 month ago

UPDATE: If high school were parliament, using @davidshor polling w/coalitions by Claude

2

23

4

10

14K

psylum retweeted

Pittsburgh Post-Gazette @PittsburghPG

about 1 month ago

Widespread use of public transit during the draft the 'biggest surprise' for event planners https://t.co/9gbDIs1UQR

6

37

6

2

15K

psylum retweeted

no context memes

@nocontextmemes

about 1 month ago

100

21K

2K

685K

psylum retweeted

Pittsburgh Penguins

@penguins

about 2 months ago

🤫 @Max25talbot

62

3K

259

59

120K

psylum retweeted

Aakash Gupta

@aakashgupta

3 months ago

41% of all code shipped in 2025 was AI-generated or AI-assisted. The defect rate on that code is 1.7x higher than human-written code. And a randomized controlled trial found that experienced developers using AI tools were actually 19% slower than developers working without them.

Devs have always written slop. The entire software industry is built on infrastructure designed to catch slop before it ships. Code review, linting, type checking, CI/CD pipelines, staging environments. All of it assumes one thing: the person who wrote the code can walk you through what it does when the reviewer asks.

That assumption held for 50 years. It broke in about 18 months.

When 41% of your codebase was generated by a machine and approved by a human who skimmed it because the tests passed, the review process becomes theater. The reviewer is checking code neither of them wrote. The linter catches syntax, not intent. The tests verify behavior, not understanding.

The old slop had an owner. Someone could explain why temp_fix_v3_FINAL existed, what edge case it handled, and what would break if you removed it. The new slop has an approver. Different relationship entirely.

Arvid’s right that devs wrote bad code before AI. The part he’s missing: the entire quality infrastructure of software engineering was designed around a world where the author and the debugger were the same person. That world ended last year and nothing has replaced it yet.

181

2K

239

692

557K

psylum retweeted

Alex Prompter

@alex_prompter

3 months ago

🚨BREAKING: Alibaba tested AI coding agents on 100 real codebases, spanning 233 days each.
the agents failed spectacularly.

turns out passing tests once is easy. maintaining code for 8 months without breaking everything is where AI collapses.

SWE-CI is the first benchmark that measures long-term code maintenance instead of one-shot bug fixes.
each task tracks 71 consecutive commits of real evolution.

75% of AI models break previously working code during maintenance.
only Claude Opus 4 stays above 50% zero-regression rate. every other model accumulates technical debt that compounds over iterations.

here's the brutal part:

- HumanEval and SWE-bench measure "does it work right now"
- SWE-CI measures "does it still work after 6 months of changes"

agents optimized for snapshot testing write brittle code that passes tests today but becomes unmaintainable tomorrow.

Alibaba built EvoScore to weight later iterations heavier than early ones. agents that sacrifice code quality for quick wins get punished when consequences compound.

the AI coding narrative just got more honest: most models can write code. almost none can maintain it.

181

3K

529

2K

710K

Perry Rajnovic @psylum

4 months ago

@liz my MbP is an Intel '19, i've definitely noticed lots more lag in unlock and icon re-caching with macOS 26. but i do appreciate the extra level of greyscale i can have in the OS chrome. but i'll probably pull the trigger on a new one this year anyways.

0

35

psylum retweeted

Pittsburgh Penguins

@penguins

4 months ago

The ones that get it, get it 🤝

18

12K

927

307

258K

psylum retweeted

Evan DeSimone @Smorgasboredom

5 months ago

At least once a week i think of that tweet about how millennials are the one generation born after they took lead out of paint and before smartphones.

163

43K

4K

3K

1M

psylum retweeted

Vintage Maps

@vintagemapstore

5 months ago

Genders of France’s rivers in French

67

10K

614

993

2M

psylum retweeted

Scott Nover @ScottNover

5 months ago

The Pittsburgh Post-Gazette is shutting down.

11

159

113

23

234K

psylum retweeted

Preston Thorpe @PThorpe92

5 months ago

Real nerds write _everything_ in markdown by default, regardless of whether it will be rendered, knowing that the target audience can be trusted to mentally render it while reading.

61

4K

121

220

101K

Perry Rajnovic @psylum

6 months ago

@Citymapper your listing of mta issues is excluding planned skipping of stops, which is sort of important (northbound R in BK)

0

14

Perry Rajnovic @psylum

7 months ago

@SwarmApp latest update definitely addressed this, thanks!

0

1

0

19

Perry Rajnovic @psylum

7 months ago

@SwarmApp latest update is running extra slow on the iphone 13 mini. still fully supported, hope this isn’t a neglect of older devices coming to pass.

1

0

72

Perry Rajnovic @psylum

7 months ago

@SwarmApp also, not perf related, but i went to a sporting event recently and the “you’ve seen Team A x times and Team B y times” had the x/y values swapped.

0

1

0

20

Perry Rajnovic @psylum

7 months ago

@SwarmApp there does also seem to be a lag before i see that i have notifications badged after starting the app.

0

1

0

20

Perry Rajnovic @psylum

9 months ago

@AndrewSolender 4 or 1 are my choices. if you're taking 5 with nobody in 4, you're a jerk blocking a seat, and if you're taking it with someone already there, most people lean away and give you space anyhow.

0

4
