Samih @Samih_Kln - Twitter Profile

Samih_Kln retweeted

Ray🫧

@ravikiran_dev7

about 23 hours ago

POV: using Claude Fable 5 to rename variables

97

7K

407

704

357K

Samih @Samih_Kln

1 day ago

@matt__makes Going deeeep! Nice

1

0

23

Samih_Kln retweeted

Seva Ustinov

@sevaustinov

8 days ago

236

32K

2K

1M

Samih_Kln retweeted

JT

@jiratickets

22 days ago

karpathy pulling up to the office for his first day on the research team

67

15K

721

948

883K

Samih_Kln retweeted

22 days ago

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

8K

150K

11K

14K

28M

Samih_Kln retweeted

George Mack

@george__mack

25 days ago

The high agency triangle

90

25K

3K

10K

1M

Samih @Samih_Kln

26 days ago

@matt__makes Nice!

1

0

29

Samih_Kln retweeted

Nick Chapsas

@nickchapsas

about 1 month ago

claude --model claude-opus-4-6, thank me later

32

211

6

98

38K

Samih_Kln retweeted

ClaudeDevs

@ClaudeDevs

about 2 months ago

Over the past month, some of you reported Claude Code's quality had slipped. We investigated, and published a post-mortem on the three issues we found. All are fixed in v2.1.116+ and we’ve reset usage limits for all subscribers.

2K

40K

3K

6K

6M

Samih @Samih_Kln

about 2 months ago

Didn't want to cancel but losing messages is just not acceptable. @claudeai

0

17

Samih @Samih_Kln

about 2 months ago

@claudeai Is chat compaction on the Desktop app just permanently deleting messages? 🥴 I lost a lot of context I'd built up in a conversation. Is this expected behavior? It's a significant issue for my usage... Who's the right person to report this to? cc: @bcherny @trq212

1

4

1

0

102

Samih_Kln retweeted

Anish Moonka

@anishmoonka

3 months ago

GPT-5.4 loses 54% of its retrieval accuracy going from 256K to 1M tokens. Opus 4.6 loses 15%.
Every major AI lab now claims a 1 million token context window. GPT-5.4 launched eight days ago with 1M. Gemini 3.1 Pro has had it. But the number on the spec sheet and the number that actually works are two very different things.
This chart uses MRCR v2, OpenAI’s own benchmark. It hides 8 identical pieces of information across a massive conversation and asks the model to find a specific one. Basically a stress test for “can you actually find what you need in 750,000 words of text.”
At 256K tokens, the models are close enough. Opus 4.6 scores 91.9%, Sonnet 4.6 hits 90.6%, GPT-5.4 sits at 79.3% (averaged across 128K to 256K, per the chart footnote).
Scale to 1M and the curves blow apart. GPT-5.4 drops to 36.6%, finding the right answer about one in three times. Gemini 3.1 Pro falls to 25.9%. Opus 4.6 holds at 78.3%.
Researchers call this “context rot.” Chroma tested 18 frontier models in 2025 and found every single one got worse as input length increased. Most models decay exponentially. Opus barely bends.
Then there’s the pricing. Today’s announcement removes the long-context premium entirely. A 900K-token Opus 4.6 request now costs the same per-token rate as a 9K request, $5/$25 per million tokens. GPT-5.4 still charges 2x input and 1.5x output for anything over 272K tokens. So you pay more for a model that retrieves correctly about a third of the time at full context.
For anyone building agents that run for hours, processing legal docs across hundreds of pages, or loading entire codebases into one session, the only number that matters is whether the model can actually find what you put in. At 1M tokens, that gap between these models just got very wide.

76

2K

184

719

351K
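
The headline percentages in the tweet above appear to be relative drops, not percentage-point differences. A minimal Python sketch, using only the scores and the $5 per million input-token rate quoted in the tweet, reproduces them; the names `relative_drop` and `input_cost_usd` are illustrative and not part of any benchmark or API.

```python
# Scores quoted in the tweet, in percent: (at 256K tokens, at 1M tokens).
scores = {
    "Opus 4.6": (91.9, 78.3),
    "GPT-5.4": (79.3, 36.6),
}


def relative_drop(before: float, after: float) -> float:
    """Share of the starting accuracy that is lost, in percent."""
    return (before - after) / before * 100


def input_cost_usd(tokens: int, usd_per_million: float = 5.0) -> float:
    """Input cost at a flat per-token rate (the $5/M figure is from the tweet)."""
    return tokens / 1_000_000 * usd_per_million


drops = {name: round(relative_drop(a, b), 1) for name, (a, b) in scores.items()}

for name, drop in drops.items():
    print(f"{name}: loses {drop}% of its 256K accuracy at 1M")

# With no long-context premium, cost scales linearly with input size.
print(f"900K-token input: ${input_cost_usd(900_000):.2f}")
print(f"9K-token input:   ${input_cost_usd(9_000):.3f}")
```

The rounded results (about 53.8% and 14.8%) line up with the "54%" and "15%" in the tweet's opening line.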

Samih_Kln retweeted

Denys Khomyn

@denys_khomyn

3 months ago

@vonderleyen World War 3 happening live Europeans: “The war starts on Monday”

63

20K

750

481

620K

Samih_Kln retweeted

Rudrank @WWDC26

@rudrank

4 months ago

Using https://t.co/RFbQy0sop3 and https://t.co/x2HNeZd6YH in a project

32

3K

161

155

147K

Samih @Samih_Kln

4 months ago

@vasuman Underrated tweet

0

4

Samih_Kln retweeted

Jarred Sumner

@jarredsumner

4 months ago

@theo just need better integration tests. too many unit tests using mocks

34

634

6

53

45K

Samih @Samih_Kln

4 months ago

Agentic Engineering it is

Andrej Karpathy

@karpathy

4 months ago

A lot of people quote tweeted this as 1 year anniversary of vibe coding. Some retrospective -
I've had a Twitter account for 17 years now (omg) and I still can't predict my tweet engagement basically at all. This was a shower of thoughts throwaway tweet that I just fired off without thinking but somehow it minted a fitting name at the right moment for something that a lot of people were feeling at the same time, so here we are: vibe coding is now mentioned on my Wikipedia as a major memetic "contribution" and even its article is longer. lol
The one thing I'd add is that at the time, LLM capability was low enough that you'd mostly use vibe coding for fun throwaway projects, demos and explorations. It was good fun and it almost worked. Today (1 year later), programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny. The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software. Many people have tried to come up with a better name for this to differentiate it from vibe coding, personally my current favorite "agentic engineering":
- "agentic" because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents who do and acting as oversight.
- "engineering" to emphasize that there is an art & science and expertise to it. It's something you can learn and become better at, with its own depth of a different kind.
In 2026, we're likely to see continued improvements on both the model layer and the new agent layer. I feel excited about the product of the two and another year of progress.

644

9K

817

3K

1M

0

1

0

45