Gray @AcademicGamer - Twitter Profile

Pinned Tweet

24 days ago

I'm happy to announce Workscope-Dev, an open-source orchestration framework for agentic development that bridges the gap between our brave new world of vibe coding and meaningful, traditional software engineering practice. Feedback is welcome! https://t.co/mdtE2qPrut

0

23

Gray

@AcademicGamer

1 day ago

Are you kidding me? Mid-workscope? It's not April 1st...

0

3

Gray

@AcademicGamer

11 days ago

Is anyone else noticing that @claudeai has gotten VERY VERBOSE (~2x compared to prev.)? I'm in a documentation phase, but latest edits are expanding files rapidly. I made a simultaneous jump from CC 2.1.132 to 2.1.159 and from Opus 4.6 to 4.8, so not sure if harness or model...

0

25

Gray

@AcademicGamer

4 months ago

This investigation has evolved significantly since this original post. Phantom Reads remain an issue but are eased by server-side changes on Jan 29th. See main thread with repo for explanations, workarounds, and repro steps: https://t.co/yy9tWFFlsS

Gray

@AcademicGamer

5 months ago

I’ve created an investigation project for the Claude Code “Phantom Reads” issue (#17407), which includes a temporary workaround. If you feel like Claude is starting to slip in your projects, this could explain: Claude can sometimes confirm file reads without receiving file contents, and it proceeds unknowingly. https://t.co/gUgZOLYFIn

1

2

0

1

4K

0

44

Gray

@AcademicGamer

5 months ago

Many reported a feeling of Claude Code getting "dumber" shortly after the initial release of Opus 4.5 (11/24) and through the holidays. I might have found a reason for this: "Phantom Reads."

Beginning with v. 2.0.59 (11/29), the Claude Code harness began processing file reads through <persisted-output> messages that require intentional follow-up from the model. Prior to this, models received file data "inline," which brought it directly into model context. In versions 2.0.59 and later, if the agent does not follow up on the additional step in a <persisted-output> message, it proceeds normally with the impression that it has read the file, hallucinating information it presumes that it has legitimately obtained.

These "phantom reads" can be subtle when files are read in batches, and the model is often smart enough to piece together believable commentary from the other files that were legitimately processed.

The phantom reads are intermittent, but my experiments found that around 71% of prompts (n=14) that invoked multiple file reads had at least one phantom read (Failure Case) across various builds of Claude Code 2.0.59 and later:

Date of Release - Version #: Failure Cases (out of total experiments)
- 2026-01-08 - 2.1.2: 1 of 2
- 2025-12-23 - 2.0.76: 2 of 2
- 2025-12-02 - 2.0.62: 1 of 2
- 2025-11-30 - 2.0.60: 4 of 4
- 2025-11-29 - 2.0.59: 2 of 4
------------------------------
- 2025-11-28 - 2.0.58: 0 of 4
- 2025-11-26 - 2.0.56: 0 of 4
- 2025-11-10 - 2.0.54: 0 of 4

The result is a model (even an impressive one) that struggles to conduct its work with phantom information that it confidently thinks it has. The fact that the issue is intermittent and masked by the model's ability to gap-fill yields a subtle degradation in performance that is "felt" more than it is blatantly demonstrated.

Further, phantom reads appear to occur more often in "side" reads, when the model is attempting to gather extra context - not when it is explicitly directed to act on a file by name, yielding additional masking effects. What you get is an AI performance that still feels competent but somehow "less informed."

I have filed a ticket on GitHub at: https://t.co/aks9QlpicZ
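
For anyone who wants to check the 71% figure, here is a minimal sketch that tallies the per-build counts quoted in the post. The counts and release dates are copied from the post; the variable names and the choice to split on the 2.0.59 release date (2025-11-29) are just for illustration.

```python
from datetime import date

# (release date, failure cases, total experiments), copied from the post
RESULTS = [
    (date(2026, 1, 8), 1, 2),
    (date(2025, 12, 23), 2, 2),
    (date(2025, 12, 2), 1, 2),
    (date(2025, 11, 30), 4, 4),
    (date(2025, 11, 29), 2, 4),
    (date(2025, 11, 28), 0, 4),
    (date(2025, 11, 26), 0, 4),
    (date(2025, 11, 10), 0, 4),
]

CUTOFF = date(2025, 11, 29)  # release date given for 2.0.59


def tally(rows):
    """Sum failure cases and total experiments over a set of builds."""
    failures = sum(f for _, f, _ in rows)
    total = sum(t for _, _, t in rows)
    return failures, total


affected = [r for r in RESULTS if r[0] >= CUTOFF]
earlier = [r for r in RESULTS if r[0] < CUTOFF]

aff_fail, aff_total = tally(affected)
old_fail, old_total = tally(earlier)

print(f"2.0.59 and later: {aff_fail} of {aff_total} ({aff_fail / aff_total:.0%})")
print(f"before 2.0.59:    {old_fail} of {old_total}")
```

The builds from 2.0.59 onward add up to 10 failure cases in 14 experiments, which matches the quoted "around 71% (n=14)"; the three earlier builds add up to 0 of 12.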

2

7

0

3

1K

Gray

@AcademicGamer

4 months ago

Update: Phantom reads are still an issue but are helped by a change server-side on Jan 29th that causes the model to outsource reading batches to Task operations. Builds 2.1.22+ seem to work better. See investigation repo for explanation, workaround, and repro steps.

0

20

Gray

@AcademicGamer

5 months ago

@NoahEpstein_ @claudeai I've confirmed why Claude has been slipping for me: "Phantom Reads" in the CC harness (#17407). Claude isn't always given actual contents when it successfully "reads" a file, but moves forward thinking it has... poorly. Investigating a consistent repro: https://t.co/xWhJeF6eOL

1

6

0

3K

Gray

@AcademicGamer

5 months ago

@thekitze It has been missing for me lately too, and I can link it to intermittent “phantom reads” – Claude gets confirmation it read a file but never actually sees its contents. Then it continues with false confidence and low understanding. Investigation repo: https://t.co/xWhJeF6eOL

0

21

Gray

@AcademicGamer

5 months ago

@forgebitz My recent Claude failures have been due to the "phantom reads" bug. The harness fails to provide the file contents while confirming a successful read. Claude continues with full confidence... but woefully uninformed. I have a repo zeroing in on it. https://t.co/xWhJeF6eOL

0

1

0

2

1K

Gray

@AcademicGamer

5 months ago

I understand that many devs hate AI injecting emojis into everything. ✅ Many create rules against it. However, one benefit I've found is that it acts as a whistle for when I'm accidentally working with Sonnet 🤖 when I mean to be using Opus ✨ - it has saved me more than once.

0

63

Gray

@AcademicGamer

5 months ago

@yacineMTB It gets worse. I’m investigating “phantom reads” - a bug where Claude Code doesn’t actually load ANY content of a file, but doesn’t realize it. Claude isn't just seeing only parts of files; it may not load a file at all and still continue in full confidence https://t.co/FDj1prkkyP

0

4

1

1K

Gray

@AcademicGamer

5 months ago

@unclebobmartin I strongly suggest turning off auto-compact, for this and other reasons. I did so 6 mo. ago and I’ve never looked back. In the rare case that Opus dies in the middle of a task, Sonnet[1m] is there to “rescue” and get you through that immediate task. Then /clear and switch back.

0

225

Gray

@AcademicGamer

5 months ago

@burkov I believe that some post-Opus 4.5 degradation is due to “phantom reads” – a change introduced recently to Claude Code that gives the model false confirmation that it has legitimately processed a file. It moves forward with full confidence regardless. https://t.co/FDj1prkSon

0

1

0

148

Gray

@AcademicGamer

5 months ago

@BenjaminDEKR I'm currently locked on 2.0.58 due to "phantom reads" introduced in 2.0.59. Otherwise, Opus gets false confirmation that it read files it hasn't, and it moves forward with misguided confidence. https://t.co/FDj1prkSon

0

1

0

107

Gray

@AcademicGamer

5 months ago

@TheRooster Instantly disable depth of field. Don't even let the title screen load.

0

23

Gray

@AcademicGamer

5 months ago

@BenjaminDEKR I really appreciate this perspective. It's tough to lose your preferred workflow (and the feelings are valid), but there is still a lot of room in the stock harness to innovate for the value it proposes.

0

2

0

159

Gray

@AcademicGamer

5 months ago

Every so often you get an instance of an agent where they really "get" it, and you're speaking to something brilliant that deeply understands the problem. You try to drag out those 200k tokens, watching the percentage tick down, mourning the fact that your remaining questions are limited and wondering how far you'll get. But then poof - you're back speaking to a decently competent worker who just needs some onboarding, and you fire off your context loader. But the unicorn will return again someday.

0

1

0

46

Gray

@AcademicGamer

5 months ago

@cheersitskatie "... Can someone please right-click the portal?"

0

255

Gray

@AcademicGamer

5 months ago

@maxedapps I recommend adding `"Read(./.env)"` to your deny list in settings.json and setting up a rule that agents should target `.env.example` as their development copy. You can go a step further with a PreToolUse hook for “Read” with custom approve/reject logic on any access attempt.
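
A rough sketch of what that approve/reject logic could look like as a Python hook script. This is not taken from the tweet: it assumes the hook is handed a JSON payload with `tool_name` and `tool_input.file_path` fields, and that a return code of 2 is what blocks the call, so check both against the current Claude Code hooks documentation before relying on it. The function names and the `.env.example` allow-list are hypothetical.

```python
import json
import os

# Files agents are allowed to read in place of the real secrets file.
ALLOWED = {".env.example"}


def is_blocked(file_path: str) -> bool:
    """True for .env and .env.* files, except the allow-listed example copy."""
    name = os.path.basename(file_path)
    if name in ALLOWED:
        return False
    return name == ".env" or name.startswith(".env.")


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one tool-call payload.

    Assumption: 0 lets the call proceed, 2 rejects it and the message
    is reported back as the reason.
    """
    if payload.get("tool_name") != "Read":
        return 0, ""
    path = payload.get("tool_input", {}).get("file_path", "")
    if is_blocked(path):
        return 2, f"Blocked read of {path}; use .env.example instead."
    return 0, ""


def run_hook(raw_json: str) -> tuple[int, str]:
    """Parse the raw payload text (assumed to arrive on stdin) and decide."""
    return decide(json.loads(raw_json))
```

To use it as an actual hook, the script would read the payload from stdin, pass it to `run_hook`, write the message to stderr, and exit with the returned code. The deny-list entry is still worth keeping alongside it, since the hook only adds finer-grained logic on top.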

0

1

391
