Eric MacIver @ericmaciver - Twitter Profile

Pinned Tweet

10 days ago

24 hours of Fable 5 auditing the app I've spent 4 months building with Opus 4.5 and 4.6 - 553K LOC, ~4,500 tests. I spec every feature. I run adversarial reviews and blind agent reviews on every implementation round. It still found 10 P0s. The worst: my AI code reviews were approving code they couldn't actually read. And 30 test failures were invisible because the gate command never ran them. Code generation is solved. Verification isn't. I'm building for exactly that - verification that doesn't depend on trusting any single model.

1

0

71

Eric MacIver

@EricMaciver

3 days ago

@naman_jain28 @arvidkahl 100% - I run both now - failing test first, then a second set of eyes on the diff that didn't write the code. Different failure modes.

0

11

Eric MacIver

@EricMaciver

5 days ago

@arvidkahl Congrats! I haven't been active for very long, but enjoying the posts. Ramble on!

0

1

0

10

Eric MacIver

@EricMaciver

7 days ago

After a few days of Fable access, I adjusted my workflows to maximize its capability. Now that I've had to revert to Opus while seeking the same quality results, my usage is higher than it was with the workflows I developed with Fable orchestrating. Anyone else finding the same?

0

18

Eric MacIver

@EricMaciver

8 days ago

Hey Fable, why does my cat lick its toes? Model not available? Hmmm. Ok. /ultracode why does my cat lick its toes? Yes, I need a /codex:adversarial on that.

0

25

Eric MacIver

@EricMaciver

8 days ago

@arvidkahl Ha! Even better.

0

15

Eric MacIver

@EricMaciver

8 days ago

Inverse Aesop moral as well: a watchdog that never barks. I audited 4 months of my AI-written code with Fable this week and the biggest issue wasn't the bugs - it was that my multi-model AI reviews had been approving code they couldn't parse. 30 test failures were invisible because the gate command never ran them - green checkmarks the whole way. The loop compounds only if the review step actually reviews.

0

78
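A minimal sketch of the kind of gate check that would surface the "never ran" case described in the post above. It assumes the gate can be handed two things: the set of test IDs that exist, and the results of the tests it actually executed. The names `GateResult` and `check_gate` and the test IDs are hypothetical, not taken from the author's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    reason: str


def check_gate(discovered: set[str], executed: dict[str, bool]) -> GateResult:
    """Fail the gate if any discovered test never ran, or if any test failed.

    discovered: IDs of every test that exists (hypothetical input).
    executed:   test ID -> True if it passed, for tests the gate actually ran.
    """
    # A test that was never executed cannot fail, so it must be checked separately.
    never_ran = sorted(discovered - set(executed))
    if never_ran:
        return GateResult(False, f"never ran ({len(never_ran)}): {', '.join(never_ran)}")

    failed = sorted(test_id for test_id, ok in executed.items() if not ok)
    if failed:
        return GateResult(False, f"failed ({len(failed)}): {', '.join(failed)}")

    return GateResult(True, "all discovered tests ran and passed")


# Illustrative data only: test_c exists but the gate command skipped it.
result = check_gate(
    discovered={"test_a", "test_b", "test_c"},
    executed={"test_a": True, "test_b": True},
)
print(result.passed, result.reason)
```

The point of the design is that "no failures reported" and "every test ran" are checked as two separate conditions, so a green result cannot come from tests that were silently skipped.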

Eric MacIver

@EricMaciver

8 days ago

@clairevo The Fable design critique matched my experience too - until I started front-loading the design direction: described the aesthetic in a DESIGN.md first, then had it build an HTML-only prototype from that before any real code. Made a night-and-day difference for me.

0

12

Eric MacIver

@EricMaciver

8 days ago

@petergyang Ha! Well captured. For the Haiku card - maybe a glass-cannon speedster, low intelligence and mana - and a local LLM card as the free-to-play character you grind 40 hours to do what the party does in 5.

0

38

Eric MacIver

@EricMaciver

8 days ago

A second agent helps, but I've found that a second agent from a different model family is the real unlock. I run Claude + Codex against the same diff and each consistently catches real bugs the other misses - same-family reviewers share the same priors, so they share the same blind spots.

0

5
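One way to make the "each catches bugs the other misses" comparison concrete is to partition the findings from two reviewers by location. This is a sketch under assumptions: the `Finding` shape, the `compare_reviews` helper, and the sample findings are all hypothetical, and real review output would need to be parsed into this form first.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    path: str
    line: int
    summary: str


def compare_reviews(
    a: list[Finding], b: list[Finding]
) -> dict[str, list[tuple[str, int]]]:
    """Partition findings into those both reviewers raised and those only one raised."""
    # Match on location only: two models rarely word the same issue identically.
    loc_a = {(f.path, f.line) for f in a}
    loc_b = {(f.path, f.line) for f in b}
    return {
        "both": sorted(loc_a & loc_b),
        "only_a": sorted(loc_a - loc_b),
        "only_b": sorted(loc_b - loc_a),
    }


# Made-up findings for illustration; not real review output.
reviewer_a = [
    Finding("api/auth.py", 42, "token expiry not checked"),
    Finding("db/query.py", 10, "unparameterized SQL"),
]
reviewer_b = [
    Finding("db/query.py", 10, "SQL built by string concatenation"),
    Finding("jobs/retry.py", 7, "retry loop has no upper bound"),
]
report = compare_reviews(reviewer_a, reviewer_b)
print(report)
```

The `only_a` and `only_b` buckets are the ones of interest here: they hold the issues that a single reviewer would have let through.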

Eric MacIver

@EricMaciver

9 days ago

@danshipper @every Agreed - possibly the best code READING model. I pointed it at 4 months of Opus-written code - 553K LOC, ~4,500 green tests - and it found 10 P0s. It's pricey, so I'm using Fable for architectural audits and orchestration. It will be tough to switch back on June 22.

0

26

Eric MacIver

@EricMaciver

9 days ago

Great write-up. I ran the inverse experiment this week: had Fable audit 4 months of code I built with Opus 4.5/4.6 - 553K LOC, ~4,500 tests. It found 10 P0s, including evidence of my prior AI reviewers approving code they couldn't actually parse. Same experience on speed and cost - slow, and it tore through my usage limits. The firepower was substantial though. TBD at sustained prices.

0

34

Eric MacIver

@EricMaciver

10 days ago

Private beta waitlist: https://t.co/U0OAWv3Urq

0

29
