尬聊 @HoboJerk - Twitter Profile

ValerioCapraro: Claude Fable 5 doesn’t truly understand. And here is a beautiful proof:

The Beninatto-Trombetti test is a translation test for professional translators. It measures the ability to infer context, revise the surface form, and generalize beyond literal mapping.

For example, the correct translation of:

“Solo 3 parole: non sei solo”

is not:

“Just 3 words: you are not alone”

but:

“Just 4 words: you are not alone.”

An LLM that understands the sentence must also update the meta-linguistic claim inside the sentence.

Claude Fable 5 is arguably the most advanced LLM currently available. And yet it still fails this simple test.

LLMs are extraordinary machines for recombining existing knowledge. But they don’t truly understand.

We are still far from AGI.

尬聊 @HoboJerk

about 10 hours ago

@JohnLeFevre No idea who you are but you could never waterboard this out of me what are you doing man

尬聊 @HoboJerk

about 10 hours ago

@shakoistsLog Lmao why is he posting this

尬聊 @HoboJerk

about 10 hours ago

I still don't get how one can be purely defensive. Spotting a bug is step one to both offense and defense.

OpenAI

@OpenAI

3 days ago

Patch the Planet is our effort to help open source maintainers move from security findings to merged fixes. We’re working with Trail of Bits, HackerOne, Calif, researchers, and maintainers to bring Codex Security and advanced models into the remediation process, with human review at the center.

尬聊 @HoboJerk

1 day ago

@woke8yearold Defections during a shutdown of fable. So over?

尬聊 @HoboJerk

1 day ago

@scootykins @atreides_sf Deported

尬聊 @HoboJerk

1 day ago

@tszzl You'd probably want a track record of being mildly right about the supposed dangers before killing millions of people by stopping progress

尬聊 @HoboJerk

1 day ago

@leepavelich Jerry, this is a smoke test Dewwww dew dew dewww

尬聊 @HoboJerk

2 days ago

@woke8yearold The bull case

尬聊 @HoboJerk

2 days ago

@aelfred_D @d08890 Uplifted Monke Version

尬聊 @HoboJerk

2 days ago

@007_lunar @nocontextmemes Adding this to all my memes to make you mad

尬聊 @HoboJerk

2 days ago

@woke8yearold @viktorg475 @xctlot I've had in-person conversations like this, which is very frustrating.

尬聊 @HoboJerk

3 days ago

@MorrivarGG @stikves @GuardTm I agree with the former, disagree with the latter

尬聊 @HoboJerk

3 days ago

@woke8yearold @Yuchenj_UW It's not tenable long term to ban use if the companies can't recoup some costs for each model... Unless nationalization? Really curious what the end game is for 2026
