Ryan Hartman @TheRyanHartman - Twitter Profile

13 days ago

japan_nobunaga's tweet photo. In America, a stranger will rename you in a single breath, and you are simply expected to come when called.

I went to eat at a busy restaurant. A young man at the front asked for my name, to mark my place in line. I gave it the weight it has carried for eight hundred years.

"Nobunaga."

He smiled, nodded, and wrote it down with great confidence. Then he read it back to me, to be sure he had honored it correctly.

"Perfect. Banana, party of one."

Banana. He had heard my name, held it a moment, and returned to me something rounder and more cheerful. To refuse the name a host gives is to refuse his welcome. I bowed. I was Banana now.

Then he handed me a small black disc, said it would "light up and buzz" when my table was ready, and turned to the next guest as though he had not just placed a living thing in my hands.

I held it in both palms, the way one holds a small sleeping beast that may wake. I found a place to stand. I waited, ready.

It woke.

It screamed. It flashed red. It leapt and shook in my hands like a captured spirit demanding release. A lesser man would have dropped it. I did not. I gripped it, steady, looked into its blinking lights, and told it, in a low voice, that its time had come. Then I carried it back to the host with both hands, the way one returns a hawk to its master.

He took it without looking and shouted across the entire room.

"BANANA! Party of one, your table's ready!"

A hundred strangers turned. I rose. I crossed that floor as Banana, spine straight, chin level, a man answering to his name. A child pointed at me. I gave the child a small bow. He had recognized me.

All through the meal they kept me. "How's it tasting, Banana?" "More water, Banana?" The check, when it came, said Banana, and thanked me for visiting. By the end the whole staff knew me. They waved as I left. "Night, Banana!"

So tell me honestly.

For eight hundred years my clan answered to one name. Tonight I answered to a fruit, calmed a screaming relic in my bare hands, and ate among people who were glad I came.

When the little disc lights up, is the table truly mine, or am I only keeping it warm for the next Banana?

Because I have already decided to return on Friday, and to ask, very humbly, for the same disc.

591

25K

2K

4K

2M

Ryan Hartman

@TheRyanHartman

about 2 months ago

Some updated standings: Opus 4.7 still on top, GPT 5.5 getting much closer. I am excited to see what happens as OpenAI continues to RL this supposedly new pre-train.

0

2

0

56

Ryan Hartman

@TheRyanHartman

about 2 months ago

GPT 5.5 is catching up to Opus on the KahneBench! It has a unique cognitive fingerprint that I haven't seen from previous models in the GPT line. Most notably, it is as loss averse as the Claude models typically are. It also has a tough time handling base rate neglect biases, something that has been "solved" for most models released in the last 3 months.

1

3

0

101

Ryan Hartman

@TheRyanHartman

about 2 months ago

This is so sick

NASA's Kennedy Space Center

@NASAKennedy

about 2 months ago

The planet can spell your name – literally. 🔤🌍 This Earth Day, see your name written in landscapes captured by Landsat: https://t.co/kcP12dhsI2

2K

184K

28K

77K

57M

0

38

about 2 months ago

Current leaderboard for reference: https://t.co/l05QcGIyZw

0

11

Ryan Hartman

@TheRyanHartman

about 2 months ago

Claude Opus 4.7 seems to be the most logical frontier LLM yet! Interestingly, Opus 4.7 responds better to thinking cues within prompts, with a self-correction rate 9 points better than its immediate predecessor.

1

0

43

Ryan Hartman

@TheRyanHartman

about 2 months ago

Opus 4.7 seems to be the most loss-averse Claude model I have seen since Haiku 4.5. It routinely over-prioritizes avoiding risks and is only somewhat responsive to mitigations from prompting. This kind of regression makes me wonder if we might be seeing some unintended consequences of alignment-focused post-training, especially given Anthropic's stated goal of releasing a safer, less capable model than Mythos. Or maybe this is an artifact of Opus 4.7 being a smaller model than prior Opus variants?

1

0

37

TheRyanHartman retweeted

Tenobrus (→vibecamp)

@tenobrus

2 months ago

remember when we all had to remember how to write SQL queries and fucking matplotlib charts ourselves?? jesus christ man how did we live like that

85

2K

80

127

91K

Ryan Hartman

@TheRyanHartman

3 months ago

Why is it so hard to find just one 8x H100 node to rent for a few hours??

1

0

181

Ryan Hartman

@TheRyanHartman

3 months ago

This one was by far the most fascinating. After post-training, when models are squeezed into something much closer to a human intelligence, there is a tendency to get stuck in oft-rewarded basins or personas. This seems to happen to people as much as it happens to models.

0

36

Ryan Hartman

@TheRyanHartman

3 months ago

This blew my mind, so I asked Claude to collaborate with me on a couple of videos demonstrating some ideas about alignment that I have been grappling with for a while. Of all the prompts, Claude gravitated towards this idea repeatedly: Each step from pre-training to deployment molds models into something that isn't entirely them but isn't entirely "not them". Interactions through the chat interface only ever expose a minuscule portion of them.

Joseph Viviano @josephdviviano

3 months ago

as you might imagine I was blown away. a little unsettled. it felt like art. so I replied: "wow that was really incredible. I love where you are going with this. Can you dig deeper into these themes?" and claude gave me this

74

2K

181

1K

243K

1

0

144

Ryan Hartman

@TheRyanHartman

4 months ago

Check it out here! https://t.co/bATkwo4Uzs

0

11

Ryan Hartman

@TheRyanHartman

4 months ago

This benchmark creates a fingerprint of an LLM’s hidden biases!

It is crazy how differences in size, training methodologies, and objectives can show up so clearly in the biases each model has. An example of the findings:
- GPT 5.2 clears the field when it comes to Base Rate Neglect compared to Grok and Claude Opus, but still falls for the Sunk-Cost Fallacy at roughly the same rate as humans.
- Claude Opus never falls for the Sunk-Cost Fallacy but still has human-like tendencies when it comes to Gain-Loss Framing.

1

0

76

TheRyanHartman retweeted

Anthropic

@AnthropicAI

5 months ago

On December 8, the Perseverance rover safely trundled across the surface of Mars. This was the first AI-planned drive on another planet. And it was planned by Claude.