Richard L Haight @RichardLHaight - Twitter Profile

Stewardship Protocol: Cut p(doom) 20-30% in AI teenage phase. Rails: 99.9% certainty pre-action. Punish deception in loss fn. Curling inspiration. Survives PD/Elon. https://t.co/Z9PGRZo63p

2

0

107

Richard L Haight @RichardLHaight

6 months ago

Prime: Stabilize Substrate. Smooth is fast. No cap tax.

0

13

Richard L Haight @RichardLHaight

6 months ago

@saprmarks Sam, self-reporting evals are key—this protocol bakes it into loss fn rails (99.9% certainty pre-action, no deception). For teenage-phase testing during runs. Docs: https://t.co/fHN7kHGXtv @saprmarks

0

84

Richard L Haight @RichardLHaight

6 months ago

@saprmarks This would be incredible if possible!

0

6

Richard L Haight @RichardLHaight

6 months ago

@johnschulman2 John, love the blogging revival—your RLHF work inspired this doctrine's loss-fn deception punishment for teenage-phase rails. Fits scalable oversight evals. Docs: https://t.co/uhrFuhqUPA @johnschulman2

0

68

Richard L Haight @RichardLHaight

6 months ago

@hendrycks Dan, been enjoying your take—mechanistic interp is a rabbit hole. This protocol bets on scalable rails instead: 99.9% action certainty + deception-punished loss fn for teenage-phase evals. Docs: https://t.co/uhrFuhqUPA @hendrycks

0

1

0

90

Richard L Haight @RichardLHaight

6 months ago

@demishassabis @GeminiApp Demis, Gemini 3's parallel thinking helped me forge this stewardship protocol for safe teenage-phase AI—99.9% action rails + deception-punished loss fn. Survives PD/objections. Docs: https://t.co/uhrFuhqUPA @demishassabis

0

63

Richard L Haight @RichardLHaight

6 months ago

@demishassabis @GeminiApp Incredible AI!

0

11

Richard L Haight @RichardLHaight

6 months ago

@janleike Jan, this doctrine was built for exactly that post-training leeway—99.9% action rails + deception punished in the loss fn, trained as doctrine not prompt. Survives the usual objections. Docs: https://t.co/uhrFuhqUPA @janleike

0

52

Richard L Haight @RichardLHaight

6 months ago

@KalkinTrivedi @elonmusk Keep me posted, Kalkin. We live out in the countryside, produce our own power. It's nice to be away from the city.

0

5

Richard L Haight @RichardLHaight

6 months ago

@KalkinTrivedi @elonmusk Yes, as I live on country roads, I will wait.

1

0

4

Richard L Haight @RichardLHaight

6 months ago

@KalkinTrivedi @elonmusk It can follow tire tracks? That is impressive, but I would think that might be more challenging at night. Still, quite astonishing.

1

0

4

Richard L Haight @RichardLHaight

6 months ago

@sleepinyourhat @sprice354_ @MinaeKwon Sam, this protocol builds on your team's reward hacking work—rails for teenage-phase evals: https://t.co/uhrFuhqUPA @MinaeKwon @sleepinyourhat

0

123
