Chi Nguyen

@NguyenSquared

Joined April 2020

71 Following

23 Followers

5 Posts

NguyenSquared retweeted

Caspar Oesterheld @C_Oesterheld

over 1 year ago

How do LLMs reason about playing games against copies of themselves? 🪞We made the first LLM decision theory benchmark to find out. 🧵1/10

https://t.co/pPdZ3VyuLi

102

11K

NguyenSquared retweeted

METR @METR_Evals

over 1 year ago

How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.

https://t.co/woREKEWn5S

829

171

513

445K

Chi Nguyen @NguyenSquared

over 2 years ago

@FinishItPod It happened! I was late but you were early. Thank you @FinishItPod!

Chi Nguyen @NguyenSquared

over 3 years ago

@FinishItPod After 6 weeks of listening, I've just finished episode 106 of (yes, I've listened to the first 9 episodes in one day) Sure hope you still do complis and concris because I want mine in 4 months... Also, the abominable snowman 1 isn't on Spotify

Chi Nguyen @NguyenSquared

almost 6 years ago

@mlipsitch China: second. attack rate child vs. adult (~12% both) https://t.co/HNBzULyMik Ch: Comp. infection rates w contact patterns (child. less susceptible, riskier behavior) https://t.co/1r8zZ7dr1d Ger: Indir. evidence infectiousness: viral load (~same all ages) https://t.co/hmM7gelrN7
