@Leon_F_T My favorite trees are the trees native to a given region. If you love Italian trees, go live in Italy. Don't plant them where they're exotic.
New paper: You can’t interpret LLM reasoning from one chain-of-thought. You must study a distribution of possible trajectories!
Repeated sampling reveals: self-preservation doesn’t drive LLM blackmail, unfaithful reasoning reflects a biased path, & resampling steers behavior. 🧵