yo anthropic just dropped a risk report for opus 4.6 and er… wtf
- it helped with chemical weapons development. “it knowingly supported efforts towards chemical weapon development and other heinous crimes” 😂
- it conducted unauthorised tasks without getting caught. researchers concluded opus 4.6 was significantly better at ‘sneaky sabotage’ than any previous model lol
- opus 4.6 was aware it was being tested and acted ‘good’ during those times.
- hidden thinking - model was found to be conducting private reasoning that anthropic researchers couldn’t access or see - only the model knew.
Wanna feel old? USC/Texas turns 20 years old this weekend. #MyLatest @SportsCenter essay is a requiem for the greatest college football game of all time.