Dear Frontier Lab,
Please saturate all the following benchmarks by the end of the year:
- ARC-AGI-3
- Humanity's Last Exam
- ENIGMAEVAL
- Remote Labor Index
- SimpleBench
- VideoGameBench
- OSWorld
- FrontierMath
- RE-Bench
Thank you!
@peterwildeford I think people will still believe the unhinged false flag because they already believe in way dumber propaganda like the AI water usage panic
@deepestbrew @peterwildeford The difference is you can scale the deployment of mythos, but human labor is inherently scarce. There’s a reason they didn’t find these vulnerabilities before mythos came around
@chenna1985 @spicey_lemonade “untested unreliable products”
It has solved multiple Erdős problems, that seems pretty tested to me
“big unethical corporation”
OpenAI is not unethical, and you aren’t more ethical than they are. I think you automatically associate “big” and “corporation” with evil