Matt Maher tested frontier models in Cursor vs. other harnesses. Cursor boosted model performance by 11% on average:
Gemini: 52% → 57%
GPT-5.4: 82% → 88%
Opus: 77% → 93%
His benchmark measures how well models implement a 100-feature PRD. @cursor_ai consistently outperformed.
♦️j*wish chick tried to press Myron about the H*Iocaust and he hit her with the UNO reverse card!😭
“Do you think that lsraeI did a genocide?”
“That’s a complicated question..”