@timfduffy @EpochAIResearch Thanks for flagging this! That part's correct; it's based on the edge case in the "results table" at that link (see screenshot).
OTOH the 512x is a typo, it should be 5^12
I'll update the link w "copy text to highlight" and the second number
@SamuelAlbanie I haven't personally, but one of my colleagues tried using GPT-5.3 codex and made a lot more progress on the "port a post" task. So I think I was way too bearish for task 3, and maybe a bit too bearish for task 1 too...
@dwarkesh_sp @EgeErdil2 Ege knows so much that when I first met him I thought he might have photographic memory. Then I discovered he doesn't, which made him seem even more impressive!
Introducing FrontierMath Tier 4: a benchmark of extremely challenging research-level math problems, designed to test the limits of AI’s reasoning capabilities.
I strongly disagree with @lugaricano’s thread on explosive growth. While the thread raises important points, I think it fails to get to the core of the reasons to believe that explosive growth is plausible – allow me to explain!
1/ @EpochAIResearch doubles down on prediction AI will drive 20%+ annual GDP growth. Economists remain skeptical.
This is the defining debate of today: AI builders see infinite prosperity ahead. Economists see the same limits that constrained every technological revolution.🧵 1/13
I do agree that often "progress works itself out of a job", and that Baumol effects and human bottlenecks are important and can make explosive growth less likely (especially in the next few years)
@aidanogara_ @ben_j_todd @EpochAIResearch So for GATE specifically I might make updates of the form "I broadly over/underestimated how strong effect X might be". I definitely wouldn't trust GATE's near-term quantitative predictions (e.g. GWP growth rates in 2027)
@aidanogara_ @ben_j_todd @EpochAIResearch Depends on the question IMO. GATE is based on endogenous growth models, which are ok at capturing the dynamics of long-run growth, but I doubt you'd use something like this for near-term growth predictions, for example
How do reasoning models solve hard math problems?
We asked 14 mathematicians to review o3-mini-high’s raw, unsummarized reasoning traces on 29 FrontierMath problems. Here’s what they found: