There are a few ways to build an economic moat for frontier AI firms. One is an architectural breakthrough that's closed to other firms (everyone's working on this, but so far there's little evidence of one).
The second is enterprise lock-in (the OG Microsoft playbook). An AI "cloud" that is active across all workspaces within a firm and has tacit knowledge across divisions and product lines: that's a path to lock-in. Switching models becomes like replacing your most valuable worker (only 1000x over).
So yes, as @karpathy says, this is a huge deal.
Lee Kuan Yew:
“Air conditioning was a most important invention for us, perhaps one of the signal inventions of history. It changed the nature of civilization by making development possible in the tropics. Without air conditioning you can work only in the cool early-morning hours or at dusk. The first thing I did upon becoming prime minister was to install air conditioners in buildings where the civil service worked. This was key to public efficiency.”
Almost 6 months ago, Anthropic CEO @DarioAmodei said AI would do most-to-all of software engineering **end-to-end** in 6–12 months. We're halfway through the window, so here's a quick status check:
Right now, what % of your engineering tasks get solved end-to-end by AI? No edits, no babysitting, you just accept the output.
We've lost an absolute giant today. RIP Dimitri Bertsekas. His probability and optimization books got me through my master's. Massive loss for the MIT community and the field.
A 23-year-old with no advanced mathematics training solves an Erdős problem with ChatGPT Pro. “What’s beginning to emerge is that the problem was maybe easier than expected, and it was like there was some kind of mental block.” - Terence Tao https://t.co/Cphu6dexyb
Hamming's talk is so important that I reproduced it on my site. It's one of the only things on my site written by someone else.
https://t.co/kWvKdwIiOm
📊 New paper from EconTAI Faculty @BasilHalperin: Forecasting the Economic Effects of AI
Surveyed 69 leading economists, 52 AI experts, 38 superforecasters, and 401 members of the general public about AI’s economic impact.
The findings were surprising:
As Cervantes wrote: “Each person is the architect of their own fortune.”
➡️ What we are is determined not by where we come from, but by what we do with what we have been given. Our obligation is to ensure that all Spaniards have the tools to be the architects of their own.
The AI Scientist: Towards Fully Automated AI Research, Now Published in Nature!!✨
Today in Nature we share a comprehensive technical summary of our work on The AI Scientist, including new scaling law results showing how it improves with more compute and more intelligent foundation models.
The AI Scientist autonomously creates its own research ideas, codes up and conducts experiments to test those ideas, creates figures to visualize the results, writes an entire scientific manuscript summarizing what it has discovered, and conducts its own “peer” review of the resulting paper. One of its papers, entirely AI-generated, passed peer review at a top-tier AI conference workshop, a historic milestone marking the dawn of a new era of AI-accelerated scientific discovery. 🔬🧪✨🧬💡🔭
Paper https://t.co/Q6tfME4yst
Blog https://t.co/C43Ooy0kjP
Work done in collaboration with a great team from Sakana, Oxford, and my lab at UBC. Thanks and congratulations everyone!
@_chris_lu_ @cong_ml @RobertTLange @_yutaroyamada @shengranhu @j_foerst @hardmaru
Just had a faculty meeting at INSEAD about the impact of AI on management education and how we should respond.
Every serious business school is having the AI conversation right now.
1/
Unreal numbers 👀⚡️
"JPMorgan estimates that, had Germany not phased out nuclear power, the country would have generated 50% less electricity from fossil fuels and 84% less electricity from natural gas in 2024. Electricity prices in Germany would have been around 25% lower, and the country would have imported half as much electricity.."
Why must LLMs hallucinate? The answer in this paper (a low threshold for guessing because of rewards during post-training) is part of the explanation, but another part is that they are designed not to store and retrieve facts but to mash up probabilistic associates. Out of curiosity I asked ChatGPT for the title of my PhD dissertation, and it confidently provided the nonsensical and not-even-close “Taxonomy and the Mental Lexicon.” (It was “The Representation of Three-Dimensional Space in Mental Images.”)
Can LLMs Self-Verify? Much better than you'd expect.
LLMs are increasingly used as parallel reasoners, sampling many solutions at once.
Choosing the right answer is the real bottleneck.
We show that pairwise self-verification is a powerful primitive.
Introducing V1, a framework that unifies generation and self-verification:
💡 Pairwise self-verification beats pointwise scoring, improving test-time scaling
💡 V1-Infer: Efficient tournament-style ranking that improves self-verification
💡 V1-PairRL: RL training where generation and verification co-evolve for developing better self-verifiers
🧵👇
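To make the pointwise vs. pairwise distinction above concrete, here is a minimal Python sketch of a generic knockout tournament over sampled candidates. It is not the V1-Infer algorithm, whose details are not given in this post. The names `pointwise_select`, `tournament_select`, `score`, and `prefer_first` are hypothetical, and the comparator is a toy stand-in for a model-based pairwise judge.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def pointwise_select(candidates: Sequence[T], score: Callable[[T], float]) -> T:
    """Baseline: score each candidate on its own and keep the highest-scoring one."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


def tournament_select(candidates: Sequence[T], prefer_first: Callable[[T, T], bool]) -> T:
    """Single-elimination knockout driven only by pairwise comparisons.

    `prefer_first(a, b)` is a hypothetical judge that returns True when it
    prefers `a` over `b`. In an LLM setting it would be a call that shows the
    model both solutions side by side (an assumption, not the paper's API).
    Picking a winner from N candidates takes N - 1 comparisons.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    pool = list(candidates)
    while len(pool) > 1:
        next_round = []
        # Compare neighbours two at a time; each winner advances.
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            next_round.append(a if prefer_first(a, b) else b)
        # With an odd pool size, the last candidate advances without a match.
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])
        pool = next_round
    return pool[0]


# Toy judge for illustration only: prefer the answer closer to a known target.
def closer_to_42(a: int, b: int) -> bool:
    return abs(a - 42) <= abs(b - 42)
```

For example, `tournament_select([40, 45, 42, 10, 43], closer_to_42)` picks the candidate the toy judge prefers in every match it plays. A real judge would be noisy, so the bracket order and repeated comparisons would matter in ways this sketch ignores.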
How can we ensure AI solutions genuinely meet the needs of students and educators? @StanfordHAI Senior Fellow @Susan_Athey stresses the importance of measurement, testing, and evaluation frameworks in AI products. Watch the AI+Education Summit panel here: https://t.co/mHzcxKkdK7
Trump admin officials acknowledged during a closed-door briefing on Capitol Hill Tuesday that Iran’s Shahed attack drones represent a major challenge and US air defenses will not be able to intercept them all.
The drones, Defense Secretary Pete Hegseth and Chairman of the Joint Chiefs of Staff Gen. Dan Caine acknowledged, are posing a bigger problem than anticipated, per sources in the briefing. https://t.co/NiNPKEYUxR
“The United States, Israel, and their Gulf allies are using up scarce and costly munitions at an astounding rate”—and America’s adversaries are taking note, @BrynnTannehill writes. https://t.co/FKNrpcIa1v