Edward Grefenstette

@egrefen

FR/US/GB AI/ML Person, Director of Research at @GoogleDeepMind, Honorary Professor at @UCL_DARK, @ELLISforEurope Fellow. All posts are personal.

London, United Kingdom

Joined April 2007

915 Following

45.8K Followers

16K Posts

Pinned Tweet

Edward Grefenstette @egrefen

5 months ago

🧵 Time for a short end-of-2025 wrap up 🧵 Genuinely aiming for this to be a short one for two reasons: 1. I'm doing it at the last moment 😅 2. Most of what I was involved in is not stuff that can be shared publicly (yet… or ever?). Or maybe I was just lazy... Let's go [1/12]

5

225

21

166

79K

Edward Grefenstette @egrefen

about 17 hours ago

@george__wing @OpenAI Pot-ay-to, po-tah-to

0

0

0

0

153

Edward Grefenstette @egrefen

about 17 hours ago

While @OpenAI can't hire the winner, they COULD buy the winning company. Metahiring!

about 19 hours ago

OpenAI ran a hiring challenge, but the top candidate was one they couldn’t hire: our autonomous research agent, Aiden. In Parameter Golf, Aiden ran for 22 days, and outperformed all 1,016 other researchers: 🧵 (1/8)

12

421

42

248

72K

2

58

0

22

15K

egrefen retweeted

about 19 hours ago

OpenAI ran a hiring challenge, but the top candidate was one they couldn’t hire: our autonomous research agent, Aiden. In Parameter Golf, Aiden ran for 22 days, and outperformed all 1,016 other researchers: 🧵 (1/8)

12

421

42

248

72K

Edward Grefenstette @egrefen

3 days ago

@thingsshldwrk @KlepperCasey They were profit making until the toxic assets that are X and https://t.co/2g8tfIKbmg got bundled up into it. Now it's posting massive losses.

0

6

0

1

211

Edward Grefenstette @egrefen

4 days ago

SpaceX being rammed into indices with no profit requirements, seasoning, and generally looser constraints is economic terrorism. Index trackers will eat the loss when reality catches up and retail investors will suffer.

60

5K

652

179

187K

egrefen retweeted

6 days ago

MLE-Bench scores have jumped from 30% to 80% over the last two years. But how much of that is real algorithmic progress vs. better base models + problem definition shifts + overfitting?

Turns out: not much. Once you control for the same step budget and models, and then test on a different set of tasks, the two-year-old AIDE algorithm matches modern agent/evolutionary search systems.

Figure from FML-Bench, a new automated ML research benchmark, which unifies the code editing agent, step definition, and val/test split, and tries to benchmark the algorithmic efficiency (search/memory) of the agents.

paper link: https://t.co/8QllTan4cX

[Photo attached to zhengyaojiang's tweet]

6

93

10

59

11K

Edward Grefenstette @egrefen

6 days ago

Pretty sure these are all just Generation 8 Pokémon names.

Rodger Sherman @rodger

6 days ago

Shrey spelling 32 words in 90 seconds to win the Spelling Bee is the new greatest athletic accomplishment of 2026. I don’t even know how he said the letters that fast. Got a “Holy Mackerel” out of @minakimes

348

14K

1K

3K

4M

0

21

0

1

4K

egrefen retweeted

Index Ventures @IndexVentures

7 days ago

The scientific method has been an extraordinary engine of progress. It’s also barely changed in 400 years. Inherent’s bet is that science is on the brink of a second revolution, one built around what humans and self-improving AI can do together. Read more about why we co-led their $50m seed round: https://t.co/dQRl5xajYW

[Photo attached to IndexVentures's tweet]

2

32

5

14

5K

egrefen retweeted

7 days ago

Proud to announce the launch of @inherent_labs. We’re reinventing the scientific research factory for the age of AI agents. I’m joined by co-founders @kallyaleksiev, @LouisKirschAI and @TantumSCollins; all are deeply technical operators. Time to live within the experiment.

8

162

18

37

19K

egrefen retweeted

7 days ago

We’re excited to introduce Inherent, a lab designed from scratch to build AI agents that discover new knowledge.

The coming era of machine-driven scientific inquiry demands a new kind of research institution and a new kind of AI.

To achieve our mission, we live within the experiment, recursively self-improving the entire research organisation. We investigate questions including:

- What does ‘AI taste’ look like in the sciences, and how can we build an institution that embraces this new aesthetic of discovery?
- What new kinds of human-machine teaming will make the most of AI that can truly innovate?
- How can we build recursive self-improvement at the collective level that continually increases human agency over outcomes?

We have just closed a $50m seed round led by @IndexVentures and @radicalvcfund, with participation from other outstanding investors including NVentures (@nvidia's venture capital arm), @buildexante, Metaplanet, Macroscopic, @MythosVentures, Charlie Songhurst, @chalfs, @jluan, @dwarkesh_sp, @Thom_Wolf, @j_foerst and @maxjaderberg. We are advised by @matthewclifford.

Inherent is a Public Benefit Corporation headquartered in London.

[Photo attached to inherent_labs's tweet]

51

857

105

542

351K

Edward Grefenstette @egrefen

7 days ago

@jparkerholder What have you done now? 😉

0

5

0

0

784

Edward Grefenstette @egrefen

9 days ago

International consensus in tech is rare, and I can't believe we're achieving it today by agreeing that the Luce is probably what Jony Ive pitched for the Apple Car, and got a Ferrari badge slapped on after Apple passed.

10

329

6

20

18K

Edward Grefenstette @egrefen

9 days ago

@auyonomous @willdepue Exactly.

0

1

0

0

45

Edward Grefenstette @egrefen

10 days ago

There will be 3 kinds of scientists in the coming years: 1. The Blenderists, who cover their eyes to ignore the impact of AI. 2. AI scalers like OP(?), who think everything can be solved by making GPUs go brrr. 3. Actual researchers who embrace the tech and explore new frontiers.

11 days ago

academics are unprepared for the coming world where much scientific progress is majorly a function of inference compute. whether OpenAI points the Eye of Stargate at your particular field will decide its acceleration. talent will leach away into the labs. it's already begun

78

2K

84

411

609K

11

180

8

40

26K

Edward Grefenstette @egrefen

10 days ago

@ToddVercoe @_colourmeamused Also used as a noun in the UK in this very specific case: https://t.co/B8cjgL6Je6. Admittedly no relation to the Dutch word.

0

0

0

0

63

egrefen retweeted

@blockchainchick

10 days ago

The SpaceX IPO is the most brazen retail fleecing in modern market history. NASDAQ has REWRITTEN the index rules specifically for this listing. The 10% minimum free float requirement: gone. The 3 to 12 month seasoning period before index inclusion: cut to 15 trading days. Companies with small floats can now be weighted at 3x their actual float. Translation: every passive index fund, every 401k, every pension is about to be force-fed SPCX whether they want it or not. And what exactly are they buying? Class A shares carrying ONE vote each, while Musk holds 93.6% of the Class B super voting shares at TEN votes each. That gives him 85.1% of voting power on a 42% economic interest. He cannot be outvoted. He cannot be removed. CEO, CTO and board chairman simultaneously. For reference: Zuckerberg controls 61% of Meta. Buffett 35% of Berkshire. Musk: 85.1%. SpaceX is also claiming "controlled company" status, exempting it from needing a majority of independent directors. Shareholders waive the right to a jury trial. They waive the right to class actions. Mandatory arbitration only, courtesy of an SEC rule change pushed through on a party line vote last September. $1.75 trillion valuation. $80 billion raise. Largest IPO in history. The rules of the game were quietly rewritten so one man could extract maximum capital from retail while answering to no one.

323

6K

2K

1K

880K

Edward Grefenstette @egrefen

10 days ago

@HrrsMjd Not all domains have the same research dynamics as mathematics.

0

0

0

0

79

Edward Grefenstette @egrefen

10 days ago

@willdepue Yeah I didn't want to assume you held that view (hence the (?) qualifier). But there are a lot of people who fall into the second category unironically. Thanks for providing additional context!

0

6

0

0

595

Edward Grefenstette @egrefen

10 days ago

@HrrsMjd GPUs going brrrr alone isn't going to make me this delicious cocktail I'm enjoying without further work in robotics (mixing the drink), understanding my tastes (personalisation) and that I want one (proactivity), etc. GPUs will brrr as part of it, but there are other parts.

[Photo attached to egrefen's tweet: https://t.co/sayStjAl5Q]

2

5

0

0

363

Edward Grefenstette @egrefen

10 days ago

@mattiaspado1996 I meant to write Benderists

1

2

0

0

281

Edward Grefenstette @egrefen

10 days ago

I meant to write "Benderists"!

2

6

0

0

2K
