Timothy O'Hear @timohear - Twitter Profile

timohear retweeted

3 months ago

I built a fully automatic mansplainer. I'm sure this will not get me into any trouble at all... Watch here: https://t.co/CaxIbz7DyQ

3

36

3

6

6K

Timothy O'Hear @timohear

6 months ago

@BingBongBrent @arcprize @OpenAI @poetiq_ai Also they only appear on the arc-agi-2 leaderboard it seems

0

1

0

124

Timothy O'Hear @timohear

6 months ago

@BingBongBrent @arcprize @OpenAI @poetiq_ai Poetiq is the white triangle on the top right. It's clearer on the arc-agi site when you filter by author. Why their name isn't displayed 🤷

1

4

0

728

Timothy O'Hear @timohear

6 months ago

@GregKamradt @guille_bar Isn't there a risk with code execution that Google could capture your task data as the sandbox is running on their infra?

1

0

52

timohear retweeted

Shane Legg

@ShaneLegg

7 months ago

From the makers of the popular AlphaGo documentary, The Thinking Game gives a much broader picture of the story of DeepMind and our mission to build AGI, drawing on interviews with myself and others going back many years. You can now freely watch it here: https://t.co/hCIicyWbLi

27

823

114

335

124K

Timothy O'Hear @timohear

7 months ago

@sahilshah91 @JayaGup10 @ChatGPTapp I noticed the same thing a couple of days ago

0

3

0

62

timohear retweeted

Guillermo Barbadillo

@guille_bar

7 months ago

ARC25 is over and despite a lot of work I have been unable to implement my vision successfully. I hope to learn from other teams’ solutions and refine my ideas for ARC26. I am currently 6th on the public test set. Read about my vision and experiments: https://t.co/Jk8klSz5GF

1

67

14

22

4K

Timothy O'Hear @timohear

7 months ago

@StphTphsn1 @Dorialexander Yes, very much iid and fairly simple tasks belonging to e.g. a single 20-person service. But I'm pretty sure they would have failed even a few months ago.

0

2

0

1

30

Timothy O'Hear @timohear

7 months ago

@StphTphsn1 @Dorialexander I've seen a significant increase in robustness of data extraction / instruction following scenarios over the past 12 months with high 9x% accuracy/F1 now achievable on real world tasks.

1

0

1

378

Timothy O'Hear @timohear

8 months ago

@podesta_aldo @giotto_ai https://t.co/5tz4r4T9y1

0

1

0

70

timohear retweeted

Alexia Jolicoeur-Martineau @jm_alexia

8 months ago

New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M parameters neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: https://t.co/w5ZDsHDDPE Code: https://t.co/7UgKuD9Yll Paper: https://t.co/3m8ANhNMiw

150

4K

663

3K

703K

Timothy O'Hear @timohear

8 months ago

@arankomatsuzaki @jeremyphoward @teknium How are you running GDPval? Do you have access to the test scaffolding OpenAI use?

1

0

60

Timothy O'Hear @timohear

9 months ago

From https://t.co/icZYDE2inN: "the private eval set is only accessible via the no-internet-access Kaggle competition" and "The semi-private eval set was calibrated to have the same difficulty as the public eval set, but researchers need to coordinate with the ARC-Prize team to test their model on it in a Kaggle notebook that runs at most 12 hours." From the Kaggle page: "This leaderboard is calculated with approximately 50% of the test data. The final results will be based on the other 50%, so the final standings may be different." So the ARC-AGI-2 scores on both pages are measured in different ways but are somewhat comparable?

0

68

Timothy O'Hear @timohear

9 months ago

@arcprize @podesta_aldo How should the ARC-AGI-2 scores here https://t.co/AsH6ytGsx7 be compared to those on the Kaggle leaderboard here https://t.co/ITs8M69d5W ? It looks like J. Berman working outside the Kaggle competition has a higher score of 29.4%. Are the constraints different?

1

0

354

timohear retweeted

anandmaj

@Almondgodd

9 months ago

I spent the past month reimplementing DeepMind’s Genie 3 world model from scratch. Ended up making TinyWorlds, a 3M parameter world model capable of generating playable game environments. Demo below + everything I learned in thread (full repo at the end)👇🏼

94

2K

271

2K

218K

timohear retweeted

AI Coffee Break with Letitia @AICoffeeBreak

9 months ago

Ever wondered how Energy-Based Models (EBMs) work and how they differ from normal neural networks? ☕️We go over EBMs and then dive into the Energy-Based Transformers paper to make LLMs that refine guesses, self-verify, and could adapt compute to problem difficulty. (link👇)

2

48

8

31

7K

timohear retweeted

Jonathan Carroll @JSCarroll

9 months ago

JSCarroll's tweet photo. https://t.co/FQHdM4t3FN

3

84

20

17

6K

timohear retweeted

Eric Pang

@_eric_pang_

9 months ago

Here's how I (almost) got the high scores in ARC-AGI-1 and 2 (the honor goes to @jeremyberman) while keeping the cost low. To put things into perspective: o3-preview scored 75.7% on ARC-AGI-1 last year while spending $200/task on low setting. My approach scores 77.1% while spending $2.56!

27

881

91

535

136K

Timothy O'Hear @timohear

9 months ago

@nicksherrow @lateinteraction Yeah, I'm sticking with GEPA = Guh-EPA

0

15

Timothy O'Hear @timohear

10 months ago

@LucaAmb @fchollet @polynoamial Well there was the Microsoft "sparks of AGI" paper...

0

1

0

37
