Johan Land @landjohan - Twitter Profile

Just scored 76.11% on ARC-AGI 2, beating public GPT-5.2 and Gemini-3-Pro baselines by >20%, and (as far as I know) the best publicly reported result so far.

Approach: what I’d call Multi-Model Reflective Reasoning
- Using GPT-5.2, Gemini-3, Opus 4.5
- Long-horizon/multi-step reasoning (~6hrs/problem)
- Agentic codegen (>100,000 python calls)
- Visual reasoning
- Council of judges

Fun fact: all solver code was written by Gemini-3-CLI.
Does this count as AI generating a new AI that beats the prior SOTA? 🤔

Full run + code (open source): https://t.co/8HZJV5XjIK

@GregKamradt , holiday break is over 🙂 semi-private when?

#ARCAGI #AIResearch

Johan Land @LandJohan

4 months ago

@pradanadimass @arcprize Oh, I am :) can trade y for x at efficient ratio. This was really going for max y though.

Johan Land @LandJohan

4 months ago

@thedjpetersen @ItsBrain4Brain @arcprize The code is open source: https://t.co/pqWRg79oqy Do whatever you want with it :) I don't think I even put a license in there, so it's free for all!

Johan Land @LandJohan

4 months ago

No, but it's all open source. Maybe I should write a paper. Essentially, it's gathering the reasoning traces for all possible solutions. Then it's exposing those to three different judges with slightly different roles. The judges then express their opinions, after which a solution is picked.
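
The selection step described in this reply can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repository: the `Candidate` type, the three judge functions, and the sum-of-scores rule are all assumptions made for the example, and the judges here are plain heuristics standing in for what are presumably model calls in the real system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Grid = Tuple[Tuple[int, ...], ...]  # hashable stand-in for an ARC output grid


@dataclass(frozen=True)
class Candidate:
    grid: Grid    # proposed solution
    trace: str    # reasoning trace that led to it
    source: str   # hypothetical label for the solver that produced it


# A judge reads every candidate and returns a score per distinct grid.
Judge = Callable[[List[Candidate]], Dict[Grid, float]]


def consensus_judge(cands: List[Candidate]) -> Dict[Grid, float]:
    """Assumed role: favour grids that many candidates agree on."""
    counts: Dict[Grid, int] = defaultdict(int)
    for c in cands:
        counts[c.grid] += 1
    return {g: n / len(cands) for g, n in counts.items()}


def trace_judge(cands: List[Candidate]) -> Dict[Grid, float]:
    """Assumed role: favour grids whose trace claims a verification step."""
    scores: Dict[Grid, float] = {}
    for c in cands:
        hit = 1.0 if "verified" in c.trace.lower() else 0.0
        scores[c.grid] = max(scores.get(c.grid, 0.0), hit)
    return scores


def diversity_judge(cands: List[Candidate]) -> Dict[Grid, float]:
    """Assumed role: favour grids reached independently by different solvers."""
    by_grid: Dict[Grid, set] = defaultdict(set)
    for c in cands:
        by_grid[c.grid].add(c.source)
    total_sources = len({c.source for c in cands})
    return {g: len(s) / total_sources for g, s in by_grid.items()}


JUDGES: List[Judge] = [consensus_judge, trace_judge, diversity_judge]


def pick_solution(cands: List[Candidate], judges: List[Judge]) -> Grid:
    """Sum each judge's opinion per grid and return the top-scoring grid."""
    if not cands:
        raise ValueError("need at least one candidate")
    totals: Dict[Grid, float] = defaultdict(float)
    for judge in judges:
        for grid, score in judge(cands).items():
            totals[grid] += score
    return max(totals, key=totals.get)
```

In this sketch a grid backed by a verified trace can beat a more frequent but unverified one, which is one simple way for judges with different roles to disagree with a raw majority vote.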

Johan Land @LandJohan

4 months ago

@diegocabezas01 It's all public source: https://t.co/pqWRg79oqy Check the v7 branch, that's the latest. Actually, go back a few commits and you'll find an even higher performing version - I had to dumb it down a bit for the submission.

Johan Land @LandJohan

4 months ago

@Viam_Invenias_0 @teortaxesTex I think you can get higher with the Chinese models with an approach similar to the one I took here.

Johan Land @LandJohan

4 months ago

@captain_marrvel @OpenAI Beautiful model indeed. Slow though :/

Johan Land @LandJohan

4 months ago

@DeryaTR_ Next challenge is indeed ARC-AGI-3! The beautiful thing about ARC-AGI is that they allow "hobbyists" like myself to fairly be benchmarked against the labs.

Johan Land @LandJohan

4 months ago

@kimmonismus It's moving fast, indeed! Exciting times ahead!

Johan Land @LandJohan

4 months ago

@joshlee361 Largely I agree. There are a few other things to it, but the key thing indeed is that different models/prompts/modalities/chaining generate diverse results. But then, you also need to "know when you know" and "know when you don't know", which is the other half of the problem.

Johan Land @LandJohan

4 months ago

@BadTechBandit @arcprize "Indie researcher", I like it :)

Johan Land @LandJohan

4 months ago

@permaximum88 Yes! I love the community! Thanks @permaximum88 for everything you're doing!

Johan Land @LandJohan

4 months ago

@BillyHoy1_ Not really :)

Johan Land @LandJohan

4 months ago

@SuperbBias Diversity is the keyword indeed. One of the biggest insights I had was to induce diversity in the models by forcing them to think in different spaces and modalities.
