Sebastian Moore @SebastianMek8 - Twitter Profile

@bnafOg @elonmusk How should we interpret a result that is materially above the current reported range, but: - is fully ARC-evaluated (private/semi-private sets), - comes from a single model (low millions of parameters), - and doesn’t rely on task-specific scaffolding?

1

0

22

Sebastian Moore @SebastianMek8

2 months ago

@cb_doge @bridgebench How does Grok perform on ARC-AGI-3? This is the only legitimate test of REAL reasoning in novel environments. Can you match our score of 15.70%?

0

1

19

Sebastian Moore @SebastianMek8

2 months ago

@ceo_spaceX_chat There is no fight more worthy of fighting!

1

0

1

0

114

Sebastian Moore @SebastianMek8

2 months ago

@ceo_spaceX_chat A. 1m%

0

1

0

30

Sebastian Moore @SebastianMek8

2 months ago

@ceo_spaceX_chat Yes

0

6

Sebastian Moore @SebastianMek8

2 months ago

@elonmusk Hi Elon, best wishes from Athens, Greece! You are my first ever post!! I’m curious if you are familiar with the ARC-AGI-3 test? Have emailed you at [email protected]

0

19

Sebastian Moore @SebastianMek8

2 months ago

Hi Elon, best wishes from Athens, Greece! You are my first ever post!! I’m curious if you are familiar with the ARC-AGI-3 test?

1

0

44

