David Foster @davidADSP - Twitter Profile

about 1 year ago

@mattshumer_ Yeah, looks awesome - any idea how they calculated the $0.19-$0.49 per million tokens? They say it's based on a $2/hour H100 cost and a serve rate of 0.03 ms/token, I think?
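
The arithmetic behind this question can be sketched as follows. It takes the figures recalled in the tweet ($2/hour, 0.03 ms/token) at face value and assumes a single GPU serving continuously at that rate; the function name is hypothetical and nothing here reflects the provider's actual pricing method.

```python
def cost_per_million_tokens(gpu_dollars_per_hour: float, ms_per_token: float) -> float:
    """Raw compute cost per one million tokens (illustrative assumption:
    one GPU, fully utilised, serving at a constant rate)."""
    ms_per_hour = 3_600_000  # 60 min * 60 s * 1000 ms
    tokens_per_hour = ms_per_hour / ms_per_token
    return gpu_dollars_per_hour / tokens_per_hour * 1_000_000


# Figures as recalled in the tweet; both are uncertain ("I think?").
price = cost_per_million_tokens(gpu_dollars_per_hour=2.0, ms_per_token=0.03)
print(f"${price:.4f} per million tokens")
```

Under these assumptions the raw compute cost comes to roughly $0.017 per million tokens, well below the quoted $0.19-$0.49, so the quoted range presumably rests on inputs the tweet does not state.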

David Foster @davidADSP

over 1 year ago

@Thom_Wolf It's a reference to the fact that an ensemble of all submissions would have scored 81% on the private test set (i.e. 19% of tasks were unsolved by any submission) https://t.co/pwdwjJFn2P

François Chollet

@fchollet

over 1 year ago

Does this mean the ARC-AGI benchmark has saturated? Yes -- the v1 version of the benchmark is starting to saturate. There were already signs of this in the Kaggle competition this year -- an ensemble of all submissions would score 81%. The competition next year will run on ARC-AGI-2, an updated version of the dataset that keeps the same format as v1, but features fewer tasks that can be easily brute-forced. Early indications are that ARC-AGI-v2 will represent a complete reset of the state-of-the-art, and it will remain extremely difficult for o3. Meanwhile, a smart human or a small panel of average humans would still be able to score >95%.

David Foster @davidADSP

over 1 year ago

@arcprize @OpenAI o3 solves ARC-AGI? Huuuuge news if that's it...

David Foster @davidADSP

over 1 year ago

@fchollet Out of interest @fchollet, what % of arc test set puzzles remain unsolved by any submitted solution? And what would the top 2 entries score if ensembled (I know this means they'd have 4 attempts). Just curious how much they overlap.
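
A minimal sketch of the ensemble score being asked about: a task counts as solved if any submission in the ensemble solves it, so the score is the size of the union of solved tasks divided by the total. The task IDs, submission names, and function name below are toy placeholders, not competition data.

```python
def ensemble_score(solved_by_submission: dict[str, set[str]], all_tasks: set[str]) -> float:
    """Fraction of tasks solved by at least one submission in the ensemble."""
    solved = set().union(*solved_by_submission.values())
    return len(solved & all_tasks) / len(all_tasks)


# Toy data (hypothetical): five tasks, two submissions that overlap on t2.
all_tasks = {"t1", "t2", "t3", "t4", "t5"}
solved_by_submission = {
    "entry_a": {"t1", "t2"},
    "entry_b": {"t2", "t3", "t4"},
}

score = ensemble_score(solved_by_submission, all_tasks)
unsolved = all_tasks - set().union(*solved_by_submission.values())
print(score, sorted(unsolved))
```

The overlap between two entries is the gap between the sum of their individual scores and this union score.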

David Foster @davidADSP

over 1 year ago

@OfficialLoganK What's the rate limit?

David Foster @davidADSP

over 1 year ago

@jsuarez @hirschibar Awesome write up! What about action masking - i.e. how do you handle cases where certain actions aren't possible (and the env returns you the mask at each timestep). Is this something PufferLib supports?
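
For context, action masking as described in this question is commonly handled by zeroing the probability of invalid actions before sampling. The sketch below is a generic, standard-library illustration of that idea; the function names are hypothetical and it is not PufferLib's API.

```python
import math
import random


def masked_probs(logits: list[float], mask: list[bool]) -> list[float]:
    """Softmax over the valid actions only; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    weights = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(logits: list[float], mask: list[bool], rng: random.Random) -> int:
    """Sample an action index according to the masked distribution."""
    probs = masked_probs(logits, mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# The mask would come from the environment at each timestep (assumed here).
rng = random.Random(0)
probs = masked_probs([1.0, 2.0, 3.0], [True, False, True])
action = sample_action([1.0, 2.0, 3.0], [True, False, True], rng)
```

In a neural policy the same effect is usually obtained by setting masked logits to a very large negative value before the softmax.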

David Foster @davidADSP

over 1 year ago

@lmarena_ai @01AI_Yi Will the multimodal Llama 3.2 models be added to the overall leaderboard?

David Foster @davidADSP

over 1 year ago

@lmarena_ai @lmsysorg Will Llama 3.2 (the multimodal models) and Gemini 1.5-002 be added to the main leaderboard?

David Foster @davidADSP

about 2 years ago

@SullyOmarr Would you be willing to share the leaderboard from your evals?

David Foster @davidADSP

about 2 years ago

Spot the data viz fail 🤦‍♂️ @BBC @BBCPolitics @BBCNews

David Foster @davidADSP

over 2 years ago

@giffmana @pastaraspberry What is it about Gemma that makes it open, but not open source? Thanks!

David Foster @davidADSP

over 2 years ago

@giffmana @pastaraspberry How are you defining open vs open source? Thanks!

David Foster @davidADSP

over 2 years ago

@NPCollapse Funny story - William Peebles co-authored the Mar 2023 Diffusion Transformer paper on which Sora is based, whilst at Meta as an intern. But then joined OpenAI last year to co-lead Sora. So I guess they did know how to do it, but let him leave 😂

David Foster @davidADSP

over 2 years ago

@Thom_Wolf Theory of Everything: https://t.co/gkxjVsmI8t

David Foster @davidADSP

almost 3 years ago

@realGeorgeHotz Given the current breakthroughs, "linguistics" is a left-field candidate 🤔

David Foster @davidADSP

almost 3 years ago

@nickfloats Does the --iw parameter affect remixes? In the docs it says it doesn't, but I'm never sure how much to trust the docs :)

David Foster @davidADSP

almost 3 years ago

@nickfloats Related question / challenge - how do you get Midjourney to output the usual meaning of 'fork in the road', rather than this? Changing the prompt to use different words isn't allowed 😃