Jake.

@RedBl0ckBlue

Trust in the LORD with all your heart and lean not on your own understanding

somewhere over the rainbow

Joined March 2023

153 Following

1.2K Followers

1.2K Posts

Jake. @RedBl0ckBlue

7 months ago

@martinmbauer The ‘weirdness’ is the gap between logic and intuition. If you claim it's not weird, you need to give an intuitive explanation, not a logical one. You did a trivial manipulation of symbols using a system humanity spent 1000s of years developing precisely to overcome that gap.

0

2

0

1

101

Jake. @RedBl0ckBlue

7 months ago

@JohnathanBi Is this a library? lecture theatre? production set?

0

0

0

0

79

Jake. @RedBl0ckBlue

7 months ago

@Ganeshuor https://t.co/vl9unDvMWw

0

7

0

1

417

Jake. @RedBl0ckBlue

about 1 year ago

@dylan522p @alexandr_wang Scale AI is a glorified mechanical turk, it's strange how all major AI labs are handing over vast amounts of proprietary data to them (the RLHF model interaction data). What does Scale do with this data?

0

1

0

0

86

Jake. @RedBl0ckBlue

about 1 year ago

@ItsDave_ADA thanks dave, all i needed was the idea (the hard part) and now ill prompt chatgpt and this should be implemented soon

1

0

0

0

70

Jake. @RedBl0ckBlue

about 1 year ago

@Evan_Mann @balajis Verifying an AI output in this context doesn't mean cryptographically verifying it, it means "is the answer to the math problem I gave it actually correct, or is it hallucinating?"

1

1

0

0

39

Jake. @RedBl0ckBlue

about 1 year ago

@ck_oro What about the data that isn't easily passively acquired by plugging yourself in? I.e., how would you collect the data that is the 'chain of thought' of a professional mathematician (not just solving a math competition problem, but conducting research, warts and all)

0

0

0

0

62

Jake. @RedBl0ckBlue

about 1 year ago

@xlr8harder @natolambert It's all infrastructure and data. The engineers hop from one company to the next, so there's no secret architecture or training method (if there were, it wouldn't last long).

0

4

0

0

77

Jake. @RedBl0ckBlue

about 1 year ago

@mamboitaliano__ @EleliaRose

0

2

0

0

90

Jake. @RedBl0ckBlue

about 1 year ago

@Catholicizm1 @EleliaRose

0

0

0

0

48

Jake. @RedBl0ckBlue

over 1 year ago

@ElliotGlazer There’s little point evaluating a model trained on informal math (o3) on this… of course it wouldn’t be able to answer! I wonder how AlphaProof would do, or a fine-tuned o3 like you say

0

2

0

0

106

Jake. @RedBl0ckBlue

over 1 year ago

@MarioNawfal @luismbat The pattern suggests it holds for all n because it holds for n=1 to n=4… proof by pattern

0

0

0

0

53

Jake. @RedBl0ckBlue

over 1 year ago

@thomasahle @wtgowers The NuminaMath dataset scraped all competition problems (regional olympiads, national, international, and shortlisted) and Putnam competition problems, and that dataset is 860k, so I’m guessing it’s similar to this dataset. There are about 100k actual hard questions in total; the rest are easy

0

3

1

1

477

Jake. @RedBl0ckBlue

over 1 year ago

@LaurentSartran @hbouammar @GoogleDeepMind @JeffDean Would you say data is a bottleneck? I assume you used something like NuminaMath, with about 100k 'difficult' questions. The model is able to construct many variations within the vicinity of problems, but how would an extra 100k unique interesting problems (olympiad-ish level) help?

1

0

0

0

84

Jake. @RedBl0ckBlue

over 1 year ago

@LaurentSartran @hbouammar @GoogleDeepMind @JeffDean How come the fine-tuned Gemini model isn't able to automatically formalize the problem? I realize you need to be certain that the translation is correct, but didn't you automatically formalize 1 million questions during training? Great work btw (I know I'm late!)

1

0

0

0

86

RedBl0ckBlue retweeted

over 1 year ago

Jesus Christ is God

trad_west_'s tweet photo. Jesus Christ is God https://t.co/CZpVwdP0rk

65

9K

584

194

95K

Jake. @RedBl0ckBlue

over 1 year ago

@Cobratate You’re such a Muslim

0

0

0

0

35

Jake. @RedBl0ckBlue

over 1 year ago

@ArtemisConsort No, this is wrong and dangerous

0

1

0

0

29

Jake. @RedBl0ckBlue

over 1 year ago

@KaixuanHuang1 It would be good if you released the perturbed dataset you created because with the one example you gave I found o1, GPT-4o, and DeepSeek R1 all gave correct answers, and I repeated many times.

0

1

0

0

187

Jake. @RedBl0ckBlue

over 1 year ago

@Enzorouxx Good summary! RL is limitless when we have a verifiable generator AND a verifiable discriminator (e.g. Go), but in domains like formal math we only have a verifiable discriminator (Lean) and no way to systematically generate valid problem-solution pairs beyond a brute-force search

0

1

0

0

39
