FENG CHEN @FCHEN_AI - Twitter Profile

6 days ago

Excited and grateful to have contributed to reasoning, RL, and STEM climbing behind the MAI-Thinking-1 model. This has been an intense but fun hill-climbing journey from scratch. Please check out the technical report: https://t.co/qfvkPYKTgN

0

2

0

58

FCHEN_AI retweeted

Surya Ganguli

@SuryaGanguli

12 months ago

The best part of this job is seeing students graduate and launch their careers! Congrats to Feng Chen, Atsushi Yamamura, Tamra Nebabu, Linnie Wharton and Daniel Kunin. They are all going on to top positions across artificial intelligence, medicine, and physics. Proud of you!

Photo: https://t.co/8p5P7K2lqf

1

90

1

7

6K

FENG CHEN @FCHEN_AI

over 1 year ago

Proud to be part of the team behind this new open-source SOTA formal math prover! 🚀 Achieving 72.95% on MiniF2F with a simple BFS strategy. Our models are trained using expert iteration and DPO, pushing the boundaries of formal theorem proving. 📄 Paper: https://t.co/JqLYiKScZ1

Ran Xin

@RanXinByteDance

over 1 year ago

🚀 Excited to announce BFS-Prover, our state-of-the-art theorem proving system in Lean4! We've achieved 72.95% on the MiniF2F test, surpassing all previous systems including DeepSeek-Prover-v1.5, InternLM2.5-StepProver, and HunyuanProver 📈
🔥 Key innovations:
- Simple Best-First Search, rather than complex MCTS
- No critic model (value function) needed
Our 7B tactic generation model is now available on Hugging Face: https://t.co/94Oyumzocs
Paper: https://t.co/Gr51JMFxee
This challenges the perception that complex search methods are necessary for formal theorem proving. Sometimes simpler is better! @FCHEN_AI

7

5

1

1K

0

2

0

199
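For readers unfamiliar with the search strategy named in the announcement above, here is a minimal sketch of generic best-first search over states scored by cumulative log-probability. It is not BFS-Prover's implementation: the function name `best_first_search`, the `expand` callback, the toy integer "proof states", and the scoring rule are all assumptions made for illustration, and no Lean interaction or tactic model is involved.

```python
import heapq
import math


def best_first_search(start, is_goal, expand, max_expansions=1000):
    """Generic best-first search (illustrative, not BFS-Prover's code).

    expand(state) yields (action, log_prob, next_state) triples; in a prover
    these would come from a tactic generator, here they are hypothetical.
    The frontier is ordered by cumulative log-probability, highest first.
    """
    # Entries: (negated cumulative log-prob, tie-breaker, state, path so far).
    frontier = [(0.0, 0, start, [])]
    seen = {start}  # each state is queued at most once (first path found wins)
    counter = 1
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_score, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        expansions += 1
        for action, log_prob, nxt in expand(state):
            if nxt in seen:
                continue
            seen.add(nxt)
            heapq.heappush(
                frontier, (neg_score - log_prob, counter, nxt, path + [action])
            )
            counter += 1
    return None  # budget exhausted or no reachable goal


def toy_expand(state):
    """Toy stand-in for a tactic generator: two moves with fixed scores."""
    moves = (
        ("double", math.log(0.6), state * 2),
        ("inc", math.log(0.4), state + 1),
    )
    for name, log_prob, nxt in moves:
        if nxt <= 10:  # keep the toy state space finite
            yield name, log_prob, nxt


# Search for a sequence of moves that turns 1 into 10.
path = best_first_search(1, lambda s: s == 10, toy_expand)
print(path)
```

Note that the search needs only a priority queue and a scoring rule for partial paths. It does not use a separate value (critic) model, which is the contrast with MCTS-style methods drawn in the tweet.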

FENG CHEN @FCHEN_AI

over 1 year ago

4/ We extend our algorithm to automated theorem proving and math QA with CoT. In theorem proving, our approach improves performance by controlling the exploration-exploitation tradeoff in proof trees. In CoT, where overconfidence is less severe, we also see performance gains.

Photo: https://t.co/S6f34ZqA16

0

2

1

0

168

FENG CHEN @FCHEN_AI

over 1 year ago

1/ Our new paper: “Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning” on how to change training to better exploit test-time compute! Co-led by @AllanRaventos, w/ Nan Cheng, @SuryaGanguli & @ShaulDr https://t.co/xM49OB6sk7

1

18

5

11

9K

FENG CHEN @FCHEN_AI

over 1 year ago

3/ We propose directly optimizing for coverage in the fine-tuning loss with Direct Coverage Optimization (DCO). DCO attenuates gradients on high-confidence samples, regularizing away from overconfidence. We demonstrate superior accuracy frontiers over CE loss in MATH and MiniF2F.

Photo: https://t.co/6SDj1hhyYQ

1

0

232
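The gradient attenuation described in tweet 3/ can be illustrated with a short derivation. This is a sketch under stated assumptions, not a quotation of the paper: it assumes coverage means pass@N with N independent samples, writes p for the model's probability of the correct answer, and introduces the symbols \ell_N and F(N, p) purely for illustration.

```latex
% Assumption: coverage = probability that at least one of N i.i.d. samples is correct.
\[
  C_N(p) = 1 - (1 - p)^N, \qquad p = p_\theta(y \mid x).
\]
% A loss that directly targets coverage (illustrative form):
\[
  \ell_N = -\log\bigl(1 - (1 - p)^N\bigr), \qquad \ell_{\mathrm{CE}} = -\log p .
\]
% Chain rule: the coverage gradient is a rescaled cross-entropy gradient.
\[
  \nabla_\theta \ell_N = F(N, p)\, \nabla_\theta \ell_{\mathrm{CE}},
  \qquad
  F(N, p) = \frac{N\, p\, (1 - p)^{N-1}}{1 - (1 - p)^N}.
\]
% Limits: F(1, p) = 1 (plain CE); for N > 1, F -> 1 as p -> 0 and F -> 0 as p -> 1.
```

Under these assumptions, samples the model already answers with high confidence contribute almost no gradient when N > 1, which matches the tweet's description of regularizing away from overconfidence.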

FCHEN_AI retweeted

Multiscale AI @MultiscaleAI

over 1 year ago

Join us at the ML for Multiscale Processes workshop at #ICLR2025 to hear from our first three amazing keynotes: Qianxiao Li https://t.co/ZQG6acjlwK Sergei Gukov https://t.co/2E5KjTpHxq Charlotte Bunne https://t.co/ip4QAvcXNz

0

1

2

0

143

FCHEN_AI retweeted

Daniel Kunin @KuninDaniel

over 1 year ago

Come check out our #NeurIPS2024 spotlight poster on feature learning tomorrow! 📍East Exhibit Hall A-C #2102 📅Thu 12 Dec 4:30 p.m. — 7:30 p.m. PST

0

49

7

20

10K

FCHEN_AI retweeted

Daniel Kunin @KuninDaniel

over 2 years ago

Want to learn about SGD's implicit bias towards simpler subnetworks generated by permutation symmetry?! Come to our NeurIPS poster session tomorrow morning 10:45 - 12:45 Hall B1+B2 (level 1) #906

1

75

7

40

14K

FCHEN_AI retweeted

Surya Ganguli

@SuryaGanguli

almost 3 years ago

1/ Our new paper led by @AllanRaventos, @mansiege, and @FCHEN_AI asks when in-context learning of regression can solve fundamentally *new* problems *not* seen during pre-training, and reveals it as an emergent capability arising from a phase transition... https://t.co/gqaioAUL7Q

4

167

39

83

32K

FENG CHEN @FCHEN_AI

almost 3 years ago

Excited to share our new work on how pretraining task diversity affects in-context learning.

Mansheej Paul

@mansiege

almost 3 years ago

Can in-context learning learn new tasks different from those in the pretraining data? Is this an emergent ability, i.e. does it arise from pretraining without being explicitly optimized for? How does this depend on pretraining task diversity? 🧵 1/ https://t.co/g118pWgAA9

4

187

49

116

44K

0

4

0

312

FENG CHEN @FCHEN_AI

almost 3 years ago

Excited to share our new paper on how SGD biases towards simpler models via stochastic collapse to invariant sets.

Daniel Kunin @KuninDaniel

almost 3 years ago

Our new preprint reveals how SGD biases neural nets towards vastly simpler subnets w/ superior generalization via stochastic collapse to invariant sets & explains why prolonged large learning rates help. Co-led w/ @FCHEN_AI @atsushi_y1230 & @SuryaGanguli https://t.co/tJKWp1Neng

4

262

53

141

62K

0

4

2

0

835
