Mansheej Paul @mansiege - Twitter Profile

Pinned Tweet

almost 2 years ago

Check out our new work: Critique-out-Loud (CLoud) reward models where we improve reward models by having them generate a critique for a response before scoring it. Results and details in thread from @ZackAnkner.

Zack Ankner

@ZackAnkner

almost 2 years ago

Excited to announce our new work: Critique-out-Loud (CLoud) reward models. CLoud reward models first produce a chain of thought critique of the input before predicting a scalar reward, allowing reward models to reason explicitly instead of implicitly! https://t.co/CnYEDM36no

mansiege retweeted

Prithviraj (Raj) Ammanabrolu

@rajammanabrolu

7 months ago

100 citations for a paper that taught us many lessons esp about branding, timing, peer review, and pushing the frontier! I'm rather proud of this one. Congrats to @ZackAnkner and @mansiege! https://t.co/GO0phpdWlM

Mansheej Paul

@mansiege

8 months ago

The next frontier of AI is where it meets the physical world, generates new hypotheses, and learns from experiments. Excited to join an incredible team in accelerating science and pushing this frontier.

Liam Fedus

@LiamFedus

8 months ago

Today, @ekindogus and I are excited to introduce @periodiclabs.

Our goal is to create an AI scientist.

Science works by conjecturing how the world might be, running experiments, and learning from the results.

Intelligence is necessary, but not sufficient. New knowledge is created when ideas are found to be consistent with reality. And so, at Periodic, we are building AI scientists and the autonomous laboratories for them to operate.

Until now, scientific AI advances have come from models trained on the internet. But despite its vastness — it’s still finite (estimates are ~10T text tokens where one English word may be 1-2 tokens). And in recent years the best frontier AI models have fully exhausted it.

Researchers seek better use of this data, but as any scientist knows: though re-reading a textbook may give new insights, they eventually need to try their idea to see if it holds.

Autonomous labs are central to our strategy. They provide huge amounts of high-quality data (each experiment can produce GBs of data!) that exists nowhere else. They generate valuable negative results which are seldom published. But most importantly, they give our AI scientists the tools to act.

We’re starting in the physical sciences.

Technological progress is limited by our ability to design the physical world.

We’re starting here because experiments have high signal-to-noise and are (relatively) fast, physical simulations effectively model many systems, but more broadly, physics is a verifiable environment. AI has progressed fastest in domains with data and verifiable results - for example, in math and code. Here, nature is the RL environment.

One of our goals is to discover superconductors that work at higher temperatures than today's materials. Significant advances could help us create next-generation transportation and build power grids with minimal losses. But this is just one example — if we can automate materials design, we have the potential to accelerate Moore’s Law, space travel, and nuclear fusion.

We’re also working to deploy our solutions with industry. As an example, we're helping a semiconductor manufacturer that is facing issues with heat dissipation on their chips. We’re training custom agents for their engineers and researchers to make sense of their experimental data in order to iterate faster.

Our founding team co-created ChatGPT, DeepMind’s GNoME, OpenAI’s Operator (now Agent), the neural attention mechanism, MatterGen; have scaled autonomous physics labs; and have contributed to some of the most important materials discoveries of the last decade. We’ve come together to scale up and reimagine how science is done.

We’re fortunate to be backed by investors who share our vision, including @a16z who led our $300M round, as well as @Felicis, DST Global, NVentures (NVIDIA’s venture capital arm), @Accel and individuals including @JeffBezos , @eladgil , @ericschmidt, and @JeffDean. Their support will help us grow our team, scale our labs, and develop the first generation of AI scientists.

Mansheej Paul

@mansiege

9 months ago

@typedfemale Well... only if we happen upon space cocaine in the sand...

mansiege retweeted

10 months ago

It was kinda a movie

Mansheej Paul

@mansiege

11 months ago

Imagine if memory pointers had twitter. They’d be like “@malloc is this true?”

Epimenid

@truthful_cretan

11 months ago

Imagine if Linux kernel interfaces had twitter. They’d be like “/proc is this true?”

mansiege retweeted

Misha Laskin

@MishaLaskin

11 months ago

Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.

Mansheej Paul

@mansiege

11 months ago

Imagine if threads had twitter. They’d be like “@lock can I do?”

Mansheej Paul

@mansiege

11 months ago

Imagine if boats had twitter. They’d be like “@dock is this true?”

Mansheej Paul

@mansiege

11 months ago

Imagine if boats had twitter. They’d be like “@dock is this true?”

Cody Blakeney

@code_star

11 months ago

Imagine if soup had twitter. They'd all be like "@stock is this true?"

mansiege retweeted

Davis Blalock

@davisblalock

11 months ago

Deep learning training is a mathematical dumpster fire. But it turns out that if you *fix* the math, everything kinda just works…fp8 training, hyperparameter transfer, training stability, and more. [1/n]

Mansheej Paul

@mansiege

11 months ago

@code_star Tbh, sounds like a great guy!

Mansheej Paul

@mansiege

11 months ago

@nsaphra You should check out Victor Pelevin. Omon Ra is great!

Mansheej Paul

@mansiege

11 months ago

@nsaphra This is such an awesome book. Everything by these authors honestly.

Mansheej Paul

@mansiege

about 1 year ago

@code_star https://t.co/YMBWsOwaqe

Mansheej Paul

@mansiege

about 1 year ago

@rajammanabrolu I thought grpo is agi now?

mansiege retweeted

Dan Biderman

@dan_biderman

over 1 year ago

How can we use small LLMs to shift more AI workloads onto our laptops and phones? In our paper and open-source code, we pair on-device LLMs (@ollama) with frontier LLMs in the cloud (@openai, @together), to solve token-intensive workloads on your 💻 at 17.5% of the cloud cost while maintaining 97.9% of the accuracy. See Gru and the Minions in action below, 🔉on please (h/t @cartesia)!

mansiege retweeted

Core Francisco Park

@corefpark

over 1 year ago

💥New Paper! Algorithmic Phases of In-Context Learning: We show that transformers learn a superposition of different algorithmic solutions depending on the data diversity, training time and context length! 1/n

mansiege retweeted

Zack Ankner

@ZackAnkner

over 1 year ago

Critique out loud reward models made it into the Kimi k1.5 technical report! Super cool to see someone scale it up to 800k inputs and to see how much better reward modeling it led to!

mansiege retweeted

Cody Blakeney

@code_star

over 1 year ago

If you want to read more about the curriculum training used in OLMo 2, check out our (@mansiege @_BrettLarsen Sean Owen) paper! Congrats on the release to everyone at AI2! (but especially @soldni and @kylelostat <3 data ) https://t.co/e0V5B4TxTS

mansiege retweeted

Zack Ankner

@ZackAnkner

over 1 year ago

Agreed ;) But in all seriousness, it's cool to see everyone converging on reward models that perform explicit reasoning by critiquing out loud. Super excited to see how people build on top of these works.

Mansheej Paul

@mansiege

over 1 year ago

Code and models for our latest work, Critique-out-Loud (CLoud) reward models, are now released! Check out our paper (https://t.co/SQOQYGe27y) for more details on using reward models to reason before predicting a reward score.

Zack Ankner

@ZackAnkner

over 1 year ago

Code and models for Critique-out-Loud (CLoud) reward models are finally public! The repo comes with a gradio demo you can run, so hopefully people can mess around with the models 😃 Code: https://t.co/oJJgC5M67f
