Yang Liu

@nlpyang

#LLM Researcher @Microsoft; PhD @EdinburghNLP

Bellevue, WA

Joined December 2021

345 Following

1.5K Followers

79 Posts

Pinned Tweet

Yang Liu @nlpyang

22 days ago

Excited to introduce our MAI Code model at Microsoft Build. As shared in the session, this is a MoE (5B active / 137B total) initialized from an MAI pretrained model and trained for real user scenarios with product harnesses. I’m proud to have served as the research lead for this effort, and even prouder of what the team has achieved. It’s a beast for its size. Stay tuned — a larger model could come :)

22 days ago

MAI-Code-1-Flash is here! Built and optimized for GitHub Copilot. From quick fixes to complex engineering challenges, write better code with more return on token. Rolling out to GitHub Copilot individual users in Visual Studio Code in the model picker and under the default auto picker now.

31

742

113

159

86K

2

38

7

3

4K

Yang Liu @nlpyang

22 days ago

More details here https://t.co/Orj8uXGjxI

0

0

0

0

313

nlpyang retweeted

3 months ago

Introducing HyperP, a scaling framework that gives you better compute efficiency & transferable stability. At 6e21 FLOPs, HyperP reaches 1.58× compute efficiency over a strong Muon baseline; +MoE further gets 3.38× over dense. Gains even grow with scale🤯 📖: https://t.co/gZRvX9qAIi

7

274

41

211

31K

nlpyang retweeted

over 1 year ago

Missing coding data in your R1? Introducing KodCode 🐱: a diverse, challenging, and verifiable synthetic dataset for LLM coding! With 447K verified question-solution-test triplets, KodCode is designed for supervised fine-tuning (SFT) and reinforcement learning (RL).

💡Key Features
✨Diverse & Challenging: 5 synthesis methods, 12 subsets covering multiple domains (algorithms to package-specific knowledge) and difficulty levels (basic exercises to competitive programming tasks).
✨Verifiable Correctness: Question-solution-test triplets are systematically validated via a self-verification process with GPT-4o.
✨Supports RL & SFT: Unit tests enable RL tuning, plus verified CoT responses generated by DeepSeek-R1 🐳 via reject sampling to support SFT.

---------------
>> Project Page: https://t.co/9IL4gEofmH
>> KodCode-V1 (for RL): https://t.co/CWE7SRBZXA
>> KodCode-V1-SFT-R1 (for SFT): https://t.co/4kK5eAhpCs
>> Paper: https://t.co/9J8uLMy3CU
>> Codebase for creating this dataset: https://t.co/EwKHpbdMsM

Thanks to my great mentor @nlpyang for invaluable guidance and support on this project! 🤩 [1/5]

1

18

3

8

9K

Yang Liu @nlpyang

over 1 year ago

Great work from @zhangchen_xu

0

1

0

0

279

Yang Liu @nlpyang

over 1 year ago

Missing coding data in your R1?

🔥 Introducing KodCode—the largest verified synthetic coding dataset for Code LLM training!

• 447K question–solution–test triplets
• 12 diverse subsets
• 10-trial solution verification for rock-solid correctness

https://t.co/HILpz70PTT

3

18

3

6

2K

Yang Liu @nlpyang

over 1 year ago

I don't really understand why people think RL takes less compute than pretraining.

2

5

0

0

634

Yang Liu @nlpyang

almost 2 years ago

Check our paper so you can really challenge an LLM

Yulong Chen @Yulongchen1010

almost 2 years ago

Evaluating LLMs usually requires sophisticated human designs, and with the continuous improvement of LLMs, it is difficult for humans to find their limitations. Can LLMs find their own limitations by proposing questions to themselves? Check our new paper: https://t.co/lDkBCQsIVR

5

95

22

39

9K

0

8

0

1

1K

Yang Liu @nlpyang

about 2 years ago

@kdcreer Ah, thank you for saving this post

0

0

0

0

1K

Yang Liu @nlpyang

about 2 years ago

Microsoft GenAI is looking for a summer intern to work on Sparse LLMs. If you are interested, please DM me or send a resume to yaliu10 at microsoft dot com

6

229

39

171

75K

Yang Liu @nlpyang

about 2 years ago

If you want to know about our recent research efforts, check out the great Samba model https://t.co/JaUeaiPcnh

about 2 years ago

Introducing Samba 3.8B, a simple Mamba+Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin.😮 And it has an infinite context length with linear complexity.🤯 Paper: https://t.co/KwMpeyaDxc (1/6)

31

2K

267

1K

244K

0

7

0

1

6K

Yang Liu @nlpyang

about 2 years ago

I truly believe now it is the time that you change all your local attentions to SSMs :)

about 2 years ago

Introducing Samba 3.8B, a simple Mamba+Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin.😮 And it has an infinite context length with linear complexity.🤯 Paper: https://t.co/KwMpeyaDxc (1/6)

31

2K

267

1K

244K

1

12

0

2

2K

Yang Liu @nlpyang

about 2 years ago

not multilingual enough :)

about 2 years ago

GPT-4o is also natively multilingual. We've improved our tokenizer for various languages, resulting in, for example, a 3-4x speed improvement for Indian languages.

2

30

1

3

4K

0

2

0

2

1K

Yang Liu @nlpyang

about 2 years ago

@BorisMPower So proud we did this❤️

0

4

0

0

1K

Yang Liu @nlpyang

over 2 years ago

If you are at NeurIPS, you could check out our poster about Efficient Transformer this evening.

0

20

0

2

2K

Yang Liu @nlpyang

over 2 years ago

Spent some time looking for Gemini's "uncertainty routed" prompt, but still no clue

over 2 years ago

The top line number for MMLU is a bit gamed - Gemini is actually worse than GPT-4 when compared on normal few shot or chain of thought

4

72

4

8

6K

0

0

0

0

912

Yang Liu @nlpyang

over 2 years ago

Wow, Google really cares a lot about MMLU 🧐

0

3

0

0

649

Yang Liu @nlpyang

over 2 years ago

I have to admit, writing ICLR meta reviews is much more fun than writing ARR ones. What happened here?

0

9

0

0

1K

Yang Liu @nlpyang

over 2 years ago

After ChatGPT's first birthday, do you think it has been a good thing or a bad thing for NLP research?

0

0

0

0

642
