Pingzhi Li @pingzli - Twitter Profile

Pingzhi Li @pingzli

about 2 months ago

@hanrui_w huge congrats!!

0

147

pingzli retweeted

Tencent Hy

@TencentHunyuan

4 months ago

One static model does not fit all😭 We just dropped our latest work: Functional Neural Memory. Instead of static models, we generate custom "parameters" for every single input. ✅Prompt your model anytime ✅Instant personalization ✅Better instruction following ✅Flexible & dynamic memory (w/o memory bank✌️) (🧵1/6)

11

342

138

202

74K

Pingzhi Li @pingzli

7 months ago

@Yuchenj_UW a slow manual best of N sampling

0

14

pingzli retweeted

Junyuan "Jason" Hong

@hjy836

7 months ago

I am going to give a talk at the Lock-LLM #NeurIPS2025 workshop ( https://t.co/RDyhhRWaTY ) at Room 1AB (Upper Level) 10 am (PST). Join me to discuss the dark knowledge for LLMs.

0

4

1

0

490

pingzli retweeted

Marco Mascorro

@Mascobot

7 months ago

If the #NeurIPS2025 app is crashing for you (like it is for me) to the point that it’s unusable, here’s a website with all the content/sessions: https://t.co/u3OyKEa6vt

9

78

9

33

12K

pingzli retweeted

Mohit Bansal

@mohitban47

8 months ago

🚨 🤯 Wow! Yi Lin is an amazing researcher, who works on very hard and important problems in LLM and VLM training, RL, PEFT, Quantization, etc. -- ironically, he had several other top offers just a few months ago! Hire him ASAP if you want to pick up a top talent (and several other affected amazing folks)! 👇👇

5

159

31

39

46K

Pingzhi Li @pingzli

8 months ago

@pmddomingos I thought this was common sense… post-training (e.g., GRPO, rejection sampling) can never surpass best-of-N

0

14

Pingzhi Li @pingzli

10 months ago

@srinath_namburi @Dawei_Li_ASU It’s been fixed!

0

1

0

51

pingzli retweeted

Eigen AI

@Eigen_AI_Labs

11 months ago

🚀Founded by four dedicated MIT graduates, Eigen AI is the world's first company focusing on AEI – Artificial Efficient Intelligence, making AI accessible for all.

Today OpenAI dropped GPT-OSS. We teamed up with our partners SGLang @lmsysorg and @NVIDIA to deliver open-source support of the model with blazing-fast performance on Hopper and Blackwell GPUs just within 4 hours of the release. 🔥

With @YottaLabs, we're stoked to launch a free GPT-OSS-120B playground chatbot & API at https://t.co/BQfsnXIGFo 🚀 Easy-to-use, high-performance, and ready for your projects. Share with us what you are building with it! 🌟

Join us to unlock AI’s potential. Let’s democratize efficient AI for everyone! 💪 #AI #Innovation #EfficientAI #Chatgpt #GPT #performance #LLM #openai #eigenai

4

70

21

18

20K

pingzli retweeted

Eigen AI

@Eigen_AI_Labs

11 months ago

🚀Excited to see our collaboration with @lmsysorg bring Multiple Token Prediction (MTP) in SGLang to production! Proud to support faster, smarter open-source LLM serving. #EigenAl #MTP #SGLang #LLMinfra #ModelServing #DeepSeek #OpenSourceAl #AskChatGPT

0

10

4

1

2K

Pingzhi Li @pingzli

12 months ago

@prateeky2806 @AIatMeta Congrats Prateek!

0

1

0

61

pingzli retweeted

Victor.Kai Wang @VictorKaiWang1

about 1 year ago

Customizing Your LLMs in seconds using prompts🥳! Excited to share our latest work with @HPCAILab, @VITAGroupUT, @k_schuerholt, @YangYou1991, @mmbronstein, @damianborth: Drag-and-Drop LLMs (DnD). 2 features: tuning-free, comparable or even better than full-shot tuning. (🧵1/8)

5

113

75

61

18K

pingzli retweeted

Ruoming Pang

@ruomingpang

about 1 year ago

At WWDC we introduce a new generation of LLMs developed to enhance the Apple Intelligence features. We also introduce the new Foundation Models framework, which gives app developers direct access to the on-device foundation language model. https://t.co/SnjCXrIyYj

89

485

105

188

81K

Pingzhi Li @pingzli

about 1 year ago

@jaeh0ng_yoon @NTUsg Huge congrats Jaehong!!

1

0

77

pingzli retweeted

CJ Zafir

@cjzafir

about 1 year ago

Cursor Agent is just wild.

Now I use Gemini PRO 2.5 to scan the codebase and Sonnet 3.5/3.7 to execute code.

In this workflow you need 3 things:
1. Detailed project documentation
2. Multiple AI coding models
3. A 50-step implementation plan

I spend 30 hours/week on Cursor. I've found the best Cursor practices and workflow.

I attached the best practices below and here's the best workflow.

Your project docs (PRD, tech stack & APIs doc, app flow doc, etc.) work like a knowledge base for AI models.

If AI models find all the necessary information within the knowledge base, they don't hallucinate, don't assume things, and don't ruin the codebase.

So you must add project docs to your root directory. The ideal place is under project rules (.cursor/rules).

Then you need to use multiple AI models. Now I am using Gemini PRO 2.5 to scan the entire codebase (because it has 1M context) and find errors or update docs.

And I use Sonnet 3.5 to execute code. If it's a somewhat complex step, then I also use Sonnet 3.7.

Sometimes I also use the GPT o1 model to debug, but rarely (debugging is mostly done by Gemini PRO 2.5).

So 2.5 to scan and update, and 3.5/3.7 to execute.

Each model has its superpowers. We need to maximize those.

Lastly, you need to write an end-to-end plan to code your app. I call it an "implementation plan."

This implementation plan works as a blueprint for Cursor Agent, which just follows the tasks and executes them.

I use @CodeGuidedev to generate coding docs, and it provides a 50-step implementation plan to code the entire app.

Now it also supports MCPs. Imagine Cursor using Supabase MCP to create database tables and add policies autonomously. It just saves so much time.

So the wrap-up of the workflow is:

Attach your coding docs + use multiple AI models in your flow + have a solid 50-step implementation plan.

And you'll see how powerful Cursor Agent is.

I hope this'll refine your Cursor coding workflow. Let me know your findings.

46

3K

317

6K

258K

pingzli retweeted

Prateek Yadav

@prateeky2806

over 1 year ago

I'm on the job market! Please reach out if you are looking to hire someone to work on
- RLHF
- Efficiency
- MoE/Modular models
- Synthetic Data
- Test time compute
- other phases of pre/post-training.

If you are not hiring then I would appreciate a retweet! More details👇

8

232

58

60

66K

Pingzhi Li @pingzli

over 1 year ago

@Zhang_NanoAlum remarkable! congrats!!

0

49

pingzli retweeted

Ruisi Cai @ccccrs_0908

over 1 year ago

With countless open-source LLM checkpoints available, each specializing in unique domain knowledge, how can we tap into their full potential? Check out Model-GLUE! 🚀 We introduce a framework that integrates model merging, mixture, and stacking to unlock new possibilities.

2

16

4

1

2K

pingzli retweeted

VITA Group @VITAGroupUT

over 1 year ago

1/ 🌟 Excited to announce #Model-#GLUE (#neurips2024 D&B), a new framework designed by an extensive team from UNC, UMD, UT Austin, HKUST, Google, and CMU to #scale pre-trained LLMs efficiently!

🚀 Tackling the challenge of #aggregating disparate pre-trained LLMs, we introduce a holistic guideline and benchmarking if you have a large, diverse model zoo "in the wild"! #LLM #AIresearch

1

22

7

4

8K

pingzli retweeted

Jaehong Yoon

@jaeh0ng_yoon

over 1 year ago

🎉 Check out our new preprint - GLIDER! How do we solve held-in/-out tasks with a collection of specialized experts like LoRA? 🤔 GLIDER improves generalization across held-in and held-out tasks by combining the power of: 🪂LLM-guided task instructions for global routing between specialized experts. 🪂Local token-level routers that refine the expert selection, optimizing which modules contribute.

0

19

8

1

3K
