Huayang Li @HuayangLi - Twitter Profile

Huayang Li @HuayangLi

15 days ago

@a1zhang @sciam @lateinteraction @MIT_CSAIL @PrimeIntellect @LaudeInstitute Congrats Alex!

0

47

Huayang Li @HuayangLi

2 months ago

@shengranhu @yimingxiong_ Congrats Shengran!!!

1

2

0

58

HuayangLi retweeted

Shengran Hu @shengranhu

3 months ago

Thrilled to share that The AI Scientist is published in Nature! Huge thanks to all the amazing collaborators who made this happen.✨

1

37

2

5

4K

Huayang Li @HuayangLi

4 months ago

@sjh4i passport? that’s weird 😅

1

0

72

HuayangLi retweeted

Simon Guo

@simonguozirui

5 months ago

Thanks for featuring our open-source RL repo for KernelBench (built together with @NatKokoromyti @ethanboneh during winter break!)

Powered by two of my fav tools lately:
🔧 @tinkerapi seamlessly handles distributed RL training, letting us easily try using larger models while accommodating long step times
📦 @modal lets us quickly spin up GPU sandboxes with complex dependencies, enabling consistent evaluation while scaling up rollouts in parallel

Together, they make it easy to build a flexible and scalable RL loop that trains models to write better ⚡GPU kernels. Looking back, Kevin (https://t.co/ZutFiFcLjk) could have been built and trained so much faster with this setup!

Excited to see what the community does with it. Thanks to the @AMD team for proposing support for their hardware!

1

77

16

29

11K

Huayang Li @HuayangLi

5 months ago

@nlp_yiran All the best Yiran!

0

236

HuayangLi retweeted

Huayang Li @HuayangLi

5 months ago

@SakanaAILabs @bendee983 I’d like to share an interesting demo of RePo https://t.co/jaJPIaTQV1, which shows how it learns adaptive position ids based on different input structures at test time, e.g., tables or code. It also shows patterns of NoPE (const. id) and RoPE (linear id) at a fine-grained level.

0

3

1

0

248

Huayang Li @HuayangLi

6 months ago

@alex_peys @SakanaAILabs @hardmaru Ohh, very interesting work! I think it offers further support for this direction. Will cite and discuss it in the updated version.

0

1

0

62

Huayang Li @HuayangLi

6 months ago

@andysingal @SakanaAILabs Actually, RePo is not specific to RoPE. It can be used with almost all position encoding methods. The difference is that position encoding maps position ids to embeddings/bias values, while RePo is a module that dynamically assigns position ids to tokens.

0

54

Huayang Li @HuayangLi

6 months ago

Thanks for sharing 🙌

Sakana AI

@SakanaAILabs

6 months ago

Introducing RePo: Language Models with Context Re-Positioning

Website: https://t.co/d9JUjPIyYt
Paper: https://t.co/GhTRTFosuy

Standard language models process information as a rigid linear sequence where the only signal for structure is a fixed token index, forcing them to treat physical proximity as semantic relevance.

Cognitive Load Theory suggests this is inefficient. Just as humans struggle when key facts are buried in noise, models waste finite capacity managing disorganized inputs instead of focusing on deep reasoning.

RePo breaks this bottleneck by allowing models to actively reorganize their context. Instead of using a fixed index, our module learns to assign positions based on content relevance. This lets the model dynamically pull relevant distant information closer and push noise away, effectively reshaping the attention geometry to match the problem structure.

This flexibility yields significant gains in robustness. RePo outperforms standard encodings on noisy contexts, structured data, and long-range dependencies while maintaining competitive general performance. It represents a step toward models that intelligently curate their own working memory rather than passively accepting input order.

27

562

84

333

61K

0

9

4

1

1K

Huayang Li @HuayangLi

over 1 year ago

@timhuangxt Nice work!

0

100

HuayangLi retweeted

Nara Institute of Science and Technology (奈良先端科学技術大学院大学) @NAIST_MAIN

about 2 years ago

Research institutions with names similar to Nara Institute of Science and Technology (NAIST), as of June 29 #奈良先端大 #NAIST

0

60

29

3

14K

Huayang Li @HuayangLi

about 2 years ago

@shizhediao Wow congrats!🎉🎉🎉

1

0

124

Huayang Li @HuayangLi

about 2 years ago

@jodieyzhou @CompScienceCU @Cardiff_NLP Congrats!!!

1

0

163

HuayangLi retweeted

Jiannan Xiang

@szxiangjn

about 2 years ago

🔥 We are excited to announce 𝗣𝗮𝗻𝗱𝗼𝗿𝗮, a world model with natural language actions and video states.

🌏 It is a step towards a General World Model that:
1. Simulates world states by generating videos across any domain
2. Allows any-time control with free-text actions

6

176

51

82

33K

Huayang Li @HuayangLi

over 2 years ago

The retirement presentation of Prof. Nakamura, covering his career in speech & language processing from 1986 to 2024 🫡

0

15

0

916

Huayang Li @HuayangLi

over 2 years ago

@wangly0229 happy birthday!

0

79

HuayangLi retweeted

Longyue Wang

@wangly0229

over 2 years ago

🌟An inspiring work in long text generation!📝 This work embeds history info directly into model parameters, eliminating the need for KV cache.🚀 Arxiv: https://t.co/PHfp3z5dw5

1

26

13

11

3K

HuayangLi retweeted

Deng Cai @deng_cai

over 2 years ago

We just released ⭐️Inferflow⭐️, an efficient and highly configurable inference engine in C++. It serves various large language models by simply modifying a few lines in the corresponding configuration files, without writing a single line of source code. https://t.co/qt6s9eOoib

2

10

2

1

1K
