Chuning Li

@ChuningLi

MSc @UofTCompSci @VectorInst

Joined October 2022

49 Following

77 Followers

1 Post

ChuningLi retweeted

Lorenzo Noci @lorenzo_noci

almost 3 years ago

How do you scale Transformers to infinite depth while ensuring numerical stability? In fact, LayerNorm is not enough. But *shaping* the attention mechanism works! https://t.co/4DbIfYMQr3 w/ @ChuningLi @mufan_li @bobby_he @THofmann2017 @cjmaddison @roydanroy

210

110

81K
