Wang Shuohang

@shuohangw

Joined August 2014

311 Following

110 Followers

2 Posts

shuohangw retweeted

Liyuan Liu (Lucas)

@LiyuanLucas

almost 2 years ago

GRadient INformation make MoE 😁

achieve 79.4 on MMLU with 6.6B active parameters & correctly answers the strawberry question occasionally

highlight:

- push 16x3.8B to reach 14B capacity
- trained experts have expertise
- trained routing invents shared expert
- sound gradient https://t.co/Vfw3VYGR8u

270

146

97K

shuohangw retweeted

Yang Liu @nlpyang

about 2 years ago

Microsoft GenAI is looking for a summer intern to work on Sparse LLMs. If you are interested, please DM me or send a resume to yaliu10 at microsoft dot com

230

171

75K
