I gave a talk at Seoul National University.
I titled the talk “Large Language Models (in 2023)”. This was an ambitious attempt to summarize our exploding field.
Video: https://t.co/vumzAtUvBl
Slides: https://t.co/IidLe4JfrC
Trying to summarize the field forced me to think about what really matters in the field. While scaling undeniably stands out, its far-reaching implications are more nuanced. I share my thoughts on scaling from three angles:
1) A change in perspective is necessary because some abilities only emerge at a certain scale. Even if an ability doesn’t work with current-generation LLMs, we should not claim that it doesn’t work. Rather, we should think it doesn’t work yet. Once larger models are available, many conclusions change.
This also means that some conclusions from the past are invalidated and we need to constantly unlearn intuitions built on top of such ideas.
2) From first principles, scaling up the Transformer amounts to efficiently doing matrix multiplications with many, many machines. I see many LLM researchers who are not familiar with how scaling is actually done. This section is targeted at technical audiences who want to understand what it means to train large models.
3) I talk about what we should think about for further scaling (think 10000x GPT-4 scale). To me scaling isn’t just doing the same thing with more machines. It entails finding the inductive bias that is the bottleneck in further scaling.
I believe that the maximum likelihood objective function is the bottleneck in reaching 10000x GPT-4 scale. Learning the objective function with an expressive neural net is the next paradigm, and it is a lot more scalable. With the compute cost going down exponentially, scalable methods eventually win. Don’t compete with that.
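For reference, the maximum likelihood objective mentioned here is usually written, for an autoregressive language model, as the negative log-likelihood of each token given the tokens before it. The symbols below ($\theta$ for the model parameters, $x_t$ for the token at position $t$, $T$ for the sequence length) are my notation for illustration, not taken from the talk:

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)
```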
In all of these sections, I strive to describe everything from first principles. In an extremely fast-moving field like LLMs, no one can keep up. I believe that understanding the core ideas by deriving them from first principles is the only scalable approach.
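To make point 2 a bit more concrete, here is a minimal sketch of one matrix multiplication split across several machines. It is a toy illustration and not the method from the talk: the "devices" are just list entries in a single process, and the names `split_columns`, `sharded_matmul`, and `num_devices` are hypothetical. Each device holds a column shard of the weight matrix, multiplies the same input by its shard, and the partial outputs are concatenated (the role an all-gather plays on real hardware).

```python
# Toy sketch: a matrix multiply sharded column-wise across hypothetical devices.
# Everything runs in one process; "devices" are simulated with plain lists.

def matmul(a, b):
    """Plain matrix multiply for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(w, num_devices):
    """Split W column-wise into num_devices equal shards (assumed layout)."""
    cols = len(w[0])
    assert cols % num_devices == 0, "columns must divide evenly across devices"
    width = cols // num_devices
    return [[row[d * width:(d + 1) * width] for row in w] for d in range(num_devices)]

def sharded_matmul(x, w, num_devices):
    """Each device computes X @ W_shard; the slices are then concatenated per row."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_devices)]
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
result = sharded_matmul(x, w, num_devices=2)
print(result == matmul(x, w))  # the sharded result matches the unsharded one
```

The point of the sketch is only that the sharded computation gives the same answer as the single-machine one; doing this efficiently on real hardware is mostly about how the shards and the communication between them are laid out.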
🇰🇷🏆
South Korea’s men’s U23 national football team, including PSG’s Kang-In Lee, won the Asian Games final and took the gold medal! Heartfelt congratulations on the victory ❤️💙
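I am sorry! Most of them are people I do not know! I only know the names, and the faces, of the few I learned about in school! And beyond those honored here, how many more must have given their lives in the #독립운동 (independence movement) to reclaim our stolen country! Even if belatedly, I will cherish and remember them.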
@PUBATTLEGROUNDS Hello, I'm an Xbox user in South Korea. If I pre-order PUBG at https://t.co/PgB5mg7fQc, can I play the game in Korean on Xbox One S?
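[Office of Rep. Lee Jae-jung] The families of the Sewol ferry disaster victims whose remains have not been recovered have, in the end, made an extremely difficult and heartbreaking decision: they will no longer ask for the search to continue. However, the hull investigation and the second Special Investigation Commission still remain. Together we await the day all of the missing return to the arms of their families. https://t.co/7sqvbGB7V6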