MIT’s “Introduction to Algorithms,” published #otd in 1990, is the world’s most cited CS text, with 67K citations & over a million copies sold.
https://t.co/Xy5ve7l803
@mitpress
@drummatick I really appreciate the respect and acknowledgement you have for the pioneers and how you commemorate them at a personal level. It's something we should all learn and instill in ourselves.
Usage of the word "bottleneck" has been soaring for the past few weeks. Maybe the world owes this to the YC video; even our profs are using it 👀 I personally only ever used it for the Ford-Fulkerson algorithm. Strange how the wor(l)d works.
I need to understand how LLMs are scaled, so I am reading about matrix multiplication rather than infrastructure techniques like scaling servers up or down.
Why? Why spend time on matrix multiplication to understand scaling?
LLMs are just matrix multiplication. If I can't understand where the bottleneck lies and what the ways to handle it are, then the "how" will never answer the "why".
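To make "where the bottleneck lies" a bit more concrete, here is a rough back-of-the-envelope sketch, not a benchmark. It counts the arithmetic and the memory traffic for one matrix multiplication and compares the two. The function name `matmul_cost`, the 4096-wide weight matrix, and the 2 bytes per element (16-bit values) are all assumptions picked for illustration, and the traffic model is idealized: each matrix is read or written exactly once.

```python
def matmul_cost(m, k, n, bytes_per_element=2):
    """Return (flops, bytes_moved, flops_per_byte) for an (m x k) @ (k x n) product."""
    # One multiply and one add for each of the m * k * n inner-product terms.
    flops = 2 * m * k * n
    # Idealized traffic (assumption): read A and B once, write C once.
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops, bytes_moved, flops / bytes_moved


# One token against a hypothetical 4096 x 4096 weight matrix (matrix-vector case).
single = matmul_cost(1, 4096, 4096)

# 512 tokens against the same weights: the weights are reused across the batch.
batched = matmul_cost(512, 4096, 4096)

print(f"single token: {single[2]:.2f} FLOPs per byte")
print(f"batch of 512: {batched[2]:.2f} FLOPs per byte")
```

Under these assumptions the single-token case does about 1 FLOP per byte moved, while the batch of 512 does about 410. When the ratio is that low, the hardware mostly waits on memory instead of doing math, and that is the kind of bottleneck worth understanding before reaching for more servers.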