According to Harsha Bhogle, Virat Kohli is ‘adapting’ with 540+ runs, while 283 runs is somehow a ‘great impact season.’ If Kohli had scored 283 runs at this strike rate, these commentators would’ve declared his T20 career finished.
Terence Tao says the math behind today’s LLMs is actually simple. Training and running them mostly use linear algebra, matrix multiplication, and a bit of calculus, material an undergraduate can handle. We understand how to build and operate these models.
The real mystery is why they work so well on some tasks and fail on others, and why we cannot predict that in advance. We lack good rules for forecasting performance across tasks, so progress is largely empirical.
A key reason is the nature of real-world data. Pure noise is well understood, perfectly structured data is well understood, but natural text sits in between, partly structured and partly random. Mathematics for that middle regime is thin, similar to how physics struggles at meso-scales between atoms and continua.
Because of this gap, we can describe the mechanisms but cannot yet explain capability jumps or give reliable task-level predictions. That mismatch, simple machinery versus hard-to-predict behavior, is the core puzzle.
----
Video from 'Dr Brian Keating' YT Channel (Link in comment)
This GitHub incident is insane. Merge queue commits have been reverting previously merged commits at random.
This not only breaks the mental contract teams have with Git in general, but is subtle enough to be really hard to unravel after the fact.
https://t.co/C8vbDGqdmC
@Pradhyoth1 They can completely fill the stadium, but it is so huge that it actually doesn’t feel that electric. Can’t compete with Chinnaswamy or Wankhede.