Excited to FINALLY release toughest+most rewarding paper I've worked on...
….we attack a 150 year old Walras question that's gone unanswered, not for lack of trying (Hicks, Samuelson, Arrow; our chances?😱)...
Q: Is the market equilibrium stable or unstable?¯\_(ツ)_/¯
����
A few complexity theory pedagogy takes:
1. People should learn about the Time Hierarchy Theorems before learning about Turing/Godel. I've seen so many autodidacts reinvent the THTs and become convinced that they've disproved Godel when they actually just learned something closer to the truth and misunderstood Godel.
2. People should learn about quantifier logic where NP is just a special case of the Polynomial Hierarchy. Shut up about all the stupid "NP-hard" stuff for problems that don't fit cleanly into NP but fit perfectly well into something one step above in the PH that you haven't bothered to teach.
3. Lattices and Galois connections deserve a lot more attention. Abstract Interpretation as a cheap approximation of a SAT solver is a really nice perspective, and a good way to study where and how different kinds of approximations break down.
4. Stop calling NP problems "intractable" when your SAT solver spits out a solution to a million-variable problem in 20 minutes.
Local minima are rare in high dimensions because a strict local minimum has to curve upward in every direction, so all Hessian eigenvalues must be positive.
In a D-dimensional toy model where eigenvalue signs are independent, that’s a 2^(-D) event. In GOE-like random matrix models, positive definiteness is even rarer, roughly exp(-cD^2).
So as dimension grows, random critical points are much more likely to be saddles than minima. This is one reason high-dimensional optimization is often a saddle-escape problem, not a bad-local-minimum problem.
Wrote up some of the math here: https://t.co/vkaVqVD64N
Sharing some learning from attending MLSys '26. There were a lot of interesting papers presented in distributed training and inference. Overall, I could capture the following themes:
1. Distributed training has a lot of knobs, which are really tough to manage and tune. Ton of work is being done to make it easy to manage this.
2. As training gets larger, reliability matters more. It was not surprising to see many industry talks focus on training reliability.
3. Ultra-long context lengths are getting a huge mindshare for both training and inference.
4. Heterogeneous compute (multi-region, multi-accelerator) is on the rise and is probably the next frontier of inference optimization.
5. Distributed inference still needs better auto-tuning for finding the best configs at large scale.
6. KV cache optimization, attention optimization, and quantization were already on the radar, so the number of papers on these topics was not a surprise.
7. IMO, from a skills perspective, the best thing to learn is GPU communication and networking. Learn everything around inter and intra-rack communication, NCCL, and UCCL. Lots of improvements in the coming years will come from optimizing communication between GPUs via better kernels and frameworks.
For a list of interesting papers and their summaries: https://t.co/IY5p8PSeVQ
Oded Goldreich posted a very insightful digest of the interactive proofs of proximity paper by Rothblum, Vadhan, and Wigderson. This is one of my favourite TCS papers, and Oded's new exposition makes it even easier to appreciate its beauty and elegance.
https://t.co/2d5GNNKc3f
It has been more than 6 months (on and off) that I am trying to get upto speed with GPU/TPU kernel development.
IMHO, profiling should be the starting point of learning this topic. You profile, you question, you look for answers and in the process read and imbibe.
I set out on a journey to do just the same. I began profiling gemma4 and was quickly humbled by the amount of information that was at my disposal. The profiler table with huge GEMM names, the profiler trace with too many CPU rows.
To make my life easier, I stepped back and profiled a basic matrix multiplication and addition operation, the weights and bias interaction, as one might see it. The profiler artifacts were simple enough to reason and think through.
In this blog post, I document my journey and in the process uncover how one should profile and what one should look at! I hope this helps beginners (like me) with a starting point of their kernel development and optimization journey.
PS: This is a big blog post, bookmark it and come back to this when you have the time (good weekend read?)
AJR showed that it is possible to win a Nobel Prize using Mickey Mouse Numbers. In my new blog post, I show how their results depend on choice made in data construction, in both "The Colonial Origins of Comparative Development" (2001) and "Reversal of Fortune" (2002). 1/9
I bet not a lot of people today know this story about the big fight at the IPCC in the 90s between economists and politicians about the economic assumptions underlying climate damage estimates (the economists were wrong)
"Matrix Calculus for Machine Learning and Beyond" is an interesting set of free lecture notes for understanding the mathematics behind modern deep learning. It covers gradients, Jacobians, Hessians, matrix-valued functions, backpropagation, optimisation, and many of the mathematical structures used in machine learning and AI models.
One interesting aspect is that the material maintains a strong university-level rigour while remaining highly visual: the notes include numerous diagrams, graphs, geometric interpretations, and intuitive explanations of matrix calculus applied to neural networks.
It is a valuable resource not only for students studying machine learning, but also for anyone who wants to build a solid foundation in computational linear algebra and optimisation.
https://t.co/XsnReRHu35
This is a massive story. It'll go over heads of many, but represents a major structural change in China's agrarian transition & major event in global agrarian history. Till now limitation in the Chinese economy has been it's structural dependence on semi-proletarian labour (1/3)
we recently optimized qwen3.5-397b-a17b to be the fastest deployment publicly hosted.
and the crazy thing: we did it by writing CUSTOM KERNELS for AMD MI355x. ���
see our post below outlining how we optimized kernels to achieve SOTA performance.
Having spent the past few weeks in Beijing giving talks and attending meetings, here are some quick observations as I wait for my flight to NYC to board:
1. The talk of the town has, of course, been the Xi-Trump meeting, but no one (not even usually well informed elite circle insiders) seems to know what it actually accomplished, other than a continuation of the detente that’s been in place for the past several months. That’s about as good an outcome as one could realistically expect, I suppose, but clearly a real “grand bargain” is not in the cards anytime soon.
2. The Chinese economy seems to be in a steady state, neither improving much nor visibly deteriorating like it was in 24-25. In that sense the government’s stimulus policies have had a positive effect, but the vast majority of industry people I talked to remain very pessimistic about domestic profits and consumption. The dominant sentiment is that the only way for major firms to generate profit growth is through direct overseas expansion.
3. That said, technological advancement is of course very real and quite impressive (although it’s not quite as visible in Beijing as it is in, say, Shenzhen). One interesting and very pleasant side effect of the EV revolution (paired with infrastructure investment) has been that Beijing is now a bike-able city again, given the sharp reduction in exhaust fumes on city streets and the expansion of bike lanes. Armed with a new bike, I could almost explore the city like I used to back in 2000. Hugely nostalgic feeling.
4. Academia is, in general, in a pretty dour mood. STEM subjects and the social sciences/humanities alike have seen very significant funding reductions over the past 2 years, but the latter have of course gotten the worst end of the deal. Political censorship also seems to be visibly ramping up again, with the sheer scale of perceived “red lines” snowballing to levels unprecedented since the early 1990s. As the recent Yang Nianqun incident suggests, administrative regulation of faculty members’ personal affairs has also expanded (i.e., consensual extramarital relationships between adults who were not in a direct teacher-student relationship would almost certainly have gone unpunished as recently as 5 years ago).
5. In general, it’s hard not to notice the steady increase in government presence in everyday life—in both positive and negative ways. The city feels safer and cleaner than it ever has been, and yet the layers of administrative review needed for just about any kind of professional activity have clearly proliferated on a vast scale (made less painful by the digitization of most government services and more uniform law abidance, but still more onerous than it used to be despite all that).
6. The most alarming thing, I suppose, is that general optimism (personal or socioeconomic) seems to be in particularly short supply among the younger generations. This is obvious even among the most intellectually gifted kids at Tsinghua and PKU, where the level of career anxiety seems to be at a level that I have never encountered before. Unsurprisingly, willingness to form families or plan ahead in general at the personal level is very low.
All in all, it was, as always, a very informative couple of weeks. The stay was also made much more pleasant by the fact that I managed to do it before Beijing becomes brutally hot. I look forward to being back more often in the near future.
A somewhat overdue second edition of our Group Testing monograph w/ @mpaldridge@BristOliver (several major results came shortly after the 2019 first edition..!):
(arXiv) https://t.co/lGSyQBWaQt
(publisher) https://t.co/ia0h1hOFtN